How to shut down the client process when submitting a Spark job in yarn-cluster mode?

Problem:

Recently, the field team reported that after a Spark application is submitted in yarn-cluster mode, a YARN client process remains running on the submitting node and never exits. Because these Spark applications are all Spark Structured Streaming programs (they run continuously, month after month), the resources of the submitting server eventually fill up, and other operations start failing with errors such as:


[dx@my-linux-01 bin]$ yarn logs -applicationId application_15644802175503_0189
Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x00000000c000000, 702021632, 0) failed; error='Cannot allocate memory' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 702021632 bytes to committing reserved memory.
# An error report file with more information is saved as:
# /home/dx/myProj/appApp/bin/hs_err_pid53561.log
[dx@my-linux-01 bin]$ 


Analysis of the Spark submitting node showed that the resources were mainly consumed by lingering YARN client processes:


[dx@my-linux-01 bin]$ top
PID     USER  PR  NI    VIRT     RES  SHR   S  %CPU   %MEM   TIME+    COMMAND
122236  dx    20  0  20.629g  1.347g  3520  S   0.3    2.1   7:02.42     java
122246  dx    20  0  20.629g  1.311g  3520  S   0.3    2.0   7:03.42     java
122236  dx    20  0  20.629g  1.288g  3520  S   0.3    2.2   7:05.83     java
122346  dx    20  0  20.629g  1.344g  3520  S   0.3    2.1   7:10.42     java
121246  dx    20  0  20.629g  1.343g  3520  S   0.3    2.3   7:01.42     java
122346  dx    20  0  20.629g  1.341g  3520  S   0.3    2.4   7:03.39     java
112246  dx    20  0  20.629g  1.344g  3520  S   0.3    2.0   7:02.42     java
............
112260  dx    20  0  20.629g  1.344g  3520  S   0.3    2.0   7:02.02     java
112260  dx    20  0  113116      200     0  S   0.0    0.0   0:00.00     sh
............


Analysis of submitting Spark jobs via YARN:

There are two ways to submit a Spark application on YARN:

1) yarn-client (spark-submit --master yarn --deploy-mode client ...):

In this mode, after the application is submitted, the driver runs on the submitting node, inside the YARN client process. Killing the client process on the submitting node therefore kills the driver, which in turn terminates the whole application.

2) yarn-cluster (spark-submit --master yarn --deploy-mode cluster ...):

In this mode, after the application is submitted, the driver runs inside a container allocated by YARN: the container hosts the AM (ApplicationMaster) process, and the SparkContext (driver) runs inside that AM. During submission, a YARN client process is also started on the submitting node. By default, this client process waits until the application terminates (FAILED, FINISHED, etc.); until then, it keeps running.

Solution:

The relevant parameter of YARN's Client class:

spark.yarn.submit.waitAppCompletion

If this parameter is set to true, the client stays alive and keeps reporting the application's status until the application exits (for whatever reason);

if it is set to false, the client process exits as soon as the application has been submitted.

Add the parameter to the spark-submit arguments:

./bin/spark-submit \
--master yarn \
--deploy-mode cluster \
--conf spark.yarn.submit.waitAppCompletion=false \
....
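Alternatively, if every cluster-mode submission from this node should detach, the same flag can be set once in Spark's default configuration file instead of on each command line (the path below assumes a standard Spark layout):

```
# $SPARK_HOME/conf/spark-defaults.conf
spark.yarn.submit.waitAppCompletion  false
```

Values passed with --conf on the command line take precedence over spark-defaults.conf, so an individual job can still opt back into waiting with --conf spark.yarn.submit.waitAppCompletion=true.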

The corresponding code in the YARN Client class (org.apache.spark.deploy.yarn.Client):


  /**
   * Submit an application to the ResourceManager.
   * If set spark.yarn.submit.waitAppCompletion to true, it will stay alive
   * reporting the application's status until the application has exited for any reason.
   * Otherwise, the client process will exit after submission.
   * If the application finishes with a failed, killed, or undefined status,
   * throw an appropriate SparkException.
   */
  def run(): Unit = {
    this.appId = submitApplication()
    if (!launcherBackend.isConnected() && fireAndForget) {
      val report = getApplicationReport(appId)
      val state = report.getYarnApplicationState
      logInfo(s"Application report for $appId (state: $state)")
      logInfo(formatReportDetails(report))
      if (state == YarnApplicationState.FAILED || state == YarnApplicationState.KILLED) {
        throw new SparkException(s"Application $appId finished with status: $state")
      }
    } else {
      val (yarnApplicationState, finalApplicationStatus) = monitorApplication(appId)
      if (yarnApplicationState == YarnApplicationState.FAILED ||
        finalApplicationStatus == FinalApplicationStatus.FAILED) {
        throw new SparkException(s"Application $appId finished with failed status")
      }
      if (yarnApplicationState == YarnApplicationState.KILLED ||
        finalApplicationStatus == FinalApplicationStatus.KILLED) {
        throw new SparkException(s"Application $appId is killed")
      }
      if (finalApplicationStatus == FinalApplicationStatus.UNDEFINED) {
        throw new SparkException(s"The final status of application $appId is undefined")
      }
    }
  }

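The branching in Client.run() can be sketched as the following simplified model (hypothetical Python, with no real YARN calls: fire_and_forget corresponds to cluster mode with spark.yarn.submit.waitAppCompletion=false, and poll_state is a stand-in for getApplicationReport()/monitorApplication()):

```python
# Minimal sketch of the control flow in Client.run() above (hypothetical
# Python model; state strings simplified, no real YARN interaction).

TERMINAL = {"FINISHED", "FAILED", "KILLED"}

def client_run(fire_and_forget: bool, launcher_connected: bool, poll_state):
    """Return how the client behaved; raise on a failed/killed status."""
    if not launcher_connected and fire_and_forget:
        # Fire-and-forget: look at the application report once, then exit.
        state = poll_state()
        if state in ("FAILED", "KILLED"):
            raise RuntimeError(f"Application finished with status: {state}")
        return "client-exits-after-submit"
    # Otherwise stay alive, polling until the application reaches a
    # terminal state, as monitorApplication() does.
    state = poll_state()
    while state not in TERMINAL:
        state = poll_state()
    if state in ("FAILED", "KILLED"):
        raise RuntimeError(f"Application finished with status: {state}")
    return "client-monitored-to-completion"
```

With waitAppCompletion=false the client inspects the report once and exits, which is why the lingering java processes seen in the top output above no longer accumulate on the submitting node.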
