Lesson 3: Mastering Spark Streaming in Three Moves, Part 3: The Spark Streaming Runtime Mechanism and Advanced Architecture: Jobs and Fault Tolerance

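In this lesson:

A look inside the Spark Streaming Job architecture and runtime mechanism
A look inside the Spark Streaming fault-tolerance architecture and runtime mechanism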

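Understanding the overall architecture and runtime mechanism of Spark Streaming Jobs is essential to mastering Spark Streaming. In an ordinary Spark application, an action on an RDD triggers a job. How do jobs run in Spark Streaming? When writing a Spark Streaming program we set a BatchDuration, and jobs are then triggered automatically once per BatchDuration. For this to work, the Spark Streaming framework must provide a timer: each time it fires, the logic we wrote is submitted to Spark and runs as a Spark job.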
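To make the timer idea concrete, here is a minimal sketch in plain Scala. It is not Spark's implementation: `BatchTimerSketch`, `runBatches` and `onBatch` are hypothetical names, and a JDK scheduled executor stands in for the framework's internal timer. Each tick computes a logical batch time and hands it to a callback, which is where a real framework would generate and submit the jobs of that batch.

```scala
import java.util.concurrent.{CountDownLatch, Executors, TimeUnit}
import java.util.concurrent.atomic.AtomicLong
import scala.util.control.NonFatal

// Hypothetical stand-in for the framework's timer, not Spark code.
object BatchTimerSketch {
  /** Fires `onBatch` once per `batchMillis`, `batches` times, and returns the batch times. */
  def runBatches(batchMillis: Long, batches: Int)(onBatch: Long => Unit): Vector[Long] = {
    val timer = Executors.newSingleThreadScheduledExecutor()
    val done = new CountDownLatch(batches)
    val nextBatchTime = new AtomicLong(0L)
    val seen = Vector.newBuilder[Long]

    val tick = new Runnable {
      override def run(): Unit = {
        if (done.getCount > 0) {
          // Logical batch time: a multiple of the batch duration.
          val batchTime = nextBatchTime.addAndGet(batchMillis)
          seen += batchTime
          // A failing batch is reported but must not stop the timer.
          try onBatch(batchTime)
          catch { case NonFatal(e) => println(s"batch $batchTime failed: $e") }
          finally done.countDown()
        }
      }
    }

    timer.scheduleAtFixedRate(tick, batchMillis, batchMillis, TimeUnit.MILLISECONDS)
    done.await()        // wait until the requested number of batches has fired
    timer.shutdownNow()
    seen.result()
  }
}
```

Calling `BatchTimerSketch.runBatches(20L, 3)(t => println(s"submit jobs for batch $t"))` fires the callback three times, 20 ms apart, with batch times 20, 40 and 60.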

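Two notions of Job are involved here:

First, each BatchInterval produces a concrete Job (strictly, one per output operation, collected in a JobSet). This Job is not a job in the Spark Core sense; it is just the DAG of RDDs generated from the DStreamGraph. From a Java point of view it is comparable to an instance of the Runnable interface. To run it, the Job has to be submitted to the JobScheduler, which uses a thread pool to pick a separate thread that submits the Job to the cluster (in fact, an RDD action invoked in that thread is what triggers the real job). Why use a thread pool?
a) Jobs are generated continuously, so a thread pool improves efficiency; this mirrors how an Executor runs Tasks in a thread pool.
b) The Jobs may be configured for FAIR scheduling, which also needs multithreading support.
Second, the Spark job that the Job above submits. Seen at that moment, it is no different from a job in Spark Core.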
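The "Job as a Runnable handed to a thread pool" idea can be sketched without Spark. Everything here is a simplification under stated assumptions: `JobPoolSketch`, `Job` and `runAll` are hypothetical names, and the deferred function stands in for the RDD action that a real streaming Job would invoke. In Spark itself, the size of the JobScheduler's pool is read from the `spark.streaming.concurrentJobs` setting, which defaults to 1.

```scala
import java.util.concurrent.{Callable, Executors, TimeUnit}

object JobPoolSketch {
  // Hypothetical simplification of a streaming Job: a batch time plus a deferred
  // function. Nothing is computed until run() is called, much like a Runnable.
  final case class Job(batchTime: Long, body: () => Long) {
    def run(): Long = body()
  }

  /** Runs each job on a pool thread and returns (batchTime, result) in submission order. */
  def runAll(jobs: Seq[Job], poolSize: Int): Seq[(Long, Long)] = {
    val pool = Executors.newFixedThreadPool(poolSize)
    try {
      // Submit everything first so jobs can overlap when poolSize > 1.
      val futures = jobs.toVector.map { job =>
        val task = new Callable[Long] { override def call(): Long = job.run() }
        job.batchTime -> pool.submit(task)
      }
      // Then wait for each result; the timeout is an arbitrary choice for the sketch.
      futures.map { case (time, future) => time -> future.get(10, TimeUnit.SECONDS) }
    } finally pool.shutdown()
  }
}
```

For example, `JobPoolSketch.runAll(Seq(JobPoolSketch.Job(20L, () => (1L to 10L).sum)), poolSize = 2)` runs the summation on a pool thread and pairs its result with the batch time.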

Let's walk through how a job runs:

1. First, instantiate SparkConf and set the runtime parameters.

val conf = new SparkConf().setAppName("UpdateStateByKeyDemo")

2. Instantiate StreamingContext, setting the batchDuration interval that controls how often Jobs are generated; this creates the entry point for Spark Streaming execution.

val ssc = new StreamingContext(conf,Seconds(20))

3. While StreamingContext is being instantiated, JobScheduler and JobGenerator are instantiated as well.

Line 183 of StreamingContext.scala:

private[streaming] val scheduler = new JobScheduler(this)

Line 50 of JobScheduler.scala:

private val jobGenerator = new JobGenerator(this)

4. StreamingContext calls its start method.

def start(): Unit = synchronized {
  state match {
    case INITIALIZED =>
      startSite.set(DStream.getCreationSite())
      StreamingContext.ACTIVATION_LOCK.synchronized {
        StreamingContext.assertNoOtherContextIsActive()
        try {
          validate()

          // Start the streaming scheduler in a new thread, so that thread local properties
          // like call sites and job groups can be reset without affecting those of the
          // current thread.
          ThreadUtils.runInNewThread("streaming-start") {
            sparkContext.setCallSite(startSite.get)
            sparkContext.clearJobGroup()
            sparkContext.setLocalProperty(SparkContext.SPARK_JOB_INTERRUPT_ON_CANCEL, "false")
            scheduler.start()
          }
          state = StreamingContextState.ACTIVE
        } catch {
          case NonFatal(e) =>
            logError("Error starting the context, marking it as stopped", e)
            scheduler.stop(false)
            state = StreamingContextState.STOPPED
            throw e
        }
        StreamingContext.setActiveContext(this)
      }
      shutdownHookRef = ShutdownHookManager.addShutdownHook(
        StreamingContext.SHUTDOWN_HOOK_PRIORITY)(stopOnShutdown)
      // Registering Streaming Metrics at the start of the StreamingContext
      assert(env.metricsSystem != null)
      env.metricsSystem.registerSource(streamingSource)
      uiTab.foreach(_.attach())
      logInfo("StreamingContext started")
    case ACTIVE =>
      logWarning("StreamingContext has already been started")
    case STOPPED =>
      throw new IllegalStateException("StreamingContext has already been stopped")
  }
}

5. StreamingContext.start() calls the start method of JobScheduler.

scheduler.start()

Inside JobScheduler.start(), an EventLoop is instantiated and EventLoop.start() is called to begin the message loop.

JobScheduler.start() also constructs the ReceiverTracker and calls the start methods of ReceiverTracker and JobGenerator:

def start(): Unit = synchronized {
  if (eventLoop != null) return // scheduler has already been started

  logDebug("Starting JobScheduler")
  eventLoop = new EventLoop[JobSchedulerEvent]("JobScheduler") {
    override protected def onReceive(event: JobSchedulerEvent): Unit = processEvent(event)

    override protected def onError(e: Throwable): Unit = reportError("Error in job scheduler", e)
  }
  eventLoop.start()

  // attach rate controllers of input streams to receive batch completion updates
  for {
    inputDStream <- ssc.graph.getInputStreams
    rateController <- inputDStream.rateController
  } ssc.addStreamingListener(rateController)

  listenerBus.start(ssc.sparkContext)
  receiverTracker = new ReceiverTracker(ssc)
  inputInfoTracker = new InputInfoTracker(ssc)
  receiverTracker.start()
  jobGenerator.start()
  logInfo("Started JobScheduler")
}
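
The message loop that JobScheduler.start() sets up follows a common pattern: a blocking queue plus one thread that takes events off the queue and dispatches them. The sketch below illustrates that pattern only; it is not Spark's EventLoop class, and `EventLoopSketch`, `Loop` and the event types are hypothetical names chosen for illustration.

```scala
import java.util.concurrent.{CountDownLatch, LinkedBlockingDeque}

object EventLoopSketch {
  // Hypothetical events, loosely modelled on scheduler events.
  sealed trait Event
  final case class JobStarted(batchTime: Long) extends Event
  final case class JobCompleted(batchTime: Long) extends Event
  case object Stop extends Event

  /** Minimal event loop: post() enqueues, one daemon thread dequeues and dispatches. */
  final class Loop(name: String)(onReceive: Event => Unit) {
    private val queue = new LinkedBlockingDeque[Event]()
    private val finished = new CountDownLatch(1)
    private val thread = new Thread(name) {
      override def run(): Unit = {
        try {
          var running = true
          while (running) {
            queue.take() match {          // blocks until an event is available
              case Stop  => running = false
              case event => onReceive(event)
            }
          }
        } finally finished.countDown()
      }
    }
    thread.setDaemon(true)

    def start(): Unit = thread.start()
    def post(event: Event): Unit = queue.put(event)
    /** Posts Stop and waits until every event queued before it has been handled. */
    def stop(): Unit = { post(Stop); finished.await() }
  }
}
```

Because a single thread drains the queue, handlers run one at a time and in posting order, so the handler needs no locking of its own.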

6. Once started, JobGenerator keeps generating Jobs, one batch per batchDuration.

/** Generate jobs and perform checkpoint for the given `time`.  */
private def generateJobs(time: Time) {
  // Set the SparkEnv in this thread, so that job generation code can access the environment
  // Example: BlockRDDs are created in this thread, and it needs to access BlockManager
  // Update: This is probably redundant after threadlocal stuff in SparkEnv has been removed.
  SparkEnv.set(ssc.env)
  Try {
    jobScheduler.receiverTracker.allocateBlocksToBatch(time) // allocate received blocks to batch
    graph.generateJobs(time) // generate jobs using allocated block
  } match {
    case Success(jobs) =>
      val streamIdToInputInfos = jobScheduler.inputInfoTracker.getInfo(time)
      jobScheduler.submitJobSet(JobSet(time, jobs, streamIdToInputInfos))
    case Failure(e) =>
      jobScheduler.reportError("Error generating jobs for time " + time, e)
  }
  eventLoop.post(DoCheckpoint(time, clearCheckpointDataLater = false))
}

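7. Once started, ReceiverTracker first launches the Receivers in the Spark cluster (in fact, a ReceiverSupervisor is started first in the Executor). When a Receiver receives data, it stores the data in the Executor through the ReceiverSupervisor and sends the data's metadata to the ReceiverTracker in the Driver. Inside ReceiverTracker, a ReceivedBlockTracker manages the received metadata.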

/** Start the endpoint and receiver execution thread. */
def start(): Unit = synchronized {
  if (isTrackerStarted) {
    throw new SparkException("ReceiverTracker already started")
  }

  if (!receiverInputStreams.isEmpty) {
    endpoint = ssc.env.rpcEnv.setupEndpoint(
      "ReceiverTracker", new ReceiverTrackerEndpoint(ssc.env.rpcEnv))
    if (!skipReceiverLaunch) launchReceivers()
    logInfo("ReceiverTracker started")
    trackerState = Started
  }
}
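
The bookkeeping described in step 7, together with the allocateBlocksToBatch call seen in generateJobs, can be pictured roughly as follows. This is a simplified model under stated assumptions, not the real ReceivedBlockTracker: `BlockTrackerSketch`, `BlockInfo` and `Tracker` are hypothetical names, and only metadata is tracked, as on the Driver side.

```scala
import scala.collection.mutable

object BlockTrackerSketch {
  // Hypothetical metadata record: the producing stream, a block id, a record count.
  final case class BlockInfo(streamId: Int, blockId: String, numRecords: Long)

  final class Tracker {
    // Blocks reported by receivers but not yet assigned to any batch, per stream.
    private val unallocated = mutable.Map.empty[Int, mutable.Queue[BlockInfo]]
    // Batch time -> blocks of each stream that belong to that batch.
    private val allocated = mutable.Map.empty[Long, Map[Int, Seq[BlockInfo]]]

    /** Called when a receiver reports that it has stored a block. */
    def addBlock(info: BlockInfo): Unit = synchronized {
      unallocated.getOrElseUpdate(info.streamId, mutable.Queue.empty[BlockInfo]) += info
    }

    /** Moves every not-yet-allocated block into the batch identified by `batchTime`. */
    def allocateBlocksToBatch(batchTime: Long): Unit = synchronized {
      val snapshot = unallocated.map { case (id, queue) => id -> queue.toList }.toMap
      unallocated.values.foreach(_.clear())
      allocated(batchTime) = snapshot
    }

    /** Blocks that the job generated for `batchTime` would read. */
    def blocksOfBatch(batchTime: Long): Map[Int, Seq[BlockInfo]] = synchronized {
      allocated.getOrElse(batchTime, Map.empty)
    }
  }
}
```

The point of the design is that a block belongs to exactly one batch: whatever arrived since the previous allocation is frozen under the new batch time, and later arrivals wait for the next one.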

II. The Spark Streaming fault-tolerance mechanism:

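A DStream relates to RDDs by continually producing RDDs as time passes; an operation on a DStream is an operation on the RDD of each fixed time interval. In a sense, then, DStream-based fault tolerance in Spark Streaming comes down to the fault tolerance of the RDD produced in each batch, which is one of the clever aspects of Spark Streaming's design.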

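Fault tolerance in Spark Streaming has two aspects to consider:

Recovery when the Driver fails
Use a checkpoint to record the Driver's runtime state; after a failure, the checkpoint can be read to restore the Driver state.
Recovery when an individual Job fails
Both Receiver failure and RDD computation failure have to be considered. A Receiver can use a write-ahead log (WAL). RDD fault tolerance is provided natively by Spark Core; based on the properties of RDDs, there are two main mechanisms: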

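01. Checkpoint-based:

Between stages the dependencies are wide and involve a shuffle; when the lineage chain becomes too complex and long, a checkpoint is needed.

02. Lineage-based:

In general Spark prefers lineage-based fault tolerance, because checkpointing a large dataset is expensive. Given how RDD dependencies work, the dependencies inside a stage are all narrow, so lineage-based recovery is usually convenient and efficient there.

Summary: use lineage within a stage and checkpoints between stages.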
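
The Driver-recovery and Receiver-WAL points above can be sketched as a small program. Treat it as an illustration under assumptions rather than a recommended setup: the checkpoint path, the socket source on localhost:9999 and the object name `CheckpointRecoverySketch` are made up for the example, and the master is expected to come from spark-submit. `StreamingContext.getOrCreate`, `ssc.checkpoint` and the `spark.streaming.receiver.writeAheadLog.enable` setting are part of Spark Streaming.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object CheckpointRecoverySketch {
  // Assumed path; in practice this should be on a fault-tolerant store such as HDFS.
  val checkpointDir = "/tmp/streaming-checkpoint"

  // Builds a fresh context. Only called when no usable checkpoint exists yet.
  def createContext(): StreamingContext = {
    val conf = new SparkConf()
      .setAppName("CheckpointRecoverySketch")
      // Have receivers write received data to a write-ahead log.
      .set("spark.streaming.receiver.writeAheadLog.enable", "true")
    val ssc = new StreamingContext(conf, Seconds(20))
    ssc.checkpoint(checkpointDir) // where Driver state is recorded
    // Assumed source for the example: a text socket on localhost:9999.
    val lines = ssc.socketTextStream("localhost", 9999)
    lines.count().print()
    ssc
  }

  def main(args: Array[String]): Unit = {
    // Restores the Driver state from the checkpoint if one is present,
    // otherwise falls back to createContext().
    val ssc = StreamingContext.getOrCreate(checkpointDir, () => createContext())
    ssc.start()
    ssc.awaitTermination()
  }
}
```

All DStream setup lives inside createContext() on purpose: when the context is restored from a checkpoint, that function is not called and the graph comes from the checkpoint instead.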

Notes:

1. DT Big Data Dream Factory (DT大數據夢工廠) WeChat public account: DT_Spark
2. IMF 8 p.m. big data hands-on live broadcast, YY channel: 68917580
3. Sina Weibo: http://www.weibo.com/ilovepains
