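The JobGenerator and ReceiverTracker objects are members of JobScheduler. A Spark Streaming application starts at val ssc = new StreamingContext(conf, batchDuration); once ssc.start() launches the Spark Streaming framework, the call chain reaches JobScheduler.start(), which in turn starts the ReceiverTracker and JobGenerator objects: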
def start(): Unit = synchronized {
  if (eventLoop != null) return // scheduler has already been started
  logDebug("Starting JobScheduler")
  eventLoop = new EventLoop[JobSchedulerEvent]("JobScheduler") {
    override protected def onReceive(event: JobSchedulerEvent): Unit = processEvent(event)
    override protected def onError(e: Throwable): Unit = reportError("Error in job scheduler", e)
  }
  eventLoop.start()
  // attach rate controllers of input streams to receive batch completion updates
  for {
    inputDStream <- ssc.graph.getInputStreams
    rateController <- inputDStream.rateController
  } ssc.addStreamingListener(rateController)
  listenerBus.start(ssc.sparkContext)
  receiverTracker = new ReceiverTracker(ssc)
  inputInfoTracker = new InputInfoTracker(ssc)
  receiverTracker.start()
  jobGenerator.start()
  logInfo("Started JobScheduler")
}
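JobScheduler has two very important members:
· JobGenerator
· ReceiverTracker
JobScheduler delegates the concrete generation of each batch's RDD DAG to JobGenerator, and delegates the bookkeeping of the source input data to ReceiverTracker.
JobGenerator itself has two crucial members: RecurringTimer and EventLoop. RecurringTimer controls when jobs are triggered: every batchInterval it puts a message into the EventLoop's queue. EventLoop keeps watching the message queue and handles each message as soon as it arrives. As time passes, JobGenerator keeps producing jobs at every BatchDuration interval, and it also drives checkpointing and the cleanup of earlier DStream data.
First look at JobGenerator's start method. It initializes checkpointing (checkpointWriter), instantiates and starts the EventLoop message loop, and starts the timer that generates jobs periodically: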
/** Start generation of jobs */
def start(): Unit = synchronized {
  if (eventLoop != null) return // generator has already been started
  // Call checkpointWriter here to initialize it before eventLoop uses it to avoid a deadlock.
  // See SPARK-10125
  checkpointWriter
  eventLoop = new EventLoop[JobGeneratorEvent]("JobGenerator") {
    override protected def onReceive(event: JobGeneratorEvent): Unit = processEvent(event)
    override protected def onError(e: Throwable): Unit = {
      jobScheduler.reportError("Error in job generator", e)
    }
  }
  eventLoop.start()
  if (ssc.isCheckpointPresent) {
    restart()
  } else {
    startFirstTime()
  }
}
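The EventLoop class holds a LinkedBlockingDeque that stores messages, plus a background thread. The background thread takes messages from the queue and calls onReceive to handle each one; here, onReceive is the override in the anonymous subclass, which simply delegates to processEvent.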
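To make the queue-plus-thread structure concrete, here is a minimal sketch of such a loop. MiniEventLoop and MiniEventLoopDemo are hypothetical names invented for this illustration; this is not Spark's EventLoop class, only a simplified model of the behaviour described above.

```scala
import java.util.concurrent.{CountDownLatch, LinkedBlockingDeque, TimeUnit}
import scala.collection.mutable.ArrayBuffer

// Hypothetical, simplified stand-in for an event loop; not Spark's real class.
abstract class MiniEventLoop[E](name: String) {
  private val queue = new LinkedBlockingDeque[E]()
  @volatile private var stopped = false

  private val thread = new Thread(name) {
    setDaemon(true)
    override def run(): Unit = {
      try {
        while (!stopped) {
          val event = queue.take() // blocks until a message is available
          try onReceive(event)
          catch { case e: Exception => onError(e) }
        }
      } catch {
        case _: InterruptedException => // stop() interrupts the blocking take()
      }
    }
  }

  def start(): Unit = thread.start()
  def stop(): Unit = { stopped = true; thread.interrupt(); thread.join() }
  def post(event: E): Unit = queue.put(event)

  protected def onReceive(event: E): Unit
  protected def onError(e: Throwable): Unit
}

object MiniEventLoopDemo {
  // Posts three events and returns them in the order the loop thread handled them.
  def run(): List[String] = {
    val seen = ArrayBuffer.empty[String]
    val done = new CountDownLatch(3)
    val loop = new MiniEventLoop[String]("demo-loop") {
      override protected def onReceive(event: String): Unit = {
        seen.synchronized { seen += event }
        done.countDown()
      }
      override protected def onError(e: Throwable): Unit = ()
    }
    loop.start()
    List("a", "b", "c").foreach(loop.post)
    done.await(5, TimeUnit.SECONDS)
    loop.stop()
    seen.synchronized { seen.toList }
  }
}
```

Because a single thread drains a FIFO queue, MiniEventLoopDemo.run() returns the events in the order they were posted. The callers of post never block on the handler, which is the point of the design.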
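processEvent pattern-matches on the message type and routes each message to the method that handles it. Message handling is generally handed off to another thread; the message loop itself should not run time-consuming business logic: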
/** Processes all events */
private def processEvent(event: JobGeneratorEvent) {
  logDebug("Got event " + event)
  event match {
    case GenerateJobs(time) => generateJobs(time)
    case ClearMetadata(time) => clearMetadata(time)
    case DoCheckpoint(time, clearCheckpointDataLater) =>
      doCheckpoint(time, clearCheckpointDataLater)
    case ClearCheckpointData(time) => clearCheckpointData(time)
  }
}
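The routing above relies on a closed set of message types. The following sketch shows the same pattern in isolation; the event classes are hypothetical copies that mirror only the shape of the messages matched in processEvent (times are plain Long values here), and route returns a description instead of doing any work.

```scala
object DispatchSketch {
  // Hypothetical event types mirroring the shape of the messages matched above.
  sealed trait GeneratorEvent
  final case class GenerateJobs(time: Long) extends GeneratorEvent
  final case class ClearMetadata(time: Long) extends GeneratorEvent
  final case class DoCheckpoint(time: Long, clearLater: Boolean) extends GeneratorEvent
  final case class ClearCheckpointData(time: Long) extends GeneratorEvent

  // Returns the name of the handler that would be invoked for each message.
  def route(event: GeneratorEvent): String = event match {
    case GenerateJobs(t)        => s"generateJobs($t)"
    case ClearMetadata(t)       => s"clearMetadata($t)"
    case DoCheckpoint(t, later) => s"doCheckpoint($t, $later)"
    case ClearCheckpointData(t) => s"clearCheckpointData($t)"
  }
}
```

Declaring the trait as sealed lets the compiler warn when a new message type is added without a matching case.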
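When a GenerateJobs message arrives, JobGenerator first obtains the data for the batch and then calls DStreamGraph's generateJobs method to generate the jobs: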
def generateJobs(time: Time): Seq[Job] = {
  logDebug("Generating jobs for time " + time)
  val jobs = this.synchronized {
    outputStreams.flatMap { outputStream =>
      val jobOption = outputStream.generateJob(time)
      jobOption.foreach(_.setCallSite(outputStream.creationSite))
      jobOption
    }
  }
  logDebug("Generated " + jobs.length + " jobs for time " + time)
  jobs
}
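The flatMap over outputStreams in the method above keeps only the streams that actually produce a job for the given time. A small sketch of that idiom follows, using hypothetical MiniOutputStream and MiniJob types in which a stream yields a job only when the time is a multiple of its slide interval.

```scala
object FlatMapOptionSketch {
  final case class MiniJob(time: Long, name: String)

  // Hypothetical output stream: produces a job only on multiples of its slide interval.
  final case class MiniOutputStream(name: String, slideMs: Long) {
    def generateJob(time: Long): Option[MiniJob] =
      if (time % slideMs == 0) Some(MiniJob(time, name)) else None
  }

  // flatMap drops every None and unwraps every Some, as in the method above.
  def generateJobs(streams: Seq[MiniOutputStream], time: Long): Seq[MiniJob] =
    streams.flatMap(_.generateJob(time))
}
```

With one stream sliding every 1000 ms and another every 2000 ms, time 3000 yields a single job and time 4000 yields two.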
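streamIdToInputInfos holds the input information for the given batch time. Once it has been obtained, a JobSet is built and passed to jobScheduler.submitJobSet, which hands the JobSet to JobScheduler to schedule and run the jobs.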
In the DStreamGraph.generateJobs method shown above, outputStreams holds the last DStreams of the DStream graph, that is, the output streams. outputStream.generateJob(time) works backwards from the final DStream, much as an RDD lineage is traced from the last RDD back to the first.
In the generateJob method, jobFunc wraps the call context.sparkContext.runJob(rdd, emptyFunc):
/**
 * Generate a SparkStreaming job for the given time. This is an internal method that
 * should not be called directly. This default implementation creates a job
 * that materializes the corresponding RDD. Subclasses of DStream may override this
 * to generate their own jobs.
 */
private[streaming] def generateJob(time: Time): Option[Job] = {
  getOrCompute(time) match {
    case Some(rdd) => {
      val jobFunc = () => {
        val emptyFunc = { (iterator: Iterator[T]) => {} }
        context.sparkContext.runJob(rdd, emptyFunc)
      }
      Some(new Job(time, jobFunc))
    }
    case None => None
  }
}
The Job class: calling its run method causes the func that was passed in to be invoked:
private[streaming]
class Job(val time: Time, func: () => _) {
  private var _id: String = _
  private var _outputOpId: Int = _
  private var isSet = false
  private var _result: Try[_] = null
  private var _callSite: CallSite = null
  private var _startTime: Option[Long] = None
  private var _endTime: Option[Long] = None
  def run() {
    _result = Try(func())
  }
  // ... remaining members omitted in this excerpt
}
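Two properties of this design are easy to miss: constructing a Job does not execute anything, and run never throws because the outcome is captured in a Try. The sketch below demonstrates both with a hypothetical MiniJob class that keeps only the func and result parts of the excerpt above.

```scala
import scala.util.{Failure, Success, Try}

object JobRunSketch {
  // Hypothetical, cut-down counterpart of the Job class shown above.
  final class MiniJob(val time: Long, func: () => Any) {
    private var _result: Try[Any] = null

    // Nothing executes until run() is called; failures are captured, not thrown.
    def run(): Unit = { _result = Try(func()) }

    def result: Try[Any] = _result
  }

  // Turns the captured outcome into a short status string.
  def describe(job: MiniJob): String = job.result match {
    case Success(v) => s"ok: $v"
    case Failure(e) => s"failed: ${e.getMessage}"
  }
}
```

A job built from () => 42 reports "ok: 42" after run(), while one built from a throwing function reports "failed: ..." and the caller's thread keeps going.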
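The getOrCompute method first looks up the given time in a HashMap to see whether the RDD already exists. If it does not, it calls compute to obtain the RDD, persists it if storageLevel requires it, marks it for checkpointing if a checkpoint time has been reached, and finally puts the RDD into the HashMap: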
private[streaming] final def getOrCompute(time: Time): Option[RDD[T]] = {
  // If RDD was already generated, then retrieve it from HashMap,
  // or else compute the RDD
  generatedRDDs.get(time).orElse {
    // Compute the RDD if time is valid (e.g. correct time in a sliding window)
    // of RDD generation, else generate nothing.
    if (isTimeValid(time)) {
      val rddOption = createRDDWithLocalProperties(time, displayInnerRDDOps = false) {
        // Disable checks for existing output directories in jobs launched by the streaming
        // scheduler, since we may need to write output to an existing directory during checkpoint
        // recovery; see SPARK-4835 for more details. We need to have this call here because
        // compute() might cause Spark jobs to be launched.
        PairRDDFunctions.disableOutputSpecValidation.withValue(true) {
          compute(time)
        }
      }
      rddOption.foreach { case newRDD =>
        // Register the generated RDD for caching and checkpointing
        if (storageLevel != StorageLevel.NONE) {
          newRDD.persist(storageLevel)
          logDebug(s"Persisting RDD ${newRDD.id} for time $time to $storageLevel")
        }
        if (checkpointDuration != null && (time - zeroTime).isMultipleOf(checkpointDuration)) {
          newRDD.checkpoint()
          logInfo(s"Marking RDD ${newRDD.id} for time $time for checkpointing")
        }
        generatedRDDs.put(time, newRDD)
      }
      rddOption
    } else {
      None
    }
  }
}
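The essence of getOrCompute is memoization keyed by batch time, plus a modulo test that decides when to checkpoint. Below is a minimal sketch of that logic under simplifying assumptions: MiniStream is a hypothetical class, "RDDs" are plain strings, times are millisecond Long values, and the validity rule (time after zeroTime and aligned to the slide interval) is an assumption made for the example.

```scala
import scala.collection.mutable

object GetOrComputeSketch {
  // Hypothetical stand-in for a DStream: "RDDs" are strings, times are millisecond Longs.
  final class MiniStream(zeroTime: Long, slideMs: Long, checkpointMs: Long) {
    private val generated = mutable.HashMap.empty[Long, String]
    var computeCalls = 0
    val checkpointed = mutable.ArrayBuffer.empty[Long]

    // Assumed validity rule: strictly after zeroTime and aligned to the slide interval.
    private def isTimeValid(time: Long): Boolean =
      time > zeroTime && (time - zeroTime) % slideMs == 0

    private def compute(time: Long): Option[String] = {
      computeCalls += 1
      Some(s"rdd@$time")
    }

    def getOrCompute(time: Long): Option[String] =
      generated.get(time).orElse {
        if (isTimeValid(time)) {
          val rddOption = compute(time)
          rddOption.foreach { rdd =>
            // Same test as isMultipleOf(checkpointDuration) in the code above.
            if ((time - zeroTime) % checkpointMs == 0) checkpointed += time
            generated.put(time, rdd)
          }
          rddOption
        } else None
      }
  }
}
```

Asking twice for the same valid time runs compute only once; an unaligned time returns None without touching the cache.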
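Back in the JobGenerator class: after the message loop has started, the start method shown earlier checks whether a checkpoint already exists. If one does, the state is read from the checkpoint directory and restart is called to restart JobGenerator; if this is the first run, startFirstTime is called.
JobGenerator's startFirstTime method starts the timer that generates jobs periodically.
The timer object is a RecurringTimer. Its start method starts a thread internally, and that thread repeatedly calls triggerActionForNextInterval: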
private[streaming]
class RecurringTimer(clock: Clock, period: Long, callback: (Long) => Unit, name: String)
  extends Logging {
  private val thread = new Thread("RecurringTimer - " + name) {
    setDaemon(true)
    override def run() { loop }
  }
  @volatile private var prevTime = -1L
  @volatile private var nextTime = -1L
  @volatile private var stopped = false
  /**
   * Get the time when this timer will fire if it is started right now.
   * The time will be a multiple of this timer's period and more than
   * current system time.
   */
  def getStartTime(): Long = {
    (math.floor(clock.getTimeMillis().toDouble / period) + 1).toLong * period
  }
  /**
   * Get the time when the timer will fire if it is restarted right now.
   * This time depends on when the timer was started the first time, and was stopped
   * for whatever reason. The time must be a multiple of this timer's period and
   * more than current time.
   */
  def getRestartTime(originalStartTime: Long): Long = {
    val gap = clock.getTimeMillis() - originalStartTime
    (math.floor(gap.toDouble / period).toLong + 1) * period + originalStartTime
  }
  /**
   * Start at the given start time.
   */
  def start(startTime: Long): Long = synchronized {
    nextTime = startTime
    thread.start()
    logInfo("Started timer for " + name + " at time " + nextTime)
    nextTime
  }
  /**
   * Start at the earliest time it can start based on the period.
   */
  def start(): Long = {
    start(getStartTime())
  }
  /**
   * Stop the timer, and return the last time the callback was made.
   *
   * @param interruptTimer True will interrupt the callback if it is in progress (not guaranteed to
   *                       give correct time in this case). False guarantees that there will be at
   *                       least one callback after `stop` has been called.
   */
  def stop(interruptTimer: Boolean): Long = synchronized {
    if (!stopped) {
      stopped = true
      if (interruptTimer) {
        thread.interrupt()
      }
      thread.join()
      logInfo("Stopped timer for " + name + " after time " + prevTime)
    }
    prevTime
  }
  private def triggerActionForNextInterval(): Unit = {
    clock.waitTillTime(nextTime)
    callback(nextTime)
    prevTime = nextTime
    nextTime += period
    logDebug("Callback for " + name + " called at time " + prevTime)
  }
  /**
   * Repeatedly call the callback every interval.
   */
  private def loop() {
    try {
      while (!stopped) {
        triggerActionForNextInterval()
      }
      triggerActionForNextInterval()
    } catch {
      case e: InterruptedException =>
    }
  }
}
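triggerActionForNextInterval waits until nextTime, which lies one BatchDuration after the previous firing, and then invokes callback. callback is the function passed in when the RecurringTimer is constructed, namely longTime => eventLoop.post(GenerateJobs(new Time(longTime))), so the timer keeps posting GenerateJobs messages to the message loop.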
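The firing times follow directly from the arithmetic in getStartTime and getRestartTime. The sketch below repeats those two formulas with the clock reading passed in as a parameter so the results can be checked by hand; TimerMathSketch and firingTimes are hypothetical helpers added for illustration.

```scala
object TimerMathSketch {
  // Same arithmetic as getStartTime, with the clock reading passed in explicitly.
  def startTime(nowMs: Long, periodMs: Long): Long =
    (math.floor(nowMs.toDouble / periodMs) + 1).toLong * periodMs

  // Same arithmetic as getRestartTime.
  def restartTime(nowMs: Long, originalStartMs: Long, periodMs: Long): Long = {
    val gap = nowMs - originalStartMs
    (math.floor(gap.toDouble / periodMs).toLong + 1) * periodMs + originalStartMs
  }

  // The first n times at which the callback would fire when started at nowMs.
  def firingTimes(nowMs: Long, periodMs: Long, n: Int): List[Long] =
    List.iterate(startTime(nowMs, periodMs), n)(_ + periodMs)
}
```

With a 1000 ms period and a clock reading of 12345, startTime gives 13000, and the first three firings are 13000, 14000 and 15000. If the clock reads exactly 13000, the result is 14000, since the start time is always strictly later than the current time. A timer originally started at 500 and restarted at 12345 resumes at 12500, staying aligned with its original phase.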
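To recap, the generateJobs method produces jobs in the following steps:
Step 1: Obtain the data for the current batch interval.
Step 2: Generate the jobs, together with the dependencies between their RDDs.
Step 3: Obtain the input information for each StreamId of the generated jobs.
Step 4: Wrap everything in a JobSet and hand it to JobScheduler.
Step 5: Perform the checkpoint operation.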
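The five steps can be sketched as a small, self-contained model. Everything here is hypothetical: MiniGenerator, its private helpers, and the fixed input-info map are invented for the example and only record the order in which the steps run; this is not the code of JobGenerator.generateJobs.

```scala
import scala.collection.mutable.ArrayBuffer
import scala.util.{Failure, Success, Try}

object GenerateJobsSketch {
  final case class MiniJob(time: Long, name: String)
  final case class MiniJobSet(time: Long, jobs: Seq[MiniJob], inputInfo: Map[Int, Long])

  // Records the order of the five steps; every collaborator here is hypothetical.
  final class MiniGenerator(outputNames: Seq[String]) {
    val log = ArrayBuffer.empty[String]
    val submitted = ArrayBuffer.empty[MiniJobSet]

    private def allocateReceivedData(time: Long): Unit = log += "allocate"
    private def graphGenerateJobs(time: Long): Seq[MiniJob] = {
      log += "generate"
      outputNames.map(MiniJob(time, _))
    }
    private def inputInfoFor(time: Long): Map[Int, Long] = {
      log += "inputInfo"
      Map(0 -> 100L) // assumed: stream 0 received 100 records in this batch
    }
    private def submitJobSet(jobSet: MiniJobSet): Unit = {
      log += "submit"
      submitted += jobSet
    }
    private def postCheckpoint(time: Long): Unit = log += "checkpoint"

    def generateJobs(time: Long): Unit = {
      Try {
        allocateReceivedData(time)                   // step 1
        graphGenerateJobs(time)                      // step 2
      } match {
        case Success(jobs) =>
          val info = inputInfoFor(time)              // step 3
          submitJobSet(MiniJobSet(time, jobs, info)) // step 4
        case Failure(e) =>
          log += s"error: ${e.getMessage}"
      }
      postCheckpoint(time)                           // step 5
    }
  }
}
```

Wrapping steps 1 and 2 in a Try is a design choice of this sketch: a failure while generating jobs is recorded, and the checkpoint step still runs for that batch time.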
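The submitJobSet method itself only puts the JobSet into a ConcurrentHashMap, wraps each Job in a JobHandler, and submits the handlers to the jobExecutor thread pool.
JobHandler implements the Runnable interface. Its run method calls the job's run method, which in turn invokes func, that is, the business logic defined on the DStream: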
private class JobHandler(job: Job) extends Runnable with Logging {
  import JobScheduler._
  def run() {
    try {
      val formattedTime = UIUtils.formatBatchTime(
        job.time.milliseconds, ssc.graph.batchDuration.milliseconds, showYYYYMMSS = false)
      val batchUrl = s"/streaming/batch/?id=${job.time.milliseconds}"
      val batchLinkText = s"[output operation ${job.outputOpId}, batch time ${formattedTime}]"
      ssc.sc.setJobDescription(
        s"""Streaming job from <a href="$batchUrl">$batchLinkText</a>""")
      ssc.sc.setLocalProperty(BATCH_TIME_PROPERTY_KEY, job.time.milliseconds.toString)
      ssc.sc.setLocalProperty(OUTPUT_OP_ID_PROPERTY_KEY, job.outputOpId.toString)
      // We need to assign `eventLoop` to a temp variable. Otherwise, because
      // `JobScheduler.stop(false)` may set `eventLoop` to null when this method is running, then
      // it's possible that when `post` is called, `eventLoop` happens to null.
      var _eventLoop = eventLoop
      if (_eventLoop != null) {
        _eventLoop.post(JobStarted(job, clock.getTimeMillis()))
        // Disable checks for existing output directories in jobs launched by the streaming
        // scheduler, since we may need to write output to an existing directory during checkpoint
        // recovery; see SPARK-4835 for more details.
        PairRDDFunctions.disableOutputSpecValidation.withValue(true) {
          job.run()
        }
        _eventLoop = eventLoop
        if (_eventLoop != null) {
          _eventLoop.post(JobCompleted(job, clock.getTimeMillis()))
        }
      } else {
        // JobScheduler has been stopped.
      }
    } finally {
      ssc.sc.setLocalProperty(JobScheduler.BATCH_TIME_PROPERTY_KEY, null)
      ssc.sc.setLocalProperty(JobScheduler.OUTPUT_OP_ID_PROPERTY_KEY, null)
    }
  }
}
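The hand-off from submitJobSet to the thread pool can be modelled in a few lines. The sketch below is an assumption-laden simplification: MiniScheduler, MiniJobHandler, the pool size parameter, and the started/completed counters (standing in for the JobStarted and JobCompleted messages) are all hypothetical names for this example.

```scala
import java.util.concurrent.{ConcurrentHashMap, Executors, TimeUnit}
import java.util.concurrent.atomic.AtomicInteger

object SubmitJobSetSketch {
  final class MiniJob(val id: Int, func: () => Unit) { def run(): Unit = func() }
  final case class MiniJobSet(time: Long, jobs: Seq[MiniJob])

  // Hypothetical scheduler: registers the job set, then runs each job on a pool thread.
  final class MiniScheduler(numConcurrentJobs: Int) {
    private val jobSets = new ConcurrentHashMap[Long, MiniJobSet]()
    private val jobExecutor = Executors.newFixedThreadPool(numConcurrentJobs)
    val started = new AtomicInteger(0)   // stands in for posting a "job started" message
    val completed = new AtomicInteger(0) // stands in for posting a "job completed" message

    private class MiniJobHandler(job: MiniJob) extends Runnable {
      override def run(): Unit = {
        started.incrementAndGet()
        job.run() // this is where the user's DStream logic would execute
        completed.incrementAndGet()
      }
    }

    def submitJobSet(jobSet: MiniJobSet): Unit = {
      if (jobSet.jobs.nonEmpty) {
        jobSets.put(jobSet.time, jobSet)
        jobSet.jobs.foreach(job => jobExecutor.execute(new MiniJobHandler(job)))
      }
    }

    def registeredJobSets: Int = jobSets.size()

    // Waits for submitted jobs to finish; returns true if they finished in time.
    def shutdown(): Boolean = {
      jobExecutor.shutdown()
      jobExecutor.awaitTermination(5, TimeUnit.SECONDS)
    }
  }
}
```

submitJobSet returns as soon as the handlers are queued, so the caller is never blocked by the jobs themselves; the pool size bounds how many jobs run at the same time.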