A Source-Code Analysis of Quartz's Recovery and Misfire Mechanisms

As a mature job-scheduling system, Quartz is carefully designed to cope with exceptions and crashes, so that the scheduling process forms a closed loop: problems arising at any stage are compensated for, as far as possible, by mechanisms built into the framework, which steer the system back onto a normal track.

One thing to make clear first: if a job throws an exception while it is executing, Quartz does nothing about it. That is a problem in the user's own code, and the framework does not need to handle it.
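
For completeness: a job can still opt in to retry behavior itself by wrapping the failure in a JobExecutionException. The sketch below only illustrates that API (the class FragileJob and its doWork method are invented for the example); by default Quartz simply reports the exception to the job listeners and moves on:

import org.quartz.Job;
import org.quartz.JobExecutionContext;
import org.quartz.JobExecutionException;

// Hypothetical job: Quartz will not retry a failed execution on its own, but the
// job may request an immediate re-fire by flagging the JobExecutionException.
public class FragileJob implements Job {
    public void execute(JobExecutionContext context) throws JobExecutionException {
        try {
            doWork();                                   // application logic (assumed)
        } catch (Exception e) {
            JobExecutionException jee = new JobExecutionException(e);
            jee.setRefireImmediately(true);             // explicit opt-in; not the default behavior
            throw jee;
        }
    }

    private void doWork() { /* ... */ }
}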

Quartz deals with two broad classes of abnormal situations, introduced below:

misfire (a trigger fails to be fired at its scheduled time)

fail-over (a failed node's work is transferred to another node)

1. Misfire

As the name suggests, a misfire means that Quartz failed to fire a trigger at the time it was supposed to, so the trigger's next fire time now lies in the past; under the normal scheduling flow, that trigger would never get another chance to be scheduled. Because every scheduler instance sleeps for a while between scheduling passes, there can be windows in which all instances are asleep and nothing fires the trigger. The scheduler therefore checks the triggers' nextFireTime values at regular intervals (on the order of 15 to 60 seconds) to see whether any trigger's next fire time has fallen far enough behind the current time. Quartz defines a 60-second window for this, the misfireThreshold: when a trigger's next fire time is more than 60 seconds earlier than the current time, the scheduler declares that trigger misfired and starts the corresponding process to restore it to a normal state. All of this happens in a thread class called MisfireHandler, which is started at scheduler initialization alongside the main scheduling thread class QuartzSchedulerThread.
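
The 60-second window is the misfireThreshold, which is configurable. Below is a minimal sketch of setting it programmatically, assuming an otherwise default scheduler setup (the instance name and thread count are arbitrary example values):

import java.util.Properties;

import org.quartz.Scheduler;
import org.quartz.SchedulerException;
import org.quartz.impl.StdSchedulerFactory;

public class MisfireThresholdConfig {
    public static Scheduler build() throws SchedulerException {
        Properties props = new Properties();
        props.put("org.quartz.scheduler.instanceName", "DemoScheduler");  // example name
        props.put("org.quartz.threadPool.threadCount", "5");              // example size
        // A waiting trigger whose next fire time lags the current time by more than
        // this many milliseconds is treated as misfired (Quartz's default is 60000).
        props.put("org.quartz.jobStore.misfireThreshold", "60000");
        return new StdSchedulerFactory(props).getScheduler();
    }
}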

Here is the logic Quartz uses to detect misfires:

protected RecoverMisfiredJobsResult doRecoverMisfires() throws JobPersistenceException {
    boolean transOwner = false;
    Connection conn = getNonManagedTXConnection();
    try {
        RecoverMisfiredJobsResult result = RecoverMisfiredJobsResult.NO_OP;

        // Before we make the potentially expensive call to acquire the
        // trigger lock, peek ahead to see if it is likely we would find
        // misfired triggers requiring recovery.
        // Count the misfired triggers.
        int misfireCount = (getDoubleCheckLockMisfireHandler()) ?
            getDelegate().countMisfiredTriggersInState(
                conn, STATE_WAITING, getMisfireTime()) :
            Integer.MAX_VALUE;
        // No misfired triggers: do nothing.
        if (misfireCount == 0) {
            getLog().debug(
                "Found 0 triggers that missed their scheduled fire-time.");
        } else {
            transOwner = getLockHandler().obtainLock(conn, LOCK_TRIGGER_ACCESS);
            // Misfired triggers were found: start the recovery routine to handle them.
            result = recoverMisfiredJobs(conn, false);
        }

        commitConnection(conn);
        return result;
    } catch (JobPersistenceException e) {
        rollbackConnection(conn);
        throw e;
    } catch (SQLException e) {
        rollbackConnection(conn);
        throw new JobPersistenceException("Database error recovering from misfires.", e);
    } catch (RuntimeException e) {
        rollbackConnection(conn);
        throw new JobPersistenceException("Unexpected runtime exception: "
                + e.getMessage(), e);
    } finally {
        try {
            releaseLock(LOCK_TRIGGER_ACCESS, transOwner);
        } finally {
            cleanupConnection(conn);
        }
    }
}
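
Both this check and the recovery query below compare triggers' next fire times against getMisfireTime(). Conceptually that value is just "now minus the misfire threshold"; the following is an approximation of it, not the verbatim Quartz source:

public class MisfireTimeSketch {
    // A waiting trigger whose next fire time is earlier than
    // (now - misfireThreshold) is considered misfired.
    static long misfireTime(long misfireThresholdMillis) {
        long t = System.currentTimeMillis();
        if (misfireThresholdMillis > 0) {
            t -= misfireThresholdMillis;
        }
        return (t > 0) ? t : 0;
    }
}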

 

Here is the key code for how Quartz handles misfires:

protected RecoverMisfiredJobsResult recoverMisfiredJobs(
        Connection conn, boolean recovering)
        throws JobPersistenceException, SQLException {
    // If recovering, we want to handle all of the misfired
    // triggers right away.
    int maxMisfiresToHandleAtATime =
        (recovering) ? -1 : getMaxMisfiresToHandleAtATime();
    // List used to collect the keys of the misfired triggers.
    List<TriggerKey> misfiredTriggers = new LinkedList<TriggerKey>();
    long earliestNewTime = Long.MAX_VALUE;

    // The list is passed in by reference; the delegate fills it with the
    // keys of the misfired triggers it finds.
    boolean hasMoreMisfiredTriggers =
        getDelegate().hasMisfiredTriggersInState(
            conn, STATE_WAITING, getMisfireTime(),
            maxMisfiresToHandleAtATime, misfiredTriggers);
    if (hasMoreMisfiredTriggers) {
        getLog().info(
            "Handling the first " + misfiredTriggers.size() +
            " triggers that missed their scheduled fire-time.  " +
            "More misfired triggers remain to be processed.");
    } else if (misfiredTriggers.size() > 0) {
        getLog().info(
            "Handling " + misfiredTriggers.size() +
            " trigger(s) that missed their scheduled fire-time.");
    } else {
        getLog().debug(
            "Found 0 triggers that missed their scheduled fire-time.");
        return RecoverMisfiredJobsResult.NO_OP;
    }
    // Iterate over the misfired triggers.
    for (TriggerKey triggerKey: misfiredTriggers) {
        // Load the full trigger definition.
        OperableTrigger trig =
            retrieveTrigger(conn, triggerKey);
        if (trig == null) {
            continue;
        }
        // Recompute the trigger's next fire time according to its type and its
        // configured misfire instruction, then persist the change to the database.
        doUpdateOfMisfiredTrigger(conn, trig, false, STATE_WAITING, recovering);
        if (trig.getNextFireTime() != null && trig.getNextFireTime().getTime() < earliestNewTime)
            earliestNewTime = trig.getNextFireTime().getTime();
    }
    return new RecoverMisfiredJobsResult(
            hasMoreMisfiredTriggers, misfiredTriggers.size(), earliestNewTime);
}

The key point in handling a misfired trigger is how its nextFireTime is recalculated, and for each trigger type there are several misfire policies to choose from.
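
Inside doUpdateOfMisfiredTrigger, the policy-specific work is delegated to the trigger itself through OperableTrigger.updateAfterMisfire(Calendar). The sketch below is a simplified view of that step (the wrapper class and method are not part of Quartz):

import org.quartz.Calendar;
import org.quartz.spi.OperableTrigger;

// Each trigger type implements updateAfterMisfire() according to the misfire
// instruction it was built with, recomputing its next fire time (possibly to
// null when nothing remains to fire).
public class MisfirePolicyStep {
    public static boolean applyMisfirePolicy(OperableTrigger trig, Calendar cal) {
        trig.updateAfterMisfire(cal);                 // per-instruction recomputation
        // false: no further fire times remain, so the job store would mark the
        // trigger COMPLETE; true: it is stored back in the waiting state.
        return trig.getNextFireTime() != null;
    }
}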

Listed below are the misfire instructions available for the different trigger types (summarized from material found online); a short usage sketch follows the list:
CronTrigger

withMisfireHandlingInstructionDoNothing
- Does not fire the missed executions immediately
- Waits for the next time the cron expression matches, then continues firing on the normal cron schedule

withMisfireHandlingInstructionIgnoreMisfires
- Fires immediately, starting from the first missed fire time
- Replays all of the missed fire times
- Once the next computed fire time is later than the current time, resumes firing on the normal cron schedule

withMisfireHandlingInstructionFireAndProceed (default)
- Fires once immediately, treating the current time as the fire time
- Then continues firing on the normal cron schedule


SimpleTrigger

withMisfireHandlingInstructionFireNow
- Fires immediately, treating the current time as the fire time
- Executes the remaining repeat count up to FinalTime
- The repeat interval is measured from the moment of (re)scheduling, and FinalTime is recomputed from the remaining count and the current time
- The adjusted FinalTime is therefore slightly later than the FinalTime originally computed from startTime

withMisfireHandlingInstructionIgnoreMisfires
- Fires immediately, starting from the first missed fire time
- Replays all of the missed repetitions
- Once the next computed fire time is later than the current time, executes the remaining repetitions at the configured interval
- In total the trigger fires RepeatCount + 1 times

withMisfireHandlingInstructionNextWithExistingCount
- Does not fire immediately
- Waits for the next scheduled interval, then executes the remaining repeat count up to FinalTime
- The interval is computed from startTime, which also determines FinalTime
- Even if the trigger is paused and resumed in between, FinalTime stays the same


withMisfireHandlingInstructionNowWithExistingCount (default)
- Fires immediately, treating the current time as the fire time
- Executes the remaining repeat count up to FinalTime
- The repeat interval is measured from the moment of (re)scheduling, and FinalTime is recomputed from the remaining count and the current time
- The adjusted FinalTime is therefore slightly later than the FinalTime originally computed from startTime

withMisfireHandlingInstructionNextWithRemainingCount
- Does not fire immediately
- Waits for the next scheduled interval, then executes the remaining repeat count up to FinalTime
- The interval is computed from startTime, which also determines FinalTime
- Even if the trigger is paused and resumed in between, FinalTime stays the same

withMisfireHandlingInstructionNowWithRemainingCount
- Fires immediately, treating the current time as the fire time
- Executes the remaining repeat count up to FinalTime
- The repeat interval is measured from the moment of (re)scheduling, and FinalTime is recomputed from the remaining count and the current time
- The adjusted FinalTime is therefore slightly later than the FinalTime originally computed from startTime

MISFIRE_INSTRUCTION_RESCHEDULE_NOW_WITH_REMAINING_REPEAT_COUNT
- This instruction makes the trigger forget its originally configured startTime and repeat-count
- The trigger's repeat-count is reset to the number of remaining repetitions
- As a result, the original startTime and repeat-count values can no longer be recovered afterwards
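
As a usage sketch, a Quartz 2.x trigger selects its misfire instruction through its ScheduleBuilder; the identities and schedule values below are made-up examples:

import static org.quartz.CronScheduleBuilder.cronSchedule;
import static org.quartz.SimpleScheduleBuilder.simpleSchedule;
import static org.quartz.TriggerBuilder.newTrigger;

import org.quartz.Trigger;

public class MisfirePolicyExamples {

    // A cron trigger that, after a misfire, fires once right away and then
    // returns to its normal cron schedule (FireAndProceed, the default, made explicit).
    static Trigger cronTrigger() {
        return newTrigger()
                .withIdentity("cronTrigger", "examples")         // example identity
                .withSchedule(cronSchedule("0 0/5 * * * ?")       // every 5 minutes
                        .withMisfireHandlingInstructionFireAndProceed())
                .build();
    }

    // A simple trigger that, after a misfire, waits for the next interval and
    // keeps the repeat schedule computed from the original startTime.
    static Trigger simpleTrigger() {
        return newTrigger()
                .withIdentity("simpleTrigger", "examples")        // example identity
                .withSchedule(simpleSchedule()
                        .withIntervalInMinutes(10)
                        .withRepeatCount(20)
                        .withMisfireHandlingInstructionNextWithExistingCount())
                .build();
    }
}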

 

2. Fail-over

The other problem Quartz considers is a crash at run time. If a node in a cluster dies suddenly, the tasks it was executing are taken over and re-executed by the node that first discovers the crash; in other words, the failed node's work is transferred to other nodes, hence "fail-over". This recovery mechanism operates in clustered deployments, and the thread class that performs it is called ClusterManager; like the misfire handler, it is started when the scheduler is initialized. While running, it performs a check-in roughly every 15 seconds: a check-in updates this scheduler's LAST_CHECKIN_TIME column in the QRTZ2_SCHEDULER_STATE table to the current time and also looks at whether any other instance's column has stopped being updated. If some instance's check-in time is found to be older than the current time by about 15 seconds (the exact value depends on the configured check-in interval), that instance is judged to need recovery, and its recovery process is started: the triggers the failed instance was in the middle of firing are loaded, and for each of them a temporary, one-shot SimpleTrigger is added. When the scheduling flow next scans for triggers, these triggers are fired, so the executions that were cut short are folded back into the ordinary scheduling flow in the form of these special triggers; as long as scheduling keeps running normally, the recovered triggers are soon fired and executed.
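
Two practical points follow from this: fail-over requires the JDBC job store to run in clustered mode, and only jobs that request recovery are re-fired on another node (as the clusterRecover code further down shows). The configuration sketch below is illustrative; the data source name and other values are assumptions, not taken from this article, and FragileJob is the example job class sketched earlier:

import java.util.Properties;

import org.quartz.JobBuilder;
import org.quartz.JobDetail;
import org.quartz.Scheduler;
import org.quartz.SchedulerException;
import org.quartz.impl.StdSchedulerFactory;

public class ClusteredSchedulerConfig {

    public static Scheduler build() throws SchedulerException {
        Properties props = new Properties();
        props.put("org.quartz.scheduler.instanceId", "AUTO");
        // JDBC job store in clustered mode; the data source named "myDS" is
        // assumed to be defined elsewhere in the properties.
        props.put("org.quartz.jobStore.class", "org.quartz.impl.jdbcjobstore.JobStoreTX");
        props.put("org.quartz.jobStore.driverDelegateClass", "org.quartz.impl.jdbcjobstore.StdJDBCDelegate");
        props.put("org.quartz.jobStore.dataSource", "myDS");
        props.put("org.quartz.jobStore.tablePrefix", "QRTZ2_");
        props.put("org.quartz.jobStore.isClustered", "true");
        // How often this node refreshes LAST_CHECKIN_TIME and looks for dead peers.
        props.put("org.quartz.jobStore.clusterCheckinInterval", "15000");
        props.put("org.quartz.threadPool.threadCount", "5");
        return new StdSchedulerFactory(props).getScheduler();
    }

    // Only jobs built with requestRecovery(true) are re-executed after a node failure.
    public static JobDetail recoverableJob() {
        return JobBuilder.newJob(FragileJob.class)
                .withIdentity("recoverableJob", "examples")       // example identity
                .requestRecovery(true)
                .build();
    }
}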

The code below is the run method of the ClusterManager thread class. As you can see, it repeatedly calls the manage method, which contains the check-in and recovery logic.

public void run() {
    while (!shutdown) {
        if (!shutdown) {
            long timeToSleep = getClusterCheckinInterval();
            long transpiredTime = (System.currentTimeMillis() - lastCheckin);
            timeToSleep = timeToSleep - transpiredTime;
            if (timeToSleep <= 0) {
                timeToSleep = 100L;
            }
            if (numFails > 0) {
                // After a failure, each pass sleeps at least DbRetryInterval (15s by default).
                timeToSleep = Math.max(getDbRetryInterval(), timeToSleep);
            }

            try {
                Thread.sleep(timeToSleep);
            } catch (Exception ignore) {
            }
        }
        // Call manage(), which contains the main check-in and recovery logic.
        if (!shutdown && this.manage()) {
            signalSchedulingChangeImmediately(0L);
        }
    }//while !shutdown
}

The manage method mainly calls doCheckin, which carries the detailed check-in and recovery logic:

protected boolean doCheckin() throws JobPersistenceException {
    boolean transOwner = false;
    boolean transStateOwner = false;
    boolean recovered = false;
    Connection conn = getNonManagedTXConnection();
    try {
        // Other than the first time, always checkin first to make sure there is
        // work to be done before we acquire the lock (since that is expensive,
        // and is almost never necessary).  This must be done in a separate
        // transaction to prevent a deadlock under recovery conditions.
        List<SchedulerStateRecord> failedRecords = null;
        // On the very first check-in the database holds no row for this scheduler yet,
        // so that case is handled specially below. Otherwise clusterCheckIn is called
        // first and committed; clusterCheckIn itself first calls findFailedInstances to
        // look for triggers that need recovery and then refreshes this scheduler's
        // check-in time.
        if (!firstCheckIn) {
            failedRecords = clusterCheckIn(conn);
            commitConnection(conn);
        }

        if (firstCheckIn || (failedRecords.size() > 0)) {
            getLockHandler().obtainLock(conn, LOCK_STATE_ACCESS);
            transStateOwner = true;

            // Now that we own the lock, make sure we still have work to do.
            // The first time through, we also need to make sure we update/create our state record
            // On the first check-in run clusterCheckIn; otherwise only run findFailedInstances.
            failedRecords = (firstCheckIn) ? clusterCheckIn(conn) : findFailedInstances(conn);

            if (failedRecords.size() > 0) {
                getLockHandler().obtainLock(conn, LOCK_TRIGGER_ACCESS);
                //getLockHandler().obtainLock(conn, LOCK_JOB_ACCESS);
                transOwner = true;
                // Start the recovery process for the scheduler instances that need it.
                clusterRecover(conn, failedRecords);
                recovered = true;
            }
        }

        commitConnection(conn);
    } catch (JobPersistenceException e) {
        rollbackConnection(conn);
        throw e;
    } finally {
        try {
            releaseLock(LOCK_TRIGGER_ACCESS, transOwner);
        } finally {
            try {
                releaseLock(LOCK_STATE_ACCESS, transStateOwner);
            } finally {
                cleanupConnection(conn);
            }
        }
    }
    firstCheckIn = false;
    return recovered;
}

The treatment of the first check-in in this code is a little puzzling: whether or not this is the first check-in, clusterCheckIn seems to get called, and that method in turn calls findFailedInstances. See the code:

protected List<SchedulerStateRecord> clusterCheckIn(Connection conn)
    throws JobPersistenceException {
    List<SchedulerStateRecord> failedInstances = findFailedInstances(conn);

    try {
        // FUTURE_TODO: handle self-failed-out
        // check in...
        lastCheckin = System.currentTimeMillis();
        if(getDelegate().updateSchedulerState(conn, getInstanceId(), lastCheckin) == 0) {
            getDelegate().insertSchedulerState(conn, getInstanceId(),
                    lastCheckin, getClusterCheckinInterval());
        }

    } catch (Exception e) {
        throw new JobPersistenceException("Failure updating scheduler state when checking-in: "
                + e.getMessage(), e);
    }
    return failedInstances;
}

findFailedInstances also contains logic for dealing with orphaned trigger records: if a row in the QRTZ2_FIRED_TRIGGERS table carries a scheduler instance id that has no matching row in QRTZ2_SCHEDULER_STATE, that fired-trigger record is orphaned. For such a record only the instance id of its firer is known, and the other columns normally available from QRTZ2_SCHEDULER_STATE cannot be looked up, so the system has to give it special treatment (TODO: how it is handled).

That situation has to be dealt with before a new QRTZ2_SCHEDULER_STATE record is inserted, which is why doCheckin is arranged so that, before this scheduler's own data is added to the QRTZ2_SCHEDULER_STATE table, it first checks once for such orphaned data and handles it.

Back in doCheckin: once the list of scheduler instances that need recovery has been obtained, the recovery process is started. The code of clusterRecover is as follows:

protected void clusterRecover(Connection conn, List<SchedulerStateRecord> failedInstances)
    throws JobPersistenceException {
    if (failedInstances.size() > 0) {
        long recoverIds = System.currentTimeMillis();
        logWarnIfNonZero(failedInstances.size(),
                "ClusterManager: detected " + failedInstances.size()
                        + " failed or restarted instances.");
        try {
            // Iterate over the SchedulerStateRecords that need recovery.
            for (SchedulerStateRecord rec : failedInstances) {
                getLog().info(
                        "ClusterManager: Scanning for instance \""
                                + rec.getSchedulerInstanceId()
                                + "\"'s failed in-progress jobs.");
                // Load the fired-trigger records left behind by that instance.
                List<FiredTriggerRecord> firedTriggerRecs = getDelegate()
                        .selectInstancesFiredTriggerRecords(conn,
                                rec.getSchedulerInstanceId());
                int acquiredCount = 0;
                int recoveredCount = 0;
                int otherCount = 0;
                Set<TriggerKey> triggerKeys = new HashSet<TriggerKey>();
                // Iterate over the fired-trigger records.
                for (FiredTriggerRecord ftRec : firedTriggerRecs) {
                    TriggerKey tKey = ftRec.getTriggerKey();
                    JobKey jKey = ftRec.getJobKey();
                    triggerKeys.add(tKey);
                    // release blocked triggers..
                    if (ftRec.getFireInstanceState().equals(STATE_BLOCKED)) {
                        getDelegate()
                                .updateTriggerStatesForJobFromOtherState(
                                        conn, jKey,
                                        STATE_WAITING, STATE_BLOCKED);
                    } else if (ftRec.getFireInstanceState().equals(STATE_PAUSED_BLOCKED)) {
                        getDelegate()
                                .updateTriggerStatesForJobFromOtherState(
                                        conn, jKey,
                                        STATE_PAUSED, STATE_PAUSED_BLOCKED);
                    }
                    // release acquired triggers..
                    if (ftRec.getFireInstanceState().equals(STATE_ACQUIRED)) {
                        getDelegate().updateTriggerStateFromOtherState(
                                conn, tKey, STATE_WAITING,
                                STATE_ACQUIRED);
                        acquiredCount++;
                    // If the trigger's job requests recovery, run the recovery path for it.
                    } else if (ftRec.isJobRequestsRecovery()) {
                        // handle jobs marked for recovery that were not fully
                        // executed..
                        if (jobExists(conn, jKey)) {
                            // Build a one-shot SimpleTrigger to re-run the trigger that needs recovery.
                            @SuppressWarnings("deprecation")
                            SimpleTriggerImpl rcvryTrig = new SimpleTriggerImpl(
                                    "recover_"
                                            + rec.getSchedulerInstanceId()
                                            + "_"
                                            + String.valueOf(recoverIds++),
                                    Scheduler.DEFAULT_RECOVERY_GROUP,
                                    new Date(ftRec.getScheduleTimestamp()));
                            rcvryTrig.setJobName(jKey.getName());
                            rcvryTrig.setJobGroup(jKey.getGroup());
                            rcvryTrig.setMisfireInstruction(SimpleTrigger.MISFIRE_INSTRUCTION_IGNORE_MISFIRE_POLICY);
                            rcvryTrig.setPriority(ftRec.getPriority());
                            // Load the jobDataMap and record the original trigger's details in it.
                            JobDataMap jd = getDelegate().selectTriggerJobDataMap(conn, tKey.getName(), tKey.getGroup());
                            jd.put(Scheduler.FAILED_JOB_ORIGINAL_TRIGGER_NAME, tKey.getName());
                            jd.put(Scheduler.FAILED_JOB_ORIGINAL_TRIGGER_GROUP, tKey.getGroup());
                            jd.put(Scheduler.FAILED_JOB_ORIGINAL_TRIGGER_FIRETIME_IN_MILLISECONDS, String.valueOf(ftRec.getFireTimestamp()));
                            jd.put(Scheduler.FAILED_JOB_ORIGINAL_TRIGGER_SCHEDULED_FIRETIME_IN_MILLISECONDS, String.valueOf(ftRec.getScheduleTimestamp()));
                            rcvryTrig.setJobDataMap(jd);
                            rcvryTrig.computeFirstFireTime(null);
                            // Persist the one-shot SimpleTrigger so the normal scheduling flow can fire it.
                            storeTrigger(conn, rcvryTrig, null, false,
                                    STATE_WAITING, false, true);
                            recoveredCount++;
                        } else {
                            getLog()
                                    .warn(
                                            "ClusterManager: failed job '"
                                                    + jKey
                                                    + "' no longer exists, cannot schedule recovery.");
                            otherCount++;
                        }
                    } else {
                        otherCount++;
                    }
                    // free up stateful job's triggers
                    if (ftRec.isJobDisallowsConcurrentExecution()) {
                        getDelegate()
                                .updateTriggerStatesForJobFromOtherState(
                                        conn, jKey,
                                        STATE_WAITING, STATE_BLOCKED);
                        getDelegate()
                                .updateTriggerStatesForJobFromOtherState(
                                        conn, jKey,
                                        STATE_PAUSED, STATE_PAUSED_BLOCKED);
                    }
                }
                // Delete that instance's rows from the fired-triggers table.
                getDelegate().deleteFiredTriggers(conn,
                        rec.getSchedulerInstanceId());
                // Check if any of the fired triggers we just deleted were the last fired trigger
                // records of a COMPLETE trigger.
                int completeCount = 0;
                for (TriggerKey triggerKey : triggerKeys) {
                    if (getDelegate().selectTriggerState(conn, triggerKey).
                            equals(STATE_COMPLETE)) {
                        List<FiredTriggerRecord> firedTriggers =
                                getDelegate().selectFiredTriggerRecords(conn, triggerKey.getName(), triggerKey.getGroup());
                        if (firedTriggers.isEmpty()) {
                            if (removeTrigger(conn, triggerKey)) {
                                completeCount++;
                            }
                        }
                    }
                }
                logWarnIfNonZero(acquiredCount,
                        "ClusterManager: ......Freed " + acquiredCount
                                + " acquired trigger(s).");
                logWarnIfNonZero(completeCount,
                        "ClusterManager: ......Deleted " + completeCount
                                + " complete triggers(s).");
                logWarnIfNonZero(recoveredCount,
                        "ClusterManager: ......Scheduled " + recoveredCount
                                + " recoverable job(s) for recovery.");
                logWarnIfNonZero(otherCount,
                        "ClusterManager: ......Cleaned-up " + otherCount
                                + " other failed job(s).");
                if (!rec.getSchedulerInstanceId().equals(getInstanceId())) {
                    getDelegate().deleteSchedulerState(conn,
                            rec.getSchedulerInstanceId());
                }
            }
        } catch (Throwable e) {
            throw new JobPersistenceException("Failure recovering jobs: "
                    + e.getMessage(), e);
        }
    }
}

As you can see, the unfinished work of a node that needs recovery ends up being re-executed by other nodes in the form of newly created temporary triggers. This set of mechanisms is precisely what underpins Quartz's stability.
