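Java Concurrency: ThreadPoolExecutor Source Code Analysis

This post summarizes what I learned from reading the thread pool source code, based on JDK 1.8.

What is a thread pool

As the name suggests, a thread pool is an object that provides a set of reusable threads. Internally it holds a blocking queue of tasks waiting to run. The pool's threads execute those tasks by repeatedly taking the next one from the queue, rather than being destroyed after finishing a single task.

What a thread pool is for

Under high concurrency, creating a new thread for every task means a large number of thread creations and destructions. That overhead is significant and hurts the application's throughput.

A thread pool also caps the number of threads running at the same time, which avoids exhausting CPU resources and stalling the application.

Using a thread pool

Defining a thread pool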

ExecutorService executor = new ThreadPoolExecutor(1, 4, 20,
                TimeUnit.SECONDS, new ArrayBlockingQueue<>(10));

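This pool has a core size of 1, a maximum size of 4, an idle keep-alive time of 20 seconds for non-core threads, and an ArrayBlockingQueue of capacity 10 as its blocking queue. No thread factory or rejection policy is given, so the defaults are used (Executors.defaultThreadFactory() and AbortPolicy).

Adding tasks to the pool

The Executor interface declares the execute method. Pass it a Runnable implementation (the task), and the pool schedules and runs it.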

for (int i = 0; i < 12; i++) {
    executor.execute(() -> 
        System.out.println(Thread.currentThread().getName()));
}

ThreadPoolExecutor source code analysis

Defining the pool

public ThreadPoolExecutor(int corePoolSize,
                          int maximumPoolSize,
                          long keepAliveTime,
                          TimeUnit unit,
                          BlockingQueue<Runnable> workQueue,
                          ThreadFactory threadFactory,
                          RejectedExecutionHandler handler) {
        if (corePoolSize < 0 ||
            maximumPoolSize <= 0 ||
            maximumPoolSize < corePoolSize ||
            keepAliveTime < 0)
            throw new IllegalArgumentException();
        if (workQueue == null || threadFactory == null || handler == null)
            throw new NullPointerException();
        this.acc = System.getSecurityManager() == null ?
                null :
                AccessController.getContext();
        this.corePoolSize = corePoolSize;
        this.maximumPoolSize = maximumPoolSize;
        this.workQueue = workQueue;
        this.keepAliveTime = unit.toNanos(keepAliveTime);
        this.threadFactory = threadFactory;
        this.handler = handler;
    }

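This is the constructor that every other constructor ends up calling. It takes 7 parameters:

Core pool size
Maximum pool size (core threads + non-core threads)
Keep-alive time for idle non-core threads
Time unit of the keep-alive time
Blocking queue
Thread factory
Rejection policy used when the pool is saturated

The first 5 parameters are required. The last two can be omitted, in which case the defaults are used.
Note the third parameter: by default it applies only to non-core threads. To apply it to core threads as well, call the following method: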

allowCoreThreadTimeOut(true);
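
To make the effect of allowCoreThreadTimeOut concrete, here is a small sketch. The class name CoreTimeoutDemo, the 50 ms keep-alive, and the polling loop are my own choices for illustration and are not part of the JDK. It runs one task, lets the only core thread sit idle, and reads getPoolSize() once the thread has timed out.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class CoreTimeoutDemo {
    /** Returns the pool size observed after the core thread has idled past the keep-alive time. */
    static int poolSizeAfterIdle() throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 50, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        // Requires a keep-alive time greater than zero, otherwise it throws IllegalArgumentException.
        pool.allowCoreThreadTimeOut(true);
        pool.execute(() -> { });
        // Poll for up to about 5 seconds until the idle core thread has exited.
        for (int i = 0; i < 100 && pool.getPoolSize() > 0; i++) {
            Thread.sleep(50);
        }
        int size = pool.getPoolSize();
        pool.shutdown();
        return size;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("pool size after idling: " + poolSizeAfterIdle());
    }
}
```

Without the allowCoreThreadTimeOut(true) call, the same pool would keep its single core thread alive and the loop would simply run out of iterations.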

Thread pool state

Next, look at how the pool's internal state and its worker count are packed together.

    /**
     * The main pool control state, ctl, is an atomic integer packing
     * two conceptual fields
     *   workerCount, indicating the effective number of threads
     *   runState,    indicating whether running, shutting down etc
     *
     * In order to pack them into one int, we limit workerCount to
     * (2^29)-1 (about 500 million) threads rather than (2^31)-1 (2
     * billion) otherwise representable. 
     */
    private final AtomicInteger ctl = new AtomicInteger(ctlOf(RUNNING, 0));
    private static final int COUNT_BITS = Integer.SIZE - 3;
    private static final int CAPACITY   = (1 << COUNT_BITS) - 1;

    // runState is stored in the high-order bits
    private static final int RUNNING    = -1 << COUNT_BITS;
    private static final int SHUTDOWN   =  0 << COUNT_BITS;
    private static final int STOP       =  1 << COUNT_BITS;
    private static final int TIDYING    =  2 << COUNT_BITS;
    private static final int TERMINATED =  3 << COUNT_BITS;

    // Packing and unpacking ctl
    private static int runStateOf(int c)     { return c & ~CAPACITY; }
    private static int workerCountOf(int c)  { return c & CAPACITY; }
    private static int ctlOf(int rs, int wc) { return rs | wc; }

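Only part of the key comment is quoted.

ctl is an AtomicInteger that holds both workerCount and runState. To fit both values into a single int, workerCount is limited to 2^29 - 1 (about 500 million) instead of 2^31 - 1.

In other words, the author packs the worker count and the run state into a single int. The fields mean the following: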

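Field	Meaning	Value
COUNT_BITS	Number of bits used for the worker count	29
CAPACITY	Maximum worker count	(2^29)-1
RUNNING	Running	-2^29
SHUTDOWN	Shut down (accepts no new tasks, finishes existing ones)	0
STOP	Stopped (accepts no new tasks, interrupts tasks in progress)	2^29
TIDYING	All tasks terminated, worker count is 0	2^30
TERMINATED	terminated() has completed	2^29 + 2^30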

As the table shows, the numeric values of the 5 states increase in that order.
So whenever the state is >= SHUTDOWN, the pool no longer accepts new tasks.

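The three static methods

ctlOf(int rs, int wc)
Packs the run state and the worker count into one int: the high 3 bits hold the state, the low 29 bits hold the worker count.
runStateOf(int c)
Extracts the run state.
workerCountOf(int c)
Extracts the worker count.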
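
The packing is easier to see when run in isolation. The sketch below copies the constants and the three helper methods from the JDK 8 source quoted above into a standalone class. The class name CtlPackingDemo and the main method are mine, added only so the snippet can be executed.

```java
final class CtlPackingDemo {
    // Same definitions as in the ThreadPoolExecutor source quoted above.
    static final int COUNT_BITS = Integer.SIZE - 3;       // 29
    static final int CAPACITY   = (1 << COUNT_BITS) - 1;  // low 29 bits set
    static final int RUNNING    = -1 << COUNT_BITS;
    static final int SHUTDOWN   =  0 << COUNT_BITS;
    static final int STOP       =  1 << COUNT_BITS;

    static int runStateOf(int c)     { return c & ~CAPACITY; } // keep the high 3 bits
    static int workerCountOf(int c)  { return c & CAPACITY; }  // keep the low 29 bits
    static int ctlOf(int rs, int wc) { return rs | wc; }

    public static void main(String[] args) {
        int c = ctlOf(RUNNING, 5); // RUNNING state with 5 workers
        System.out.println("bits        = " + Integer.toBinaryString(c));
        System.out.println("is RUNNING  = " + (runStateOf(c) == RUNNING));
        System.out.println("workerCount = " + workerCountOf(c));
        System.out.println("CAPACITY    = " + CAPACITY);
    }
}
```

Because RUNNING is the only negative state, a check such as c < SHUTDOWN is enough to tell whether the pool is still running, whatever the worker count is.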

Executing a task: execute

public void execute(Runnable command) {
        if (command == null)
            throw new NullPointerException();
        /*
         * Proceed in 3 steps:
         *
         * 1. If fewer than corePoolSize threads are running, try to
         * start a new thread with the given command as its first
         * task.  The call to addWorker atomically checks runState and
         * workerCount, and so prevents false alarms that would add
         * threads when it shouldn't, by returning false.
         *
         * 2. If a task can be successfully queued, then we still need
         * to double-check whether we should have added a thread
         * (because existing ones died since last checking) or that
         * the pool shut down since entry into this method. So we
         * recheck state and if necessary roll back the enqueuing if
         * stopped, or start a new thread if there are none.
         *
         * 3. If we cannot queue task, then we try to add a new
         * thread.  If it fails, we know we are shut down or saturated
         * and so reject the task.
         */
        int c = ctl.get();
        if (workerCountOf(c) < corePoolSize) {
            if (addWorker(command, true))
                return;
            c = ctl.get();
        }
        if (isRunning(c) && workQueue.offer(command)) {
            int recheck = ctl.get();
            if (! isRunning(recheck) && remove(command))
                reject(command);
            else if (workerCountOf(recheck) == 0)
                addWorker(null, false);
        }
        else if (!addWorker(command, false))
            reject(command);
    }

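Doug Lea already documents the flow of this method clearly in the comment. In plainer terms:

1. If fewer than corePoolSize threads are running, start a new thread and hand it the task as its firstTask.
2. If the core threads are all in use, add the task to the blocking queue, then check again (a worker may have died since the first check, or the pool may have been shut down after this method was entered). If the pool is no longer running, remove the task from the queue and reject it. If the recheck finds no worker threads left, start a new thread with no firstTask so that it takes the task from the queue.
3. If the queue is full, start a new non-core thread. If that fails, the pool has either been shut down or both the queue and the maximum pool size are exhausted, so the task is rejected.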
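
The three steps can be observed from the outside with the pool defined earlier (core 1, max 4, queue capacity 10). This is only a sketch: the class name ExecuteFlowDemo and the latch that keeps every task blocked are my own additions, there to stop workers from draining the queue while tasks are being submitted.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class ExecuteFlowDemo {
    /** Submits blocking tasks until one is rejected; returns {accepted, largestPoolSize, queued}. */
    static int[] run() throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 4, 20, TimeUnit.SECONDS, new ArrayBlockingQueue<>(10));
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocking = () -> {
            try {
                release.await(); // keep the worker busy until the demo is done
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        int accepted = 0;
        try {
            while (true) {
                pool.execute(blocking);
                accepted++;
            }
        } catch (RejectedExecutionException e) {
            // Step 3 failed: queue full and maximumPoolSize reached (default AbortPolicy throws).
        }
        int queued = pool.getQueue().size();
        int largest = pool.getLargestPoolSize();
        release.countDown();
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return new int[] { accepted, largest, queued };
    }

    public static void main(String[] args) throws InterruptedException {
        int[] r = run();
        System.out.println("accepted=" + r[0] + " largestPoolSize=" + r[1] + " queued=" + r[2]);
    }
}
```

With every task blocked, task 1 goes to the core thread, tasks 2 to 11 fill the queue, tasks 12 to 14 each start a non-core thread, and task 15 is rejected.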

Adding a worker thread: addWorker

private boolean addWorker(Runnable firstTask, boolean core) {
    retry:
    for (;;) {
        int c = ctl.get();
        int rs = runStateOf(c);

        // Check if queue empty only if necessary.
        if (rs >= SHUTDOWN &&
            ! (rs == SHUTDOWN &&
               firstTask == null &&
               ! workQueue.isEmpty()))
            return false;

        for (;;) {
            // Current worker count
            int wc = workerCountOf(c);
            // Fail if the count has reached CAPACITY or the applicable bound:
            // corePoolSize when core is true, maximumPoolSize otherwise
            if (wc >= CAPACITY ||
                wc >= (core ? corePoolSize : maximumPoolSize))
                return false;
            // The CAS incremented workerCount: leave the outer retry loop
            if (compareAndIncrementWorkerCount(c))
                break retry;
            // The CAS failed: if the run state changed, restart the outer loop to recheck it
            c = ctl.get();  // Re-read ctl
            if (runStateOf(c) != rs)
                continue retry;
            // else CAS failed due to workerCount change; retry inner loop
        }
    }

    boolean workerStarted = false;
    boolean workerAdded = false;
    Worker w = null;
    try {
        w = new Worker(firstTask);
        final Thread t = w.thread;
        if (t != null) {
            final ReentrantLock mainLock = this.mainLock;
            mainLock.lock();
            try {
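                // Recheck the state now that the lock is held
                int rs = runStateOf(ctl.get());

                // Proceed if RUNNING, or if SHUTDOWN with a null firstTask (the worker will drain the queue)
                if (rs < SHUTDOWN ||
                    (rs == SHUTDOWN && firstTask == null)) {
                    // The thread has already been started: throw
                    if (t.isAlive()) // precheck that t is startable
                        throw new IllegalThreadStateException();
                    // Add the worker to the workers set
                    workers.add(w);
                    // Track the largest pool size reached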
                    int s = workers.size();
                    if (s > largestPoolSize)
                        largestPoolSize = s;
                    workerAdded = true;
                }
            } finally {
                mainLock.unlock();
            }
            // Start the thread
            if (workerAdded) {
                t.start();
                workerStarted = true;
            }
        }
    } finally {
        // The worker could not be started: roll back
        if (! workerStarted)
            addWorkerFailed(w);
    }
    return workerStarted;
}

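firstTask
The first parameter of addWorker is firstTask. It is a field of the pool's Worker object and represents the first task the newly started thread will run.
As the execute source shows, firstTask is only non-null when a new thread is created for the task. If the task has been put on the queue instead, null is passed and the thread fetches its tasks from the blocking queue.

Before adding a worker, the method checks whether the queue is empty, but only when necessary: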

if (rs >= SHUTDOWN &&
    ! (rs == SHUTDOWN &&
       firstTask == null &&
       ! workQueue.isEmpty()))
    return false;

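1. If the state is greater than SHUTDOWN, no new work is accepted: return false.
2. If the state is SHUTDOWN and firstTask != null, return false: new tasks are not allowed.
3. If the state is SHUTDOWN and firstTask == null, the new thread would take its tasks from the queue. If workQueue.isEmpty() at that point, return false.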

addWorkerFailed
When adding the thread fails, the worker object that was just created is removed again.

private void addWorkerFailed(Worker w) {
    final ReentrantLock mainLock = this.mainLock;
    mainLock.lock();
    try {
        // Remove the worker from the HashSet
        if (w != null)
            workers.remove(w);
        // Decrement the worker count
        decrementWorkerCount();
        // Try to terminate the pool
        tryTerminate();
    } finally {
        mainLock.unlock();
    }
}

The internal thread wrapper: Worker

Worker is an inner class of ThreadPoolExecutor that holds the thread executing the tasks (only some fields and methods are shown).

private final class Worker extends AbstractQueuedSynchronizer
    implements Runnable {
    /** Thread that runs this worker's tasks */
    final Thread thread;
    /** Initial task to run, possibly null */
    Runnable firstTask;
    /** Per-worker counter of completed tasks */
    volatile long completedTasks;

    /**
     * Creates with given first task and thread from ThreadFactory.
     * @param firstTask the first task (null if none)
     */
    Worker(Runnable firstTask) {
        // With state -1 the lock cannot be acquired, so the worker cannot be interrupted yet
        setState(-1); // inhibit interrupts until runWorker
        this.firstTask = firstTask;
        // Create the thread through the thread factory
        this.thread = getThreadFactory().newThread(this);
    }

    /** Delegates main run loop to outer runWorker  */
    public void run() {
        runWorker(this);
    }
}

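Worker extends AbstractQueuedSynchronizer and uses it to control the worker's synchronization state.
A new worker calls setState(-1). With the state at -1 no other thread can acquire the worker's lock, so the thread cannot be interrupted yet (the lock can only be acquired while the state is 0). runWorker later calls w.unlock(), which resets the state to 0.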
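
The following is a stripped-down imitation of that locking scheme, not the JDK's Worker class. The class name WorkerLockSketch is hypothetical and the owner-thread bookkeeping of the real class is left out. It only shows how the three states (-1, 0, 1) behave.

```java
import java.util.concurrent.locks.AbstractQueuedSynchronizer;

final class WorkerLockSketch extends AbstractQueuedSynchronizer {
    WorkerLockSketch() {
        setState(-1); // same trick as Worker: not lockable until unlock() is called once
    }

    @Override
    protected boolean tryAcquire(int unused) {
        return compareAndSetState(0, 1); // succeeds only from state 0, so the lock is not reentrant
    }

    @Override
    protected boolean tryRelease(int unused) {
        setState(0);
        return true;
    }

    boolean tryLock() { return tryAcquire(1); }
    void unlock()     { release(1); }

    public static void main(String[] args) {
        WorkerLockSketch w = new WorkerLockSketch();
        System.out.println("before start: tryLock = " + w.tryLock()); // state is -1
        w.unlock();                                                   // what runWorker does first
        System.out.println("idle:         tryLock = " + w.tryLock()); // state 0 -> 1
        System.out.println("busy:         tryLock = " + w.tryLock()); // state is already 1
    }
}
```

The failing tryLock in the busy state is what interruptIdleWorkers relies on further down: a worker whose lock cannot be taken is running a task and is left alone.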

Running tasks: runWorker
This is where the pool actually executes tasks.

final void runWorker(Worker w) {
    Thread wt = Thread.currentThread();
    Runnable task = w.firstTask;
    w.firstTask = null;
    w.unlock(); // allow interrupts
    // Tracks whether the worker exits because a task threw an exception
    boolean completedAbruptly = true;
    try {
        // Get the task to run; if firstTask is null, take one from the queue
        while (task != null || (task = getTask()) != null) {
            w.lock();
            // If pool is stopping, ensure thread is interrupted;
            // if not, ensure thread is not interrupted.  This
            // requires a recheck in second case to deal with
            // shutdownNow race while clearing interrupt
            if ((runStateAtLeast(ctl.get(), STOP) ||
                 (Thread.interrupted() &&
                  runStateAtLeast(ctl.get(), STOP))) &&
                !wt.isInterrupted())
                wt.interrupt();
            try {
                // Hook: custom logic to run before the task
                beforeExecute(wt, task);
                Throwable thrown = null;
                try {
                    // Run the task
                    task.run();
                } catch (RuntimeException x) {
                    thrown = x; throw x;
                } catch (Error x) {
                    thrown = x; throw x;
                } catch (Throwable x) {
                    thrown = x; throw new Error(x);
                } finally {
                    // Hook: called after the task has run
                    afterExecute(task, thrown);
                }
            } finally {
                task = null;
                w.completedTasks++;
                w.unlock();
            }
        }
        // The loop ended normally (getTask returned null), so clear the abrupt flag
        completedAbruptly = false;
    } finally {
        // Run the worker's exit handling
        processWorkerExit(w, completedAbruptly);
    }
}

Checking the pool state

if ((runStateAtLeast(ctl.get(), STOP) ||
             (Thread.interrupted() &&
              runStateAtLeast(ctl.get(), STOP))) &&
            !wt.isInterrupted())
            wt.interrupt();

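1. If runState >= STOP (in the STOP state the pool must interrupt running tasks) and the thread is not yet interrupted, interrupt it.
2. If runState < STOP, Thread.interrupted() returns the thread's interrupt status and clears it, so a stale interrupt flag is removed before the task runs. If the flag was set, the state is checked a second time, because shutdownNow may have been called after the first read. If runState is now >= STOP, the thread is interrupted again.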
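
The two hook methods seen in runWorker, beforeExecute and afterExecute, are protected and meant to be overridden. Here is a minimal sketch of a subclass that counts the calls and records the exception passed to afterExecute. HookedPool, its fields, and the silent uncaught-exception handler are hypothetical names and choices made for this example.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

final class HookedPool extends ThreadPoolExecutor {
    final AtomicInteger before = new AtomicInteger();
    final AtomicInteger after = new AtomicInteger();
    final AtomicReference<Throwable> lastThrown = new AtomicReference<>();

    HookedPool() {
        super(1, 1, 0, TimeUnit.SECONDS, new LinkedBlockingQueue<>(), r -> {
            Thread t = new Thread(r);
            t.setUncaughtExceptionHandler((thread, ex) -> { }); // keep the demo output quiet
            return t;
        });
    }

    @Override
    protected void beforeExecute(Thread t, Runnable r) {
        super.beforeExecute(t, r);
        before.incrementAndGet();
    }

    @Override
    protected void afterExecute(Runnable r, Throwable t) {
        super.afterExecute(r, t);
        after.incrementAndGet();
        if (t != null) lastThrown.set(t); // non-null when a task passed to execute() threw
    }

    /** Runs one normal task and one failing task, then waits for the pool to terminate. */
    static HookedPool runDemo() throws InterruptedException {
        HookedPool pool = new HookedPool();
        pool.execute(() -> { });
        pool.execute(() -> { throw new IllegalStateException("boom"); });
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return pool;
    }

    public static void main(String[] args) throws InterruptedException {
        HookedPool p = runDemo();
        System.out.println("before=" + p.before + " after=" + p.after + " thrown=" + p.lastThrown.get());
    }
}
```

Note that afterExecute still runs for the failing task because it sits in a finally block in runWorker.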

Fetching a task: getTask

private Runnable getTask() {
    // Whether the poll in the previous iteration timed out
    boolean timedOut = false; 

    for (;;) {
        int c = ctl.get();
        int rs = runStateOf(c);

        // Check the pool state
        if (rs >= SHUTDOWN && (rs >= STOP || workQueue.isEmpty())) {
            decrementWorkerCount();
            return null;
        }

        int wc = workerCountOf(c);

        // Whether this worker should exit after being idle for keepAliveTime
        boolean timed = allowCoreThreadTimeOut || wc > corePoolSize;

        // Check the worker count
        if ((wc > maximumPoolSize || (timed && timedOut))
            && (wc > 1 || workQueue.isEmpty())) {
            if (compareAndDecrementWorkerCount(c))
                return null;
            continue;
        }

        try {
            Runnable r = timed ?
                workQueue.poll(keepAliveTime, TimeUnit.NANOSECONDS) :
                workQueue.take();
            if (r != null)
                return r;
            timedOut = true;
        } catch (InterruptedException retry) {
            timedOut = false;
        }
    }
}

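1. Loop until a task is obtained or the worker is no longer needed.
2. If the pool state is >= STOP, decrement workerCount and return null.
If the state is SHUTDOWN and the queue is empty, decrement workerCount and return null.
3. (wc > maximumPoolSize || (timed && timedOut))
wc exceeds the maximum pool size, or the thread has been idle for keepAliveTime and is allowed to time out.
(wc > 1 || workQueue.isEmpty())
wc > 1, or the queue is empty.
If both conditions hold, this thread does not need to fetch another task: decrement workerCount and return null.

timedOut records whether the poll in the previous iteration timed out (that is, the thread has been idle for keepAliveTime).
timed records whether the thread should be destroyed once it has been idle that long.

4. If timed == true, call poll and wait up to keepAliveTime.
Otherwise call take and block until a task arrives (reaching this step means the pool is RUNNING, or SHUTDOWN with a non-empty workQueue).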
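
The timed/untimed distinction in step 4 shows up as a pool that shrinks back to its core size. The sketch below is an illustration under my own assumptions (class name KeepAliveDemo, a 50 ms keep-alive, a polling loop to wait for the timeout). It starts one core and one non-core thread and then lets both go idle.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class KeepAliveDemo {
    /** Returns {poolSizeWhileBusy, poolSizeAfterIdle}. */
    static int[] run() throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 50, TimeUnit.MILLISECONDS, new ArrayBlockingQueue<>(1));
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocking = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        // Task 1 -> core thread, task 2 -> queue, task 3 -> non-core thread.
        for (int i = 0; i < 3; i++) {
            pool.execute(blocking);
        }
        int busy = pool.getPoolSize();
        release.countDown();
        // While wc > corePoolSize the workers use the timed poll; one of them times out and exits.
        for (int i = 0; i < 100 && pool.getPoolSize() > 1; i++) {
            Thread.sleep(50);
        }
        int idle = pool.getPoolSize();
        pool.shutdown();
        return new int[] { busy, idle };
    }

    public static void main(String[] args) throws InterruptedException {
        int[] r = run();
        System.out.println("busy=" + r[0] + " idle=" + r[1]);
    }
}
```

The pool never drops below one thread here: once the worker count equals corePoolSize, timed becomes false and the remaining worker blocks in take().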

Worker exit: processWorkerExit
Once runWorker's task loop ends, the worker's exit handling runs in the finally block.

private void processWorkerExit(Worker w, boolean completedAbruptly) {
    // On an abrupt exit, decrement workerCount here (on a normal exit getTask has already done so)
    if (completedAbruptly)
        decrementWorkerCount();

    final ReentrantLock mainLock = this.mainLock;
    mainLock.lock();
    try {
        completedTaskCount += w.completedTasks;
        // Remove the exiting worker from the workers set
        workers.remove(w);
    } finally {
        mainLock.unlock();
    }
    // Try to terminate the pool
    tryTerminate();

    int c = ctl.get();
    // State is RUNNING or SHUTDOWN
    if (runStateLessThan(c, STOP)) {
        // An abrupt exit skips this block and always adds a replacement worker
        if (!completedAbruptly) {
            // Minimum number of workers to keep
            int min = allowCoreThreadTimeOut ? 0 : corePoolSize;
            if (min == 0 && ! workQueue.isEmpty())
                min = 1;
            // Enough workers remain (>= min): no replacement needed
            if (workerCountOf(c) >= min)
                return; // replacement not needed
        }
        addWorker(null, false);
    }
}
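
The replacement branch at the end of processWorkerExit can be checked with a single-thread pool whose first task throws. This is a sketch with names of my own (WorkerReplacementDemo, the two AtomicReference fields); the thread factory only exists to silence the stack trace of the failing task.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

final class WorkerReplacementDemo {
    /** Returns true if the task submitted after the failing one ran on a different thread. */
    static boolean replaced() throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0, TimeUnit.SECONDS, new LinkedBlockingQueue<>(), r -> {
                    Thread t = new Thread(r);
                    t.setUncaughtExceptionHandler((thread, ex) -> { }); // suppress the stack trace
                    return t;
                });
        AtomicReference<Thread> first = new AtomicReference<>();
        AtomicReference<Thread> second = new AtomicReference<>();
        pool.execute(() -> {
            first.set(Thread.currentThread());
            throw new IllegalStateException("boom"); // kills this worker (completedAbruptly = true)
        });
        pool.execute(() -> second.set(Thread.currentThread()));
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return first.get() != null && second.get() != null && first.get() != second.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("worker was replaced: " + replaced());
    }
}
```

The exception escapes runWorker, so the original thread ends and the second task has to run on a newly created thread.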

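One important step in this method is the call to tryTerminate, which attempts to terminate the pool.
The following sections look at how the pool shuts down.

tryTerminate + awaitTermination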

final void tryTerminate() {
    for (;;) {
        int c = ctl.get();
        // 1. In the STOP state, continue below
        // 2. In the SHUTDOWN state with an empty queue, continue below; in all other cases return
        if (isRunning(c) ||
            runStateAtLeast(c, TIDYING) ||
            (runStateOf(c) == SHUTDOWN && ! workQueue.isEmpty()))
            return;
        // If the worker count is not 0, interrupt at most one idle worker and return
        if (workerCountOf(c) != 0) { // Eligible to terminate
            interruptIdleWorkers(ONLY_ONE);
            return;
        }

        // Reaching this point means the worker count is 0 and no tasks remain
        final ReentrantLock mainLock = this.mainLock;
        mainLock.lock();
        try {
            // Set the pool state to TIDYING
            if (ctl.compareAndSet(c, ctlOf(TIDYING, 0))) {
                try {
                    // Hook method; subclasses may override it
                    terminated();
                } finally {
                    // Set the pool state to TERMINATED
                    ctl.set(ctlOf(TERMINATED, 0));
                    // Signal that termination is complete; wakes threads waiting in awaitTermination
                    termination.signalAll();
                }
                return;
            }
        } finally {
            mainLock.unlock();
        }
        // else retry on failed CAS
    }
}

public boolean awaitTermination(long timeout, TimeUnit unit)
    throws InterruptedException {
    // Maximum time to block, in nanoseconds
    long nanos = unit.toNanos(timeout);
    final ReentrantLock mainLock = this.mainLock;
    mainLock.lock();
    try {
        for (;;) {
            // Already terminated: return immediately
            if (runStateAtLeast(ctl.get(), TERMINATED))
                return true;
            if (nanos <= 0)
                return false;
            // Block; a successful tryTerminate signals this condition and wakes the waiter
            nanos = termination.awaitNanos(nanos);
        }
    } finally {
        mainLock.unlock();
    }
}
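
From the caller's side, the cooperation between tryTerminate and awaitTermination looks like this. The sketch uses an anonymous subclass that overrides the terminated() hook. TerminatedHookDemo and the flag hookRan are names I made up for the example.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

final class TerminatedHookDemo {
    /** Returns {awaitBeforeShutdown, hookRan, awaitAfterShutdown, isTerminated}. */
    static boolean[] run() throws InterruptedException {
        AtomicBoolean hookRan = new AtomicBoolean();
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2, 0, TimeUnit.SECONDS, new LinkedBlockingQueue<>()) {
            @Override
            protected void terminated() {
                hookRan.set(true); // called by tryTerminate between TIDYING and TERMINATED
            }
        };
        pool.execute(() -> { });
        // The pool is still RUNNING, so this wait times out and returns false.
        boolean before = pool.awaitTermination(10, TimeUnit.MILLISECONDS);
        pool.shutdown();
        // Returns true once tryTerminate has set TERMINATED and signalled the condition.
        boolean after = pool.awaitTermination(5, TimeUnit.SECONDS);
        return new boolean[] { before, hookRan.get(), after, pool.isTerminated() };
    }

    public static void main(String[] args) throws InterruptedException {
        boolean[] r = run();
        System.out.println("before=" + r[0] + " hookRan=" + r[1] + " after=" + r[2] + " terminated=" + r[3]);
    }
}
```

awaitTermination on its own never shuts anything down; it only waits for a shutdown that some other call has requested.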

An important step in tryTerminate is the call that interrupts idle threads, interruptIdleWorkers(ONLY_ONE).

/**
 * @param onlyOne If true, interrupt at most one worker. This is
 * called only from tryTerminate when termination is otherwise
 * enabled but there are still other workers.  In this case, at
 * most one waiting worker is interrupted to propagate shutdown
 * signals in case all threads are currently waiting.
 * Interrupting any arbitrary thread ensures that newly arriving
 * workers since shutdown began will also eventually exit.
 * To guarantee eventual termination, it suffices to always
 * interrupt only one idle worker, but shutdown() interrupts all
 * idle workers so that redundant workers exit promptly, not
 * waiting for a straggler task to finish.
 */
private void interruptIdleWorkers(boolean onlyOne) {
    final ReentrantLock mainLock = this.mainLock;
    mainLock.lock();
    try {
        for (Worker w : workers) {
            Thread t = w.thread;
            // Interrupt only if the thread is not already interrupted and the worker lock can be acquired (the worker is idle)
            if (!t.isInterrupted() && w.tryLock()) {
                try {
                    t.interrupt();
                } catch (SecurityException ignore) {
                } finally {
                    w.unlock();
                }
            }
            // If onlyOne is true, stop after the first worker
            if (onlyOne)
                break;
        }
    } finally {
        mainLock.unlock();
    }
}

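Roughly translating the Javadoc above: when onlyOne is true, the method interrupts at most one worker thread. Interrupting a single idle thread is enough to propagate the shutdown signal and guarantee that the pool eventually terminates.
A blocked worker is blocked inside getTask. Once one is interrupted, the chain continues: getTask --> processWorkerExit --> tryTerminate --> interruptIdleWorkers --> getTask (of the next idle worker).
I have not fully worked out why tryTerminate passes onlyOne as true, or what would happen if it passed false. I plan to read more and experiment with it later.
The sections above already touch on the shutdown flow. Two more important methods remain to be analyzed.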

Shutting down the pool: shutdown

public void shutdown() {
    final ReentrantLock mainLock = this.mainLock;
    mainLock.lock();
    try {
        checkShutdownAccess();
        // Set the pool state to SHUTDOWN
        advanceRunState(SHUTDOWN);
        // Interrupt all idle workers
        interruptIdleWorkers();
        onShutdown(); // hook for ScheduledThreadPoolExecutor
    } finally {
        mainLock.unlock();
    }
    // Try to terminate the pool
    tryTerminate();
}

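The neat part here is the final tryTerminate call. In the SHUTDOWN state the pool still has to finish the remaining tasks, so if every thread happens to be busy when shutdown is called, interruptIdleWorkers cannot interrupt any of them. Termination then relies on tryTerminate: each worker calls it from processWorkerExit as it exits, and the last one to exit moves the pool to TERMINATED.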
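
A small sketch of that behavior, seen from the caller: after shutdown() a new task is rejected, while a task that was already queued still runs. ShutdownDemo and its latch are illustrative names of mine, and the rejection relies on the default AbortPolicy.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

final class ShutdownDemo {
    /** Returns {newTaskRejected, queuedTaskRan}. */
    static boolean[] run() throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        CountDownLatch release = new CountDownLatch(1);
        AtomicBoolean queuedRan = new AtomicBoolean();
        pool.execute(() -> {           // keeps the only worker busy
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        pool.execute(() -> queuedRan.set(true)); // waits in the queue
        pool.shutdown();               // state becomes SHUTDOWN; the busy worker is not interrupted
        boolean rejected = false;
        try {
            pool.execute(() -> { });
        } catch (RejectedExecutionException e) {
            rejected = true;           // default AbortPolicy
        }
        release.countDown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return new boolean[] { rejected, queuedRan.get() };
    }

    public static void main(String[] args) throws InterruptedException {
        boolean[] r = run();
        System.out.println("rejected=" + r[0] + " queuedTaskRan=" + r[1]);
    }
}
```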

Shutting down the pool: shutdownNow

public List<Runnable> shutdownNow() {
    List<Runnable> tasks;
    final ReentrantLock mainLock = this.mainLock;
    mainLock.lock();
    try {
        checkShutdownAccess();
        // Set the pool state to STOP
        advanceRunState(STOP);
        // Interrupt all workers
        interruptWorkers();
        // Drain the queue and return the tasks that never ran
        tasks = drainQueue();
    } finally {
        mainLock.unlock();
    }
    // Try to terminate the pool
    tryTerminate();
    return tasks;
}

The STOP state requires interrupting every thread and abandoning the queued tasks, so shutdownNow interrupts all worker threads.

private void interruptWorkers() {
    final ReentrantLock mainLock = this.mainLock;
    mainLock.lock();
    try {
        for (Worker w : workers)
            w.interruptIfStarted();
    } finally {
        mainLock.unlock();
    }
}

// This method belongs to the inner class Worker; it is shown here for easier reading
    void interruptIfStarted() {
        Thread t;
        if (getState() >= 0 && (t = thread) != null && !t.isInterrupted()) {
            try {
                t.interrupt();
            } catch (SecurityException ignore) {
            }
        }
    }

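A worker's state is 1 while its lock is held and 0 while it is not (and -1 before the worker has started). This shows the difference: shutdown can only interrupt idle workers, because it needs tryLock to succeed, whereas shutdownNow interrupts every started worker regardless of whether it is running a task.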
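
For comparison, here is the same kind of sketch for shutdownNow. The names (ShutdownNowDemo, the started latch) are again my own. One task is running and two are queued when shutdownNow is called.

```java
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

final class ShutdownNowDemo {
    /** Returns {number of tasks handed back, 1 if the running task was interrupted else 0}. */
    static int[] run() throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        CountDownLatch started = new CountDownLatch(1);
        AtomicBoolean interrupted = new AtomicBoolean();
        pool.execute(() -> {
            started.countDown();
            try {
                Thread.sleep(60_000);  // long-running work
            } catch (InterruptedException e) {
                interrupted.set(true); // shutdownNow interrupts busy workers too
            }
        });
        pool.execute(() -> { });       // these two stay in the queue
        pool.execute(() -> { });
        started.await();               // make sure the first task is really running
        List<Runnable> neverRun = pool.shutdownNow();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return new int[] { neverRun.size(), interrupted.get() ? 1 : 0 };
    }

    public static void main(String[] args) throws InterruptedException {
        int[] r = run();
        System.out.println("returned tasks=" + r[0] + " running task interrupted=" + (r[1] == 1));
    }
}
```

A task that ignores interruption would keep running after shutdownNow, since the pool can only set the interrupt flag and cannot force a thread to stop.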

That covers the basic flow of the thread pool. If I have misunderstood something or left something out, corrections are welcome ^-^ ~
