ElasticSearch - Troubleshooting a Bulk Update Deadlock

1. The Affected System

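The system listens for product-change MQ messages, queries the latest product information, and calls BulkProcessor to batch-update the product fields in the ES cluster.
Because the volume of product data is very large, it is stored in an ES cluster that is divided into 256 shards, and documents are routed to shards by the product's third-level category ID.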

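For example, when the name of a SKU changes, we receive a change message for that SKU from MQ, call the product API to fetch the latest name, route by the SKU's third-level category ID to find the corresponding shard in the ES cluster, and then update the product name field.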
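To make the routing step concrete, here is a minimal sketch of how a routing value maps to one of the 256 shards. It is an illustration only: Elasticsearch hashes the routing value with Murmur3, while this sketch uses String.hashCode() as a stand-in so that it needs no dependencies. The class name and the category IDs are hypothetical.

```java
final class RoutingSketch {
    // Shard count taken from the article.
    static final int NUM_SHARDS = 256;

    /** Maps a routing value (here: the third-level category ID) to a shard number. */
    static int shardFor(String routing) {
        // Assumption: String.hashCode() replaces Elasticsearch's Murmur3 hash,
        // so the concrete shard numbers differ from a real cluster.
        return Math.floorMod(routing.hashCode(), NUM_SHARDS);
    }

    public static void main(String[] args) {
        String oldCat3 = "9987"; // hypothetical category ID before the change
        String newCat3 = "9988"; // hypothetical category ID after the change
        System.out.println("shard for old category: " + shardFor(oldCat3));
        System.out.println("shard for new category: " + shardFor(newCat3));
    }
}
```

The same routing value always lands on the same shard, while a changed category ID can send the update to a different shard than the one that holds the existing document.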

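Because the volume of product-change MQ messages is huge, the system uses BulkProcessor for asynchronous batch updates to improve ES update throughput and prevent an MQ backlog.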

The ES client version is as follows:

    <dependency>
        <artifactId>elasticsearch-rest-client</artifactId>
        <groupId>org.elasticsearch.client</groupId>
        <version>6.5.3</version>
    </dependency>

The BulkProcessor configuration, in pseudocode, is as follows:

    // build() constructs the bulkProcessor; under the hood it uses the asynchronous bulk API
    this.fullDataBulkProcessor = BulkProcessor.builder((request, bulkListener) ->
            fullDataEsClient.getClient().bulkAsync(request, RequestOptions.DEFAULT, bulkListener), listener)
            // execute a bulk once 1000 requests have accumulated
            .setBulkActions(1000)
            // flush a bulk once 5 MB of data has accumulated
            .setBulkSize(new ByteSizeValue(5L, ByteSizeUnit.MB))
            // number of concurrent requests: 0 means no concurrency, 1 allows one concurrent request
            .setConcurrentRequests(1)
            // always flush once every 1 s
            .setFlushInterval(TimeValue.timeValueSeconds(1L))
            // retry 5 times at 1 s intervals
            .setBackoffPolicy(BackoffPolicy.constantBackoff(TimeValue.timeValueSeconds(1L), 5))
            .build();
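The two size-based settings above act as thresholds that are checked whenever a request is added. The following is a minimal sketch of that check with the configured values hard-coded; it is not the library's code, and the class name is hypothetical. The 1 s interval is handled separately by a scheduled task and is not part of this check.

```java
final class FlushThresholdSketch {
    static final int BULK_ACTIONS = 1000;                  // mirrors setBulkActions(1000)
    static final long BULK_SIZE_BYTES = 5L * 1024 * 1024;  // mirrors setBulkSize(5 MB)

    /** True once either the action count or the estimated byte size reaches its limit. */
    static boolean isOverTheLimit(int numberOfActions, long estimatedSizeInBytes) {
        return numberOfActions >= BULK_ACTIONS || estimatedSizeInBytes >= BULK_SIZE_BYTES;
    }

    public static void main(String[] args) {
        System.out.println(isOverTheLimit(999, 1_000_000L));          // below both limits
        System.out.println(isOverTheLimit(1000, 1_000_000L));         // action limit reached
        System.out.println(isOverTheLimit(10, 5L * 1024 * 1024));     // size limit reached
    }
}
```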

2. How the Problem Was Discovered

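After the 618 promotion began, product-change MQ messages became very frequent: the daily message volume reached several times the normal level, and many products also had their third-level category ID changed.
When the system updated SKUs whose third-level category ID had changed, the update was routed to a shard based on the new category ID and failed there; it was retried 5 times and still did not succeed.
Because the newly routed shard holds no index entry for the product, these update requests can never succeed, and the system's log files recorded a large number of exception-and-retry entries.
Backlog alarms also started firing for the product-change MQ messages: consumption was clearly not keeping up with production.
The UMP monitoring data for the MQ consumer showed stable consumption performance with no obvious fluctuation, but the call count dropped off a cliff after the system had been consuming for a while, falling from tens of thousands of calls per minute to single digits.
After the application was restarted, consumption resumed and the UMP call count returned to normal, but after running for a while consumption stalled again, as if every consumer thread had been suspended.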

3. Detailed Troubleshooting Process

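First, pick a container that has stopped consuming MQ messages, look up the application's process ID, and dump the full thread stacks of the process with the jstack command. Then package the dump and upload it to https://fastthread.io/ for thread-state analysis.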

The analysis report showed 124 threads in the BLOCKED state. Clicking through to the individual threads shows their detailed stack traces.

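Checking the stack traces of several threads in a row showed that the MQ consumer threads were all waiting to lock <0x00000005eb781b10> (a org.elasticsearch.action.bulk.BulkProcessor). Searching the dump for 0x00000005eb781b10 showed that this object lock was held by another thread, elasticsearch[scheduler][T#1].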

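That thread was in the WAITING state, and its name suggests that it is an internal thread of the ES client. It had acquired the lock that the business threads need and was then waiting for some other condition before it could proceed, so none of the MQ consumer threads could obtain BulkProcessor's internal lock, and consumption stalled.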
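The same relationship that jstack revealed (a BLOCKED thread and the owner of the monitor it wants) can also be read from inside the JVM through ThreadMXBean. The following is a minimal sketch that recreates the situation with two hypothetical threads and a plain Object standing in for the BulkProcessor instance; the thread names are made up for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.util.concurrent.CountDownLatch;

final class LockOwnerSketch {
    /** Recreates "holder WAITING, worker BLOCKED" and returns the owner name the JVM reports. */
    static String findLockOwner() {
        final Object lock = new Object(); // stands in for the BulkProcessor instance
        final CountDownLatch holding = new CountDownLatch(1);
        final CountDownLatch release = new CountDownLatch(1);

        Thread holder = new Thread(() -> {
            synchronized (lock) {
                holding.countDown();
                try {
                    release.await(); // WAITING while still owning the monitor
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "scheduler-sketch");
        Thread worker = new Thread(() -> {
            synchronized (lock) { /* entered only after the holder lets go */ }
        }, "mq-consumer-sketch");

        try {
            holder.start();
            holding.await();
            worker.start();
            while (worker.getState() != Thread.State.BLOCKED) {
                Thread.sleep(5);
            }
            // Walk a full thread dump, much like reading jstack output.
            for (ThreadInfo info : ManagementFactory.getThreadMXBean().dumpAllThreads(false, false)) {
                if ("mq-consumer-sketch".equals(info.getThreadName())) {
                    return info.getLockOwnerName();
                }
            }
            return null;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            release.countDown(); // let both threads finish
        }
    }

    public static void main(String[] args) {
        System.out.println("lock owner: " + findLockOwner());
    }
}
```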

But why can't this elasticsearch[scheduler][T#1] thread make progress? When is it started, and what does it do?

Answering that requires a closer look at BulkProcessor. Since a BulkProcessor is created through its builder, we start with the builder source code to understand how it is constructed.

    public static Builder builder(BiConsumer<BulkRequest, ActionListener<BulkResponse>> consumer, Listener listener) {
        Objects.requireNonNull(consumer, "consumer");
        Objects.requireNonNull(listener, "listener");
        final ScheduledThreadPoolExecutor scheduledThreadPoolExecutor = Scheduler.initScheduler(Settings.EMPTY);
        return new Builder(consumer, listener,
                (delay, executor, command) -> scheduledThreadPoolExecutor.schedule(command, delay.millis(), TimeUnit.MILLISECONDS),
                () -> Scheduler.terminate(scheduledThreadPoolExecutor, 10, TimeUnit.SECONDS));
    }

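Internally it creates a scheduled thread pool whose thread naming pattern matches the name of the lock-holding thread above. The code is as follows: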

    static ScheduledThreadPoolExecutor initScheduler(Settings settings) {
        ScheduledThreadPoolExecutor scheduler = new ScheduledThreadPoolExecutor(1,
                EsExecutors.daemonThreadFactory(settings, "scheduler"), new EsAbortPolicy());
        scheduler.setExecuteExistingDelayedTasksAfterShutdownPolicy(false);
        scheduler.setContinueExistingPeriodicTasksAfterShutdownPolicy(false);
        scheduler.setRemoveOnCancelPolicy(true);
        return scheduler;
    }
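A pool created with a core size of 1 runs its tasks strictly one at a time, so a task that blocks keeps every later task waiting. The following is a minimal sketch of that effect using only the JDK; the class name and the timings are arbitrary choices for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class SingleThreadSchedulerSketch {
    /** Returns true if the second task managed to run while the first was still blocked. */
    static boolean secondTaskRanWhileFirstBlocked() {
        ScheduledThreadPoolExecutor scheduler = new ScheduledThreadPoolExecutor(1);
        CountDownLatch unblockFirst = new CountDownLatch(1);
        CountDownLatch secondRan = new CountDownLatch(1);
        try {
            // First task: occupies the only thread until it is unblocked.
            Runnable first = () -> {
                try {
                    unblockFirst.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            };
            scheduler.execute(first);
            // Second task: due after 10 ms, but there is no free thread to run it.
            Runnable second = secondRan::countDown;
            scheduler.schedule(second, 10, TimeUnit.MILLISECONDS);
            return secondRan.await(300, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            unblockFirst.countDown();
            scheduler.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("second task ran: " + secondTaskRanWhileFirstBlocked());
    }
}
```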

Finally, the build method invokes BulkProcessor's internal parameterized constructor, which starts a periodic flushing task. The code is as follows:

    BulkProcessor(BiConsumer<BulkRequest, ActionListener<BulkResponse>> consumer, BackoffPolicy backoffPolicy, Listener listener,
                  int concurrentRequests, int bulkActions, ByteSizeValue bulkSize, @Nullable TimeValue flushInterval,
                  Scheduler scheduler, Runnable onClose) {
        this.bulkActions = bulkActions;
        this.bulkSize = bulkSize.getBytes();
        this.bulkRequest = new BulkRequest();
        this.scheduler = scheduler;
        this.bulkRequestHandler = new BulkRequestHandler(consumer, backoffPolicy, listener, scheduler, concurrentRequests);
        // Start period flushing task after everything is setup
        this.cancellableFlushTask = startFlushTask(flushInterval, scheduler);
        this.onClose = onClose;
    }

    private Scheduler.Cancellable startFlushTask(TimeValue flushInterval, Scheduler scheduler) {
        if (flushInterval == null) {
            return new Scheduler.Cancellable() {
                @Override
                public void cancel() {}

                @Override
                public boolean isCancelled() {
                    return true;
                }
            };
        }
        final Runnable flushRunnable = scheduler.preserveContext(new Flush());
        return scheduler.scheduleWithFixedDelay(flushRunnable, flushInterval, ThreadPool.Names.GENERIC);
    }

    class Flush implements Runnable {
        @Override
        public void run() {
            synchronized (BulkProcessor.this) {
                if (closed) {
                    return;
                }
                if (bulkRequest.numberOfActions() == 0) {
                    return;
                }
                execute();
            }
        }
    }

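The source shows that this flush task is the fixed-interval flush logic set up when the BulkProcessor object is created: when a setFlushInterval value is configured, a background task is started that flushes periodically at that interval. While it runs, the flush task also competes for the BulkProcessor object lock, and only after acquiring the lock does it call the execute method. The related call chain in the source is as follows: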

    /**
     * Adds the data from the bytes to be processed by the bulk processor
     */
    public synchronized BulkProcessor add(BytesReference data, @Nullable String defaultIndex, @Nullable String defaultType,
                                          @Nullable String defaultPipeline, @Nullable Object payload, XContentType xContentType) throws Exception {
        bulkRequest.add(data, defaultIndex, defaultType, null, null, null, defaultPipeline, payload, true, xContentType);
        executeIfNeeded();
        return this;
    }

    private void executeIfNeeded() {
        ensureOpen();
        if (!isOverTheLimit()) {
            return;
        }
        execute();
    }

    // (currently) needs to be executed under a lock
    private void execute() {
        final BulkRequest bulkRequest = this.bulkRequest;
        final long executionId = executionIdGen.incrementAndGet();

        this.bulkRequest = new BulkRequest();
        this.bulkRequestHandler.execute(bulkRequest, executionId);
    }

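The add method in the code above is what the MQ consumer business threads call. It is also marked synchronized, so the MQ consumer threads and the flush task thread compete directly for the same lock. The pseudocode for the call from the MQ consumer thread is as follows: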

    @Override
    public void upsertCommonSku(CommonSkuEntity commonSkuEntity) {
        String source = JsonUtil.toString(commonSkuEntity);
        // partial update of the existing document
        UpdateRequest updateRequest = new UpdateRequest(Constants.INDEX_NAME_SPU, Constants.INDEX_TYPE, commonSkuEntity.getSkuId().toString());
        updateRequest.doc(source, XContentType.JSON);
        // document to index if the SKU does not exist yet (upsert)
        IndexRequest indexRequest = new IndexRequest(Constants.INDEX_NAME_SPU, Constants.INDEX_TYPE, commonSkuEntity.getSkuId().toString());
        indexRequest.source(source, XContentType.JSON);
        updateRequest.upsert(indexRequest);
        // route by the third-level category ID
        updateRequest.routing(commonSkuEntity.getCat3().toString());
        // synchronized call: competes with the flush task for the BulkProcessor lock
        fullDataBulkProcessor.add(updateRequest);
    }
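To see the shared lock in isolation, here is a minimal sketch of a batcher whose add method and periodic flush task synchronize on the same instance, as BulkProcessor's add and Flush do. It uses only the JDK; MiniBatcher, its thresholds, and its timings are hypothetical and much simpler than the real class.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class MiniBatcher implements AutoCloseable {
    private final int maxActions;
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    private List<String> pending = new ArrayList<>();
    private int flushedActions = 0;

    MiniBatcher(int maxActions, long flushIntervalMillis) {
        this.maxActions = maxActions;
        Runnable flush = () -> {
            synchronized (this) { // same monitor as add()
                if (!pending.isEmpty()) {
                    execute();
                }
            }
        };
        scheduler.scheduleWithFixedDelay(flush, flushIntervalMillis, flushIntervalMillis, TimeUnit.MILLISECONDS);
    }

    /** Called by producer threads; flushes when the count threshold is reached. */
    synchronized void add(String action) {
        pending.add(action);
        if (pending.size() >= maxActions) {
            execute();
        }
    }

    // Must be called while holding the monitor.
    private void execute() {
        flushedActions += pending.size();
        pending = new ArrayList<>();
    }

    synchronized int flushedActions() {
        return flushedActions;
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }

    /** Adds three actions and waits until both flush paths have drained them. */
    static int demo() {
        try (MiniBatcher batcher = new MiniBatcher(2, 50)) {
            batcher.add("a");
            batcher.add("b"); // normally flushed by the count threshold
            batcher.add("c"); // left for the timed flush
            long deadline = System.currentTimeMillis() + 5000;
            while (batcher.flushedActions() < 3 && System.currentTimeMillis() < deadline) {
                Thread.sleep(10);
            }
            return batcher.flushedActions();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("flushed actions: " + demo());
    }
}
```

As long as neither path blocks while holding the monitor, the two simply take turns. The trouble starts when the flush path blocks inside the synchronized section.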

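The thread-stack analysis above shows that every business thread was waiting for the elasticsearch[scheduler][T#1] thread to release the BulkProcessor object lock, but that thread never released it, so the business threads were deadlocked.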

Given the large number of exception-and-retry entries in the application log, the problem is likely related to BulkProcessor's retry policy, so the next step is to look at its retry logic. Every BulkRequest submitted by the business threads is handed to the execute method of the BulkRequestHandler object for processing. The code is as follows:

    public final class BulkRequestHandler {
        private final Logger logger;
        private final BiConsumer<BulkRequest, ActionListener<BulkResponse>> consumer;
        private final BulkProcessor.Listener listener;
        private final Semaphore semaphore;
        private final Retry retry;
        private final int concurrentRequests;

        BulkRequestHandler(BiConsumer<BulkRequest, ActionListener<BulkResponse>> consumer, BackoffPolicy backoffPolicy,
                           BulkProcessor.Listener listener, Scheduler scheduler, int concurrentRequests) {
            assert concurrentRequests >= 0;
            this.logger = Loggers.getLogger(getClass());
            this.consumer = consumer;
            this.listener = listener;
            this.concurrentRequests = concurrentRequests;
            this.retry = new Retry(backoffPolicy, scheduler);
            this.semaphore = new Semaphore(concurrentRequests > 0 ? concurrentRequests : 1);
        }

        public void execute(BulkRequest bulkRequest, long executionId) {
            Runnable toRelease = () -> {};
            boolean bulkRequestSetupSuccessful = false;
            try {
                listener.beforeBulk(executionId, bulkRequest);
                semaphore.acquire();
                toRelease = semaphore::release;
                CountDownLatch latch = new CountDownLatch(1);
                retry.withBackoff(consumer, bulkRequest, new ActionListener<BulkResponse>() {
                    @Override
                    public void onResponse(BulkResponse response) {
                        try {
                            listener.afterBulk(executionId, bulkRequest, response);
                        } finally {
                            semaphore.release();
                            latch.countDown();
                        }
                    }

                    @Override
                    public void onFailure(Exception e) {
                        try {
                            listener.afterBulk(executionId, bulkRequest, e);
                        } finally {
                            semaphore.release();
                            latch.countDown();
                        }
                    }
                });
                bulkRequestSetupSuccessful = true;
                if (concurrentRequests == 0) {
                    latch.await();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                logger.info(() -> new ParameterizedMessage("Bulk request {} has been cancelled.", executionId), e);
                listener.afterBulk(executionId, bulkRequest, e);
            } catch (Exception e) {
                logger.warn(() -> new ParameterizedMessage("Failed to execute bulk request {}.", executionId), e);
                listener.afterBulk(executionId, bulkRequest, e);
            } finally {
                if (bulkRequestSetupSuccessful == false) {  // if we fail on client.bulk() release the semaphore
                    toRelease.run();
                }
            }
        }

        boolean awaitClose(long timeout, TimeUnit unit) throws InterruptedException {
            if (semaphore.tryAcquire(this.concurrentRequests, timeout, unit)) {
                semaphore.release(this.concurrentRequests);
                return true;
            }
            return false;
        }
    }

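BulkRequestHandler's constructor creates a Retry object and passes it a Scheduler, which is the same thread pool that the flush task uses; that pool maintains exactly one thread. The execute method first uses a Semaphore to limit the number of concurrent executions. The limit is set when the BulkProcessor is built, and in the configuration above it is 1, so only one bulk request can be in flight at a time: whether the request comes from an MQ consumer thread or from the flush task thread, only one of them can proceed at any moment. Next, let's look at how the retry task is executed: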

    public void withBackoff(BiConsumer<BulkRequest, ActionListener<BulkResponse>> consumer, BulkRequest bulkRequest,
                            ActionListener<BulkResponse> listener) {
        RetryHandler r = new RetryHandler(backoffPolicy, consumer, listener, scheduler);
        r.execute(bulkRequest);
    }
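The permit handling in execute is worth isolating: the permit is taken before the request is sent and is given back only inside the asynchronous callback. The following is a minimal sketch of that life cycle with a plain java.util.concurrent.Semaphore; the class name is hypothetical and no real request is sent.

```java
import java.util.concurrent.Semaphore;

final class PermitSketch {
    /** Returns {permit free while the request is in flight, permit free after the callback}. */
    static boolean[] demo() {
        Semaphore semaphore = new Semaphore(1); // mirrors concurrentRequests = 1

        // The first bulk request takes the only permit before it is sent.
        semaphore.acquireUninterruptibly();

        // While that request is in flight (or waiting for a retry), nobody else gets a permit.
        boolean freeWhileInFlight = semaphore.tryAcquire();

        // Only the completion callback (onResponse / onFailure) returns the permit.
        Runnable callback = semaphore::release;
        callback.run();

        boolean freeAfterCallback = semaphore.tryAcquire();
        return new boolean[] { freeWhileInFlight, freeAfterCallback };
    }

    public static void main(String[] args) {
        boolean[] result = demo();
        System.out.println("free while in flight: " + result[0]);
        System.out.println("free after callback:  " + result[1]);
    }
}
```

If the callback never runs, the permit is never returned, and any caller that uses a blocking acquire waits indefinitely.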

RetryHandler submits the bulkRequest and also listens for its failure, in which case it runs the retry logic. The retry code is as follows:

    private void retry(BulkRequest bulkRequestForRetry) {
        assert backoff.hasNext();
        TimeValue next = backoff.next();
        logger.trace("Retry of bulk request scheduled in {} ms.", next.millis());
        Runnable command = scheduler.preserveContext(() -> this.execute(bulkRequestForRetry));
        scheduledRequestFuture = scheduler.schedule(next, ThreadPool.Names.SAME, command);
    }
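The pattern in retry, rescheduling the failed request on a scheduler after a fixed delay until the backoff is exhausted, can be sketched with the JDK alone. In the following sketch the request always fails, the delay is shortened to a few milliseconds, and the class and method names are hypothetical; it is not the library's RetryHandler.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class RetrySketch {
    /** Runs an always-failing request with a constant backoff and returns the number of attempts. */
    static int attemptsWithConstantBackoff(int maxRetries, long delayMillis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger attempts = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(1);
        Runnable[] attempt = new Runnable[1]; // holder so the task can reschedule itself
        attempt[0] = () -> {
            int n = attempts.incrementAndGet(); // the "bulk request"; it always fails here
            if (n <= maxRetries) {
                // Backoff not exhausted: queue the next attempt on the scheduler.
                scheduler.schedule(attempt[0], delayMillis, TimeUnit.MILLISECONDS);
            } else {
                done.countDown(); // retries exhausted: the failure is reported
            }
        };
        try {
            attempt[0].run(); // initial attempt on the calling thread
            done.await(5, TimeUnit.SECONDS);
            return attempts.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            scheduler.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("attempts: " + attemptsWithConstantBackoff(5, 10));
    }
}
```

With 5 retries the request is attempted 6 times in total: the initial attempt plus 5 scheduled retries. Each retry needs a free scheduler thread to run on.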

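RetryHandler hands the failed bulk request back to the internal scheduler thread pool for re-execution. As shown above, that pool has exactly one thread, and the same thread also runs the flush task. So if a retry is pending while the pool's only thread is running the flush task, the retry cannot run. Because the retry cannot complete, the Semaphore permit held by the failed request is not released; and because the concurrency limit is 1, the flush task, which already holds the BulkProcessor lock, blocks until some other thread releases a permit. This is a circular wait: neither the Semaphore nor the BulkProcessor object lock can be released, so every MQ consumer business thread blocks while trying to acquire the BulkProcessor lock.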
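The circular wait can be reproduced in a few lines without Elasticsearch. The following is a minimal sketch under simplifying assumptions: a plain Object stands in for the BulkProcessor monitor, the in-flight request is modeled by taking the permit directly, and a timeout is used to detect the stall so that the program terminates. All names are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

final class BulkDeadlockSketch {
    /** Returns true if the scheduled retry never got to run, i.e. the circular wait occurred. */
    static boolean retryStarved() {
        ScheduledThreadPoolExecutor scheduler = new ScheduledThreadPoolExecutor(1); // shared by flush and retry
        Semaphore semaphore = new Semaphore(1);  // mirrors concurrentRequests = 1
        Object processorLock = new Object();     // stands in for the BulkProcessor monitor
        CountDownLatch retryRan = new CountDownLatch(1);
        try {
            // 1. A consumer thread's bulk request is in flight and holds the only permit.
            semaphore.acquire();

            // 2. The periodic flush runs on the scheduler thread, takes the lock,
            //    then blocks waiting for a permit.
            Runnable flush = () -> {
                synchronized (processorLock) {
                    try {
                        semaphore.acquire();
                        semaphore.release();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }
            };
            scheduler.execute(flush);

            // 3. The in-flight request fails and its retry is queued on the same scheduler.
            //    Only the retry's completion would return the permit.
            Runnable retry = () -> {
                semaphore.release();
                retryRan.countDown();
            };
            scheduler.schedule(retry, 10, TimeUnit.MILLISECONDS);

            // 4. The retry waits for the thread, the thread waits for the permit,
            //    and the permit waits for the retry.
            return !retryRan.await(500, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            scheduler.shutdownNow(); // interrupts the blocked flush so the JVM can exit
        }
    }

    public static void main(String[] args) {
        System.out.println("retry starved: " + retryStarved());
    }
}
```

Any thread that tried to synchronize on processorLock during the stall would be BLOCKED, which matches what the MQ consumer threads showed in the thread dump.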

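A similar problem is also reported on GitHub against the ES client source, for example https://github.com/elastic/elasticsearch/issues/47599, which further supports the earlier hypothesis: continual bulk retries triggered a deadlock inside BulkProcessor.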

4. How to Solve the Problem

Now that the cause is understood, there are the following solutions:

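Upgrade the ES client to the 7.6 GA release. Later versions physically isolate the thread pool for retry tasks from the thread pool for the flush task, which avoids the contention, but version compatibility must be considered.
Because the deadlock is caused by heavy retry activity, the retry logic can be disabled where that does not affect business logic. This option requires no client upgrade, but the business impact must be assessed, and failed requests can be retried at the business level by other means.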
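To illustrate why isolating the two thread pools helps, here is a minimal sketch of the same scenario with one scheduler for the flush task and a separate one for retries. It shows only the idea of isolation and is not the code of the newer client; the class name, the plain Object lock, and the timings are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

final class SeparateSchedulersSketch {
    /** Returns true if the flush task completes once retries run on their own scheduler. */
    static boolean flushCompletes() {
        ScheduledExecutorService flushScheduler = Executors.newSingleThreadScheduledExecutor();
        ScheduledExecutorService retryScheduler = Executors.newSingleThreadScheduledExecutor();
        Semaphore semaphore = new Semaphore(1);  // mirrors concurrentRequests = 1
        Object processorLock = new Object();     // stands in for the BulkProcessor monitor
        CountDownLatch flushDone = new CountDownLatch(1);
        try {
            // A bulk request is in flight and holds the only permit.
            semaphore.acquire();

            // The flush takes the lock and waits for a permit, as before.
            Runnable flush = () -> {
                synchronized (processorLock) {
                    try {
                        semaphore.acquire(); // now blocks only until the retry finishes
                        semaphore.release();
                        flushDone.countDown();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }
            };
            flushScheduler.execute(flush);

            // The retry has its own thread, so it can run and return the permit.
            Runnable retry = semaphore::release;
            retryScheduler.schedule(retry, 10, TimeUnit.MILLISECONDS);

            return flushDone.await(2, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            flushScheduler.shutdownNow();
            retryScheduler.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("flush completed: " + flushCompletes());
    }
}
```

With a dedicated retry thread the permit is returned, the flush finishes, and the lock is released for the consumer threads.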

If anything here is missing or incorrect, corrections are welcome!


This article was shared from the WeChat official account 京東雲開發者 (JDT_Developers).
