Kafka Consumer in Detail


2020-02-23 15:46

1. ZookeeperConsumer architecture

Architecture diagram of how the consumer runs inside the ZookeeperConsumer class:

Figure 1

Process analysis:

The ConsumerGroupExample class

2. Relationship between consumer threads, queues, and fetch threads

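Each topic needs at least one consumer thread. If the topic has multiple partitions, you can create multiple consumer threads, but the number of consumer threads should not exceed the number of partitions (consumer threads <= partitions); otherwise the surplus consumer threads sit idle.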

A partial code example:

ConsumerConnector consumer = kafka.consumer.Consumer.createJavaConsumerConnector(
        createConsumerConfig());

Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
// The value is the number of consumer threads for this topic.
topicCountMap.put("test-string-topic", Integer.valueOf(1));

Map<String, List<KafkaStream<byte[], byte[]>>> consumerMap =
        consumer.createMessageStreams(topicCountMap);

The relationship between the three, in detail:

(1). How a topic's partitions are distributed

Partitions are assigned to brokers in order of Kafka brokerId.

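For example, suppose three nodes run the Kafka broker with brokerId set to 1, 2, and 3, and a topic named test-string-topic is created with 12 partitions. The partitions are then distributed across the brokers as follows: partitions 0, 3, 6, 9 are on brokerId=1; partitions 1, 4, 7, 10 are on brokerId=2; and partitions 2, 5, 8, 11 are on brokerId=3.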
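The layout in this example follows a simple "partition index modulo broker count" pattern. The sketch below only reproduces that arithmetic; the class and method names are hypothetical, and it is not Kafka's own assignment code (a real cluster may start the rotation from a different broker).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: models the ordered layout described in the example,
// not Kafka's actual partition-assignment implementation.
class PartitionLayout {

    /** Maps each brokerId to the partition indexes it hosts under the (index % brokers) rule. */
    static Map<Integer, List<Integer>> layout(int[] brokerIds, int partitionCount) {
        Map<Integer, List<Integer>> byBroker = new LinkedHashMap<>();
        for (int id : brokerIds) {
            byBroker.put(id, new ArrayList<Integer>());
        }
        for (int p = 0; p < partitionCount; p++) {
            // Partition p goes to the broker at position p % brokerCount.
            int brokerId = brokerIds[p % brokerIds.length];
            byBroker.get(brokerId).add(p);
        }
        return byBroker;
    }

    public static void main(String[] args) {
        // prints {1=[0, 3, 6, 9], 2=[1, 4, 7, 10], 3=[2, 5, 8, 11]}
        System.out.println(layout(new int[] {1, 2, 3}, 12));
    }
}
```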

(2). Creating consumer threads

Consumer threads and BlockingQueues correspond one-to-one.

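a. When consumer thread count = 1

In this case there is one queue, blockingQueue1, and three fetch threads. There are as many fetch threads as there are nodes hosting the topic, and each fetch thread opens one connection to its Kafka broker. The three fetch threads pull message data and put it into blockingQueue1, where it waits for the consumer thread to consume it.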

Consumer threads, buffer queues, and partitions map as follows:

consumer thread	Blocking Queue	partitions
consumer thread1	blockingQueue1	0,1,2,3,4,5,6,7,8,9,10,11

Fetch threads and partitions map as follows:

fetch thread	partitions
fetch thread1	0,3,6,9
fetch thread2	1,4,7,10
fetch thread3	2,5,8,11
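The single-queue case can be simulated with the JDK alone. In this sketch the "fetch threads" fabricate messages instead of contacting a broker, so it only illustrates the threading shape (three producers, one LinkedBlockingQueue, one consumer); every name in it is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Simulation only: no Kafka classes are used, and messages are made up.
class SingleQueueDemo {

    /** Starts one fetch thread per broker and drains everything through one shared queue. */
    static List<String> run(int[][] partitionsPerBroker) {
        final BlockingQueue<String> blockingQueue1 = new LinkedBlockingQueue<String>();
        List<Thread> fetchThreads = new ArrayList<>();
        int expected = 0;
        for (final int[] partitions : partitionsPerBroker) {
            expected += partitions.length;
            // One fetch thread per broker; it handles that broker's partitions.
            Thread fetchThread = new Thread(() -> {
                for (int p : partitions) {
                    blockingQueue1.add("message-from-partition-" + p);
                }
            });
            fetchThreads.add(fetchThread);
            fetchThread.start();
        }
        // The single consumer thread (here, the caller) blocks on the queue.
        List<String> consumed = new ArrayList<>();
        try {
            for (int i = 0; i < expected; i++) {
                consumed.add(blockingQueue1.take());
            }
            for (Thread t : fetchThreads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while consuming", e);
        }
        return consumed;
    }

    public static void main(String[] args) {
        int[][] layout = {{0, 3, 6, 9}, {1, 4, 7, 10}, {2, 5, 8, 11}};
        System.out.println(run(layout).size() + " messages consumed");
    }
}
```

The arrival order in the queue depends on thread scheduling, which is why the sketch counts messages rather than asserting an order.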

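b. When consumer thread count = 2

In this case consumerThread1 and consumerThread2 correspond to two queues, blockingQueue1 and blockingQueue2. The two consumer threads consume partitions 0, 1, 2, 3, 4, 5 and 6, 7, 8, 9, 10, 11 respectively. Consumer threads, buffer queues, and partitions map as follows:

consumer thread	Blocking Queue	partitions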
consumer thread1	blockingQueue1	0,1,2,3,4,5
consumer thread2	blockingQueue2	6,7,8,9,10,11

Fetch threads and partitions map as follows:

fetch thread	partitions
fetch thread1	0,3,6,9
fetch thread2	1,4,7,10
fetch thread3	2,5,8,11

c. When consumer thread count = 4

Consumer threads, buffer queues, and partitions map as follows:

consumer thread	Blocking Queue	partitions
consumer thread1	blockingQueue1	0,1,2
consumer thread2	blockingQueue2	3,4,5
consumer thread3	blockingQueue3	6,7,8
consumer thread4	blockingQueue4	9,10,11

Fetch threads and partitions map as follows:

Same as above.

Likewise, for any consumer thread count = n, partitions are assigned according to the same distribution rule.
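The three tables are consistent with splitting the partition list into contiguous ranges, one range per consumer thread. A minimal sketch of that rule follows; the class name is hypothetical, and how leftover partitions are handed out when the count does not divide evenly (the first threads each take one extra) is an assumption that the text does not spell out.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative range split; not taken from the Kafka source.
class RangeAssignment {

    /** Returns, for each consumer thread in order, the contiguous partitions it owns. */
    static List<List<Integer>> assign(int partitionCount, int threadCount) {
        int perThread = partitionCount / threadCount;
        int extra = partitionCount % threadCount; // assumption: first 'extra' threads get one more
        List<List<Integer>> result = new ArrayList<>();
        int next = 0;
        for (int t = 0; t < threadCount; t++) {
            int size = perThread + (t < extra ? 1 : 0);
            List<Integer> owned = new ArrayList<>();
            for (int i = 0; i < size; i++) {
                owned.add(next++);
            }
            result.add(owned);
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11]]
        System.out.println(assign(12, 4));
    }
}
```

With more threads than partitions, for example assign(12, 16), the last four lists come back empty, which is the idle-thread case mentioned at the start of this section.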

3. How consumer threads and queues are created

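The multithreaded parallel-consumption test class is built on the ZookeeperConsumerConnector class. When ConsumerGroupExample initializes, it calls the createMessageStreams method; the actual logic lives in the consume method, which creates the KafkaStream objects and the blocking queues (LinkedBlockingQueue). KafkaStreams and queues correspond one-to-one, and the number of consumer threads determines the number of blocking queues.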
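The one-stream-per-queue relationship can be sketched without any Kafka classes. Here Stream is a hypothetical stand-in for KafkaStream and createStreams is a hypothetical stand-in for the logic described above; the sketch only shows that the count in topicCountMap decides how many queues are created.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative only: mirrors the described behaviour, not the Kafka source.
class StreamFactorySketch {

    /** Hypothetical stand-in for KafkaStream: each stream wraps exactly one queue. */
    static final class Stream {
        final String topic;
        final int threadId;
        final BlockingQueue<byte[]> queue;

        Stream(String topic, int threadId, BlockingQueue<byte[]> queue) {
            this.topic = topic;
            this.threadId = threadId;
            this.queue = queue;
        }
    }

    /** Creates one queue-backed stream per requested consumer thread of each topic. */
    static Map<String, List<Stream>> createStreams(Map<String, Integer> topicCountMap) {
        Map<String, List<Stream>> streams = new HashMap<>();
        for (Map.Entry<String, Integer> entry : topicCountMap.entrySet()) {
            List<Stream> perTopic = new ArrayList<>();
            for (int i = 0; i < entry.getValue(); i++) {
                perTopic.add(new Stream(entry.getKey(), i, new LinkedBlockingQueue<byte[]>()));
            }
            streams.put(entry.getKey(), perTopic);
        }
        return streams;
    }

    public static void main(String[] args) {
        Map<String, Integer> topicCountMap = new HashMap<>();
        topicCountMap.put("test-string-topic", 4);
        // prints 4
        System.out.println(createStreams(topicCountMap).get("test-string-topic").size());
    }
}
```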

registerConsumerInZK() method: sets the consumer group and registers the consumer information, consumerIdString, in ZooKeeper.

The rule for generating consumerIdString is shown in part below:


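String consumerUuid;
if (config.consumerId != null && !config.consumerId.isEmpty()) {
    // An explicitly configured consumer id is used as-is.
    consumerUuid = config.consumerId;
} else {
    // Otherwise build "<hostname>-<timestamp>-<first 8 hex digits of the UUID's high bits>".
    UUID uuid = UUID.randomUUID();
    consumerUuid = String.format("%s-%d-%s",
            InetAddress.getLocalHost().getHostName(),
            System.currentTimeMillis(),
            Long.toHexString(uuid.getMostSignificantBits()).substring(0, 8));
}
String consumerIdString = config.groupId + "_" + consumerUuid;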

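Kafka's registration model (storage structure) in ZooKeeper is as follows:

Kafka storage structure in ZooKeeper

Note: most of Kafka's storage model is listed here. A few rarely used entries are not listed yet and will be added later.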

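Consumer initialization logic:

1. Instantiate and register the loadBalancerListener; ZKRebalancerListener watches for state changes of consumerIdString.

A consumer rebalance is triggered under the following conditions: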

ZKRebalancerListener: triggered when the child nodes of /kafka01/consumers/[consumer-group]/ids change.

ZKTopicPartitionChangeListener: triggered when the partitions of the topic change.

val topicPath = "/kafka01/brokers/topics" + "/" + "topic-1"
zkClient.subscribeDataChanges(topicPath, topicPartitionChangeListener)

Original article: http://blog.csdn.net/lizhitao/article/details/38458631
