Partition assignment behavior when Kafka consumers are grouped

First, create the configuration the Consumer needs. At a minimum there are five settings:

  • The address of the Kafka cluster.
  • How the Key of each received Message is deserialized.
  • How the Value of each received Message is deserialized.
  • The Consumer Group to join.
  • The policy that decides where to start reading Messages when there is no valid committed offset.

Properties properties = new Properties();
properties.setProperty(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "IP:Port");
properties.setProperty(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
properties.setProperty(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
properties.setProperty(ConsumerConfig.GROUP_ID_CONFIG, "consumer_group_1");
properties.setProperty(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "latest"); // other options: earliest, none

Then pass in the configuration built above to instantiate the Consumer:

KafkaConsumer<String, String> consumer = new KafkaConsumer<String, String>(properties);

Then subscribe to a Topic with the Consumer's subscribe(Collection<String> topics) method:

consumer.subscribe(Arrays.asList("first_topic"));

Finally, fetch the Messages from the Topic and write them to the log:

while(true) {
	ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
	for(ConsumerRecord<String, String> record : records) {
		logger.info("Key: " + record.key() + ", Value: " + record.value());
		logger.info("Partition: " + record.partition() + ", Offset: " + record.offset());
	}
}

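The Consumer's poll(Duration timeout) method takes a timeout: the maximum time the call blocks waiting for data when none is available, rather than a fixed fetch interval. Also recall the Consumer Poll Options subsection of the earlier Consumer chapter, which covered four configuration options that control how the Consumer fetches Messages; all of them can be set in Properties.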
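
As a rough sketch of how such options could be set, the snippet below assumes the four options in question are fetch.min.bytes, max.poll.records, max.partition.fetch.bytes and fetch.max.bytes (this section does not name them), and fills in what we understand to be the client's default values. It uses plain string keys with java.util.Properties so that it runs without the Kafka client library; the class name PollOptionsSketch and the method withPollOptions are hypothetical.

```java
import java.util.Properties;

// Hypothetical helper. The four option names are an assumption about what the
// earlier section covered; the values are intended to match the client defaults.
final class PollOptionsSketch {

    // Returns a copy of the given configuration with the four fetch-related options added.
    static Properties withPollOptions(Properties base) {
        Properties properties = new Properties();
        properties.putAll(base);
        // Minimum amount of data the broker should return for one fetch request.
        properties.setProperty("fetch.min.bytes", "1");
        // Maximum number of records returned by a single poll() call.
        properties.setProperty("max.poll.records", "500");
        // Maximum amount of data returned per partition (1 MB).
        properties.setProperty("max.partition.fetch.bytes", "1048576");
        // Maximum amount of data returned per fetch request (50 MB).
        properties.setProperty("fetch.max.bytes", "52428800");
        return properties;
    }

    public static void main(String[] args) {
        Properties base = new Properties();
        base.setProperty("group.id", "consumer_group_1");
        Properties tuned = withPollOptions(base);
        System.out.println(tuned.getProperty("max.poll.records"));
    }
}
```

The resulting Properties object can be passed to the KafkaConsumer constructor in the same way as the configuration shown earlier.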

After starting the Java Consumer, the console shows the following:

[main] INFO org.apache.kafka.common.utils.AppInfoParser - Kafka version : 2.0.0
[main] INFO org.apache.kafka.common.utils.AppInfoParser - Kafka commitId : 3402a8361b734732
[main] INFO org.apache.kafka.clients.Metadata - Cluster ID: 4nh_0r5iQ_KsR_Fzf1HTGg
[main] INFO org.apache.kafka.clients.consumer.internals.AbstractCoordinator - [Consumer clientId=consumer-1, groupId=consumer_group_1] Discovered group coordinator IP:9092 (id: 2147483647 rack: null)
[main] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-1, groupId=consumer_group_1] Revoking previously assigned partitions []
[main] INFO org.apache.kafka.clients.consumer.internals.AbstractCoordinator - [Consumer clientId=consumer-1, groupId=consumer_group_1] (Re-)joining group
[main] INFO org.apache.kafka.clients.consumer.internals.AbstractCoordinator - [Consumer clientId=consumer-1, groupId=consumer_group_1] Successfully joined group with generation 1
[main] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-1, groupId=consumer_group_1] Setting newly assigned partitions [first_topic-0, first_topic-1, first_topic-2]
[main] INFO org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=consumer-1, groupId=consumer_group_1] Resetting offset for partition first_topic-0 to offset 23.
[main] INFO org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=consumer-1, groupId=consumer_group_1] Resetting offset for partition first_topic-1 to offset 24.
[main] INFO org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=consumer-1, groupId=consumer_group_1] Resetting offset for partition first_topic-2 to offset 21.

In the output above, note the line Setting newly assigned partitions [first_topic-0, first_topic-1, first_topic-2]. It means this Consumer will receive Messages from all Partitions of the Topic first_topic.

Now let's start a second Consumer in the same group as the first and see what it prints:

[main] INFO org.apache.kafka.common.utils.AppInfoParser - Kafka version : 2.0.0
[main] INFO org.apache.kafka.common.utils.AppInfoParser - Kafka commitId : 3402a8361b734732
[main] INFO org.apache.kafka.clients.Metadata - Cluster ID: 4nh_0r5iQ_KsR_Fzf1HTGg
[main] INFO org.apache.kafka.clients.consumer.internals.AbstractCoordinator - [Consumer clientId=consumer-1, groupId=consumer_group_1] Discovered group coordinator IP:9092 (id: 2147483647 rack: null)
[main] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-1, groupId=consumer_group_1] Revoking previously assigned partitions []
[main] INFO org.apache.kafka.clients.consumer.internals.AbstractCoordinator - [Consumer clientId=consumer-1, groupId=consumer_group_1] (Re-)joining group
[main] INFO org.apache.kafka.clients.consumer.internals.AbstractCoordinator - [Consumer clientId=consumer-1, groupId=consumer_group_1] Successfully joined group with generation 2
[main] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-1, groupId=consumer_group_1] Setting newly assigned partitions [first_topic-2]

The newly started Consumer prints Setting newly assigned partitions [first_topic-2], which means it will receive Messages from only one Partition of first_topic.

Now go back to the console of the first Consumer:

[main] INFO org.apache.kafka.clients.consumer.internals.AbstractCoordinator - [Consumer clientId=consumer-1, groupId=consumer_group_1] Attempt to heartbeat failed since group is rebalancing
[main] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-1, groupId=consumer_group_1] Revoking previously assigned partitions [first_topic-0, first_topic-1, first_topic-2]
[main] INFO org.apache.kafka.clients.consumer.internals.AbstractCoordinator - [Consumer clientId=consumer-1, groupId=consumer_group_1] (Re-)joining group
[main] INFO org.apache.kafka.clients.consumer.internals.AbstractCoordinator - [Consumer clientId=consumer-1, groupId=consumer_group_1] Successfully joined group with generation 2
[main] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-1, groupId=consumer_group_1] Setting newly assigned partitions [first_topic-0, first_topic-1]

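The new output from the first Consumer is the key part. First comes Attempt to heartbeat failed since group is rebalancing, which shows that the group is rebalancing: because a new member joined, Kafka automatically redistributes the Topic's Partitions among the Consumers in the Consumer Group.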

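Next comes Setting newly assigned partitions [first_topic-0, first_topic-1], which shows that the first Consumer no longer receives Messages from the first_topic-2 Partition. This confirms the concept covered in the Consumer Group subsection of the Consumer chapter: within a group, each Partition is consumed by only one Consumer.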
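
The split seen in the logs ([first_topic-0, first_topic-1] for one Consumer and [first_topic-2] for the other) is what a range-style assignment would produce, which we believe is the default strategy for this client version. The sketch below simulates that idea with plain Java; it is not the library's implementation. The class name RangeAssignmentSketch and the member names consumer-a and consumer-b are hypothetical (real member IDs are generated by Kafka, and their sort order decides who gets the extra Partition).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified simulation of a range-style assignment for a single topic.
final class RangeAssignmentSketch {

    // Splits partitions 0..numPartitions-1 into contiguous ranges, one per member.
    // Members earlier in the (already sorted) list receive one extra partition
    // when the partitions do not divide evenly.
    static Map<String, List<Integer>> assign(List<String> sortedMembers, int numPartitions) {
        if (sortedMembers.isEmpty()) {
            throw new IllegalArgumentException("The group needs at least one member");
        }
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        int perMember = numPartitions / sortedMembers.size();
        int extra = numPartitions % sortedMembers.size();
        int next = 0;
        for (int i = 0; i < sortedMembers.size(); i++) {
            int count = perMember + (i < extra ? 1 : 0);
            List<Integer> partitions = new ArrayList<>();
            for (int j = 0; j < count; j++) {
                partitions.add(next++);
            }
            assignment.put(sortedMembers.get(i), partitions);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // One member: it owns all three partitions.
        System.out.println(assign(Arrays.asList("consumer-a"), 3));
        // Two members: the partitions are split after a rebalance.
        System.out.println(assign(Arrays.asList("consumer-a", "consumer-b"), 3));
    }
}
```

With three Partitions, one member receives [0, 1, 2]; once a second member joins, the first receives [0, 1] and the second [2], matching the two console outputs above.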

Java Consumer with Assign and Seek

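Suppose we have a temporary Consumer that should not join any Consumer Group, and that needs to read from a specific Partition of a Topic starting at a specific Message Offset. Fortunately, Kafka provides an API for exactly this.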

First, when building the configuration, we no longer need to specify a Consumer Group:

Properties properties = new Properties();
properties.setProperty(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, KafkaConstant.BOOTSTRAP_SERVER);
properties.setProperty(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
properties.setProperty(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
properties.setProperty(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest"); // other options: latest, none

Then instantiate a TopicPartition with the Topic name and the Partition number, and hand it to the Consumer with the assign(Collection<TopicPartition> partitions) method:

TopicPartition topicPartition = new TopicPartition("first_topic", 0);
consumer.assign(Arrays.asList(topicPartition));

Next, specify the Message Offset to start from:

long offset = 21L;
consumer.seek(topicPartition, offset);

Running this Consumer produces the following output:

[main] INFO org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=consumer-1, groupId=] Fetch offset 21 is out of range for partition first_topic-0, resetting offset
[main] INFO org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=consumer-1, groupId=] Resetting offset for partition first_topic-0 to offset 22.
[main] INFO com.devtalking.jacefu.kafka.tutorial.ConsumerDemoAssignSeek - Key: null, Value: hello world!
[main] INFO com.devtalking.jacefu.kafka.tutorial.ConsumerDemoAssignSeek - Partition: 0, Offset: 22
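
In the log above, the requested offset 21 was out of range for first_topic-0, so the Consumer fell back to the configured reset policy ("earliest") and started from offset 22. The sketch below models that decision with plain Java. It is a simplification, not the client's code: the class name OffsetResetSketch is hypothetical, the log end offset of 23 is an assumed value (the log only shows that offset 22 exists), and a standard IllegalStateException stands in for the exception the real client throws under the "none" policy.

```java
// Simplified model of how an out-of-range fetch offset is resolved.
final class OffsetResetSketch {

    // logStart is the first offset still present in the partition;
    // logEnd is the offset the next produced message will receive.
    static long resolve(long requested, long logStart, long logEnd, String policy) {
        if (requested >= logStart && requested <= logEnd) {
            return requested; // In range: the seek position is used as-is.
        }
        switch (policy) {
            case "earliest":
                return logStart; // Jump to the oldest available message.
            case "latest":
                return logEnd;   // Skip to the end and wait for new messages.
            case "none":
                // Stand-in for the error the real client raises in this case.
                throw new IllegalStateException("Offset " + requested + " is out of range");
            default:
                throw new IllegalArgumentException("Unknown reset policy: " + policy);
        }
    }

    public static void main(String[] args) {
        // Values taken from the log above, except logEnd, which is assumed.
        System.out.println(resolve(21L, 22L, 23L, "earliest"));
    }
}
```

Under these assumptions the call resolves to 22, the same offset reported by the Resetting offset line in the log.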

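If we list the groups with the Consumer Group CLI, a second group appears alongside consumer_group_1, which suggests that a temporary Consumer Group was created for this operation as well (although the KMOffsetCache- prefix may instead belong to a monitoring tool such as Kafka Manager running on the same host):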

root@iZ2ze2booskait1cxxyrljZ:~# kafka-consumer-groups.sh --bootstrap-server 127.0.0.1:9092 --list

consumer_group_1
KMOffsetCache-iZ2ze2booskait1cxxyrljZ