Batch size issue with the MapR Streams Kafka API

Question

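Hello, I am using the Kafka API of MapR Streams to receive events from a MapR Streams topic.

I am trying to increase my consumer's batch size, but I never get more than 30 messages in a single batch!

A single event is about 5000 bytes. If the events are smaller, I get more of them in a batch.

Here is my consumer configuration: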

public static void main( String[] args ) {
        final Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "batchSize");
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "true");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "latest");
        props.put(ConsumerConfig.FETCH_MIN_BYTES_CONFIG, 50000);
        props.put(ConsumerConfig.RECEIVE_BUFFER_CONFIG, 26214400);
        props.put(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, 100 * 1024 * 1024);
        props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, 1000);


        Consumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Collections.singletonList(TOPIC));
        long totalCount = 0;
        long start = System.currentTimeMillis();
        long countTimesNoMessages = 0;

        while (countTimesNoMessages < 10) {
            ConsumerRecords<String, String> records = consumer.poll(1000);
            totalCount += records.count();
            System.out.println(records.count());
            if (records.count() == 0) {
                countTimesNoMessages++;
            }
        }

        long end = System.currentTimeMillis();
        System.out.println((end - start) + " for " + totalCount + " messages");
    }

Answer 1

These are the configuration points that may apply.

https://mapr.com/docs/61/MapR_Streams/configuration-parameters.html

Note that fetch.max.bytes is the overall maximum: the sum of max.partition.fetch.bytes across all partitions cannot exceed fetch.max.bytes.

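It is normal to tune max.partition.fetch.bytes so that more than 64 KB (the default) is polled from each partition, and to tune fetch.max.bytes along with it so that the max.partition.fetch.bytes setting can actually take effect.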
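As a rough illustration of the sizing rule above, here is a minimal sketch that derives fetch.max.bytes from a per-partition limit and estimates how many records a single poll could return. The class name FetchSizing, the partition count of 4, and the 1 MiB per-partition value are assumptions made for the example, not values from the question. The property keys are given as plain strings so the sketch runs without the Kafka client on the classpath.

```java
import java.util.Properties;

public class FetchSizing {

    // Builds the two fetch-size properties so that fetch.max.bytes covers
    // max.partition.fetch.bytes summed over all partitions (the rule stated above).
    public static Properties fetchProps(int partitions, int maxPartitionFetchBytes) {
        // multiplyExact throws ArithmeticException instead of silently overflowing int
        int fetchMaxBytes = Math.multiplyExact(partitions, maxPartitionFetchBytes);
        Properties props = new Properties();
        props.put("max.partition.fetch.bytes", Integer.toString(maxPartitionFetchBytes));
        props.put("fetch.max.bytes", Integer.toString(fetchMaxBytes));
        return props;
    }

    // Rough upper bound on records per poll: limited by the byte budget
    // and by max.poll.records, whichever is smaller.
    public static long maxRecordsPerPoll(long fetchMaxBytes, long avgRecordBytes, long maxPollRecords) {
        return Math.min(fetchMaxBytes / avgRecordBytes, maxPollRecords);
    }

    public static void main(String[] args) {
        // Hypothetical values: 4 partitions, 1 MiB per partition
        Properties props = fetchProps(4, 1024 * 1024);
        System.out.println(props.getProperty("fetch.max.bytes"));
        // 5000-byte events and max.poll.records = 1000, as in the question
        long fetchMaxBytes = Long.parseLong(props.getProperty("fetch.max.bytes"));
        System.out.println(maxRecordsPerPoll(fetchMaxBytes, 5000, 1000));
    }
}
```

The resulting properties could be merged into the consumer's existing Properties object before the KafkaConsumer is created. The estimate ignores per-record overhead, so the real number of records per poll may be lower.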

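You probably should not set the batch size too large. Once the rate of poll requests against the stream drops below a few hundred per second, you are unlikely to see any further performance improvement, and you become much more likely to run into hot-spotting problems or a large amount of redone work if a thread fails.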
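To make the trade-off concrete, the following sketch computes how many poll requests per second a consumer would issue at a given event rate and batch size. The class name PollRate and the rate of 100,000 events per second are assumptions chosen for illustration only, not measurements from this thread.

```java
public class PollRate {

    // Poll requests per second needed to keep up with the incoming event rate
    // when each poll returns recordsPerPoll records.
    public static double pollsPerSecond(double eventsPerSecond, int recordsPerPoll) {
        return eventsPerSecond / recordsPerPoll;
    }

    public static void main(String[] args) {
        double eventsPerSecond = 100_000; // hypothetical load
        for (int batch : new int[] {30, 100, 500, 1000}) {
            System.out.printf("batch=%d -> %.0f polls/s%n",
                    batch, pollsPerSecond(eventsPerSecond, batch));
        }
    }
}
```

Under these assumed numbers, a batch of 500 records already brings the poll rate down to 200 per second, which is in the range where the answer expects little further gain from larger batches.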
