How to use partitions in order to parallel consume one topic in Kafka with .NET Core C#?

We are using the .NET Kafka client to consume messages from one topic in C# code. However, it seems to be a wee bit too slow.

I was wondering if we could parallelize the process a bit, so I checked this answer: "Kafka how to consume one topic parallel".

But I don't really see how to implement this partition thing with the .NET Kafka client in my example below:

var consumerBuilder = new ConsumerBuilder<Ignore, string>(GetConfig())
    .SetErrorHandler((_, e) => _logger.LogError("Kafka consumer error on Revenue response. {@KafkaConsumerError}", e));

using (var consumer = consumerBuilder.Build())
{
    consumer.Subscribe(RevenueResponseTopicName);

    try
    {
        while (!stoppingToken.IsCancellationRequested)
        {
            var consumeResult = consumer.Consume(stoppingToken);

            RevenueTopicResponseModel revenueResponse;
            try
            {
                revenueResponse = JsonConvert.DeserializeObject<RevenueTopicResponseModel>(consumeResult.Value);
            }
            catch
            {
                _logger.LogCritical("Impossible to deserialize the response. {@RevenueConsumeResult}", consumeResult);
                continue;
            }
            _logger.LogInformation("Revenue response received from Kafka. {RevenueTopicResponse}",
                consumeResult.Value);

            await _revenueService.RevenueResultReceivedAsync(revenueResponse);
        }
    }
    catch (OperationCanceledException)
    {
        _logger.LogInformation($"Operation canceled. Closing {nameof(RevenueResponseConsumer)}.");
        consumer.Close();
    }
    catch (Exception e)
    {
        _logger.LogCritical(e, $"Unhandled exception during {nameof(RevenueResponseConsumer)}.");
    }
}

You need to create the topic with multiple partitions, let's say 10. In your code, create 10 consumers with the same consumer group. The brokers will then assign the partitions to your consumers, so each one receives a share of the topic's messages.

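Basically, just put your code inside a for loop, giving each consumer its own task so that the blocking consume loops run side by side:

var consumerTasks = new List<Task>();

for (int i = 0; i < 10; i++)
{
    // Run each consumer on its own task; a blocking consume loop placed
    // directly in the for loop would never let the next consumer start.
    consumerTasks.Add(Task.Run(async () =>
    {
        var consumerBuilder = new ConsumerBuilder<Ignore, string>(GetConfig())
            .SetErrorHandler((_, e) => _logger.LogError("Kafka consumer error on Revenue response. {@KafkaConsumerError}", e));

        using (var consumer = consumerBuilder.Build())
        {
            // your processing here
        }
    }));
}

await Task.WhenAll(consumerTasks);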

To answer this question correctly, we need to know the reason behind the partitioning requirement.

If your topic doesn't have many messages to process, then partitioning is not the right tool. If the issue is that processing a single message takes too much time and you want to parallelize the work, then you could add consumed messages to a Channel and have as many consumers of that channel as needed running in the background.

Basically, you should still use a single consumer per process, since a consumer already uses threads in the background.
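The Channel idea above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the answerer's code: `ChannelFanOutSketch`, `RunAsync`, the `handler` delegate, and the capacity of 100 are made-up names and values, and the in-memory `messages` sequence stands in for the single `consumer.Consume(...)` loop.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Channels;
using System.Threading.Tasks;

public static class ChannelFanOutSketch
{
    // One producer (standing in for the Kafka consume loop) writes to a
    // bounded channel; workerCount background workers read from it.
    public static async Task<ConcurrentBag<string>> RunAsync(
        IEnumerable<string> messages,
        int workerCount,
        Func<string, Task<string>> handler)
    {
        // Bounded so that a slow handler applies back-pressure to the
        // producer instead of letting the queue grow without limit.
        var channel = Channel.CreateBounded<string>(new BoundedChannelOptions(100)
        {
            SingleWriter = true,
            FullMode = BoundedChannelFullMode.Wait
        });

        var results = new ConcurrentBag<string>();

        // Start the workers; ToArray() forces the tasks to be created now.
        var workers = Enumerable.Range(0, workerCount)
            .Select(_ => Task.Run(async () =>
            {
                await foreach (var message in channel.Reader.ReadAllAsync())
                {
                    results.Add(await handler(message));
                }
            }))
            .ToArray();

        // In a real service this loop would be the consumer.Consume(...) loop.
        foreach (var message in messages)
        {
            await channel.Writer.WriteAsync(message);
        }

        // Signal that no more messages are coming so the workers can finish.
        channel.Writer.Complete();
        await Task.WhenAll(workers);
        return results;
    }
}
```

Note that once several workers process messages concurrently, the ordering within a partition is no longer preserved, and an offset may be committed before its message has finished processing, so this fits best when the handler is idempotent or ordering does not matter.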

You may also find my thoughts on the Kafka consumer in C# in the article

If you have any questions, please feel free to ask! I'll be glad to help you.

You can commit after a set of offsets instead of committing on each offset, which could give you some performance benefit.

// Commit only on every fifth offset instead of after every message.
if (result.Offset.Value % 5 == 0)
{
    consumer.Commit(result);
}

This assumes EnableAutoCommit = false.
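A slightly fuller sketch of the same idea, assuming the Confluent.Kafka package and a reachable broker. The broker address, group id, topic name, and the `BatchCommitSketch` and `CommitEvery` names are placeholders, not values from the question. It counts processed messages rather than testing the offset, because offsets on different partitions do not advance together, and it commits whatever is left over on shutdown.

```csharp
using System;
using System.Threading;
using Confluent.Kafka;

public static class BatchCommitSketch
{
    // Hypothetical values; replace with your own.
    private const string Topic = "revenue-response";
    private const int CommitEvery = 5;

    public static void Run(CancellationToken stoppingToken)
    {
        var config = new ConsumerConfig
        {
            BootstrapServers = "localhost:9092", // assumption
            GroupId = "revenue-consumers",       // assumption
            EnableAutoCommit = false             // offsets are committed manually below
        };

        using var consumer = new ConsumerBuilder<Ignore, string>(config).Build();
        consumer.Subscribe(Topic);

        var pending = 0; // messages consumed since the last commit
        try
        {
            while (!stoppingToken.IsCancellationRequested)
            {
                var result = consumer.Consume(stoppingToken);

                // Process result.Message.Value here.

                if (++pending >= CommitEvery)
                {
                    // Commits the current offsets for every assigned partition.
                    consumer.Commit();
                    pending = 0;
                }
            }
        }
        catch (OperationCanceledException)
        {
            // Normal shutdown path.
        }
        finally
        {
            // Commit the tail of the last batch before leaving the group.
            if (pending > 0)
            {
                consumer.Commit();
            }
            consumer.Close();
        }
    }
}
```

A larger `CommitEvery` means fewer round trips to the broker, but more messages are re-delivered if the process crashes between commits.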
