How to properly implement a Kafka consumer as a background service on .NET Core

Question

I have implemented a Kafka consumer as a console application using BackgroundService on .NET Core 2.2. I am using confluent-kafka-dotnet v1.0.1.1 as the client for Apache Kafka. I am not sure how each message should be processed.

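Since processing a single message can take a long time (up to 24 hours), I start a new Task for each message so that I don't block the consumer from consuming new messages. I suspect that creating a new Task every time is not the right approach if there are too many messages. What is the proper way to process each message, then? Is it possible to create some kind of dynamic background service for each message?
If a message is already being processed when the application crashes or a rebalance occurs, I end up consuming and processing the same message more than once. Should I commit the offset automatically (or right after the message is consumed) and store the state of the message (or task) somewhere, such as a database?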

I know that Hangfire exists, but I am not sure whether I need it. If my current approach is completely wrong, please give me some suggestions.

Here is the implementation of ConsumerService:

public class ConsumerService : BackgroundService
{
    private readonly IConfiguration _config;
    private readonly IElasticLogger _logger;
    private readonly ConsumerConfig _consumerConfig;
    private readonly string[] _topics;
    private readonly double _maxNumAttempts;
    private readonly double _retryIntervalInSec;

    public ConsumerService(IConfiguration config, IElasticLogger logger)
    {
        _config = config;
        _logger = logger;
        _consumerConfig = new ConsumerConfig
        {
            BootstrapServers = _config.GetValue<string>("Kafka:BootstrapServers"),
            GroupId = _config.GetValue<string>("Kafka:GroupId"),
            EnableAutoCommit = _config.GetValue<bool>("Kafka:Consumer:EnableAutoCommit"),
            AutoOffsetReset = (AutoOffsetReset)_config.GetValue<int>("Kafka:Consumer:AutoOffsetReset")
        };
        _topics = _config.GetValue<string>("Kafka:Consumer:Topics").Split(',');
        _maxNumAttempts = _config.GetValue<double>("App:MaxNumAttempts");
        _retryIntervalInSec = _config.GetValue<double>("App:RetryIntervalInSec");
    }

    protected override Task ExecuteAsync(CancellationToken stoppingToken)
    {
        Console.WriteLine("!!! CONSUMER STARTED !!!\n");
        
        // Starting a new Task here because Consume() method is synchronous
        var task = Task.Run(() => ProcessQueue(stoppingToken), stoppingToken);

        return task;
    }

    private void ProcessQueue(CancellationToken stoppingToken)
    {
        using (var consumer = new ConsumerBuilder<Ignore, Request>(_consumerConfig).SetValueDeserializer(new MessageDeserializer()).Build())
        {
            consumer.Subscribe(_topics);

            try
            {
                while (!stoppingToken.IsCancellationRequested)
                {
                    try
                    {
                        var consumeResult = consumer.Consume(stoppingToken);

                        // Don't want to block consume loop, so starting new Task for each message  
                        Task.Run(async () =>
                        {
                            var currentNumAttempts = 0;
                            var committed = false;

                            var response = new Response();

                            while (currentNumAttempts < _maxNumAttempts)
                            {
                                currentNumAttempts++;

                                // SendDataAsync is a method that sends http request to some end-points
                                response = await Helper.SendDataAsync(consumeResult.Value, _config, _logger);

                                if (response != null && response.Code >= 0)
                                {
                                    try
                                    {
                                        consumer.Commit(consumeResult);
                                        committed = true;
                                        
                                        break;
                                    }
                                    catch (KafkaException ex)
                                    {
                                        // log
                                    }
                                }
                                else
                                {
                                    // log
                                }
                                
                                if (currentNumAttempts < _maxNumAttempts)
                                {
                                    // Delay between tries
                                    await Task.Delay(TimeSpan.FromSeconds(_retryIntervalInSec));
                                }
                            }
                                                    
                            if (!committed)
                            {
                                try
                                {
                                    consumer.Commit(consumeResult);
                                }
                                catch (KafkaException ex)
                                {
                                    // log
                                }
                            }
                        }, stoppingToken);
                    }
                    catch (ConsumeException ex)
                    {
                        // log
                    }
                }
            }
            catch (OperationCanceledException ex)
            {
                // log
                consumer.Close();
            }
        }
    }
}

Answer 1

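I agree with Fabio that you should not use Task.Run to process each message: you would end up with a large number of threads wasting resources and switching between one another, which hurts performance.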

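Moreover, it is fine to process a consumed message on the same thread, because Kafka uses a pull model and your application can process messages at its own pace.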
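To make the same-thread approach concrete, here is a minimal sketch of a loop that consumes, handles, and commits one message at a time. It is an illustration under assumptions, not the asker's final code: the class name `SequentialConsumer`, the `handle` callback, and the use of plain string values are hypothetical, and manual commits assume `EnableAutoCommit` is set to false in the config.

```csharp
using System;
using System.Threading;
using Confluent.Kafka;

public static class SequentialConsumer
{
    // Consumes and handles one message at a time on the calling thread.
    // `handle` is a hypothetical callback standing in for the real work
    // (for example, the HTTP call made in the question).
    public static void Run(ConsumerConfig config, string topic,
                           Action<string> handle, CancellationToken token)
    {
        using (var consumer = new ConsumerBuilder<Ignore, string>(config).Build())
        {
            consumer.Subscribe(topic);
            try
            {
                while (!token.IsCancellationRequested)
                {
                    // Blocks until a message arrives or the token is cancelled.
                    var result = consumer.Consume(token);

                    // Process on the same thread; the next message is not
                    // pulled until this call returns.
                    handle(result.Value);

                    // Commit only after the message was handled successfully.
                    consumer.Commit(result);
                }
            }
            catch (OperationCanceledException)
            {
                // Normal shutdown path: cancellation was requested.
            }
            finally
            {
                // Leave the consumer group cleanly.
                consumer.Close();
            }
        }
    }
}
```

Because the offset is committed only after `handle` returns, a crash in the middle of processing leads to the message being delivered again rather than lost. A handler that runs for a very long time may exceed the client's `max.poll.interval.ms` setting, so this sketch fits best when each message is handled reasonably quickly.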

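As for processing a message more than once, I would suggest storing the offset of each processed message so that already-processed messages can be skipped. Because an offset is a long (64-bit integer) value, you can easily skip any message whose offset is lower than one committed earlier. Of course, this works only when you have a single partition, because Kafka provides the offset counter and ordering guarantees at the partition level.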
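The offset-skipping idea can be sketched as a small helper. This is only an illustration: `ProcessedOffsetTracker` is a hypothetical class, it keeps state in memory (a real service would persist it, for example in the database mentioned in the question), and it keys the stored offset by topic and partition as an assumption, since offsets are only comparable within one partition.

```csharp
using System.Collections.Generic;

// Hypothetical helper: remembers the highest processed offset
// for each (topic, partition) pair. In-memory only.
public class ProcessedOffsetTracker
{
    private readonly Dictionary<(string Topic, int Partition), long> _lastProcessed
        = new Dictionary<(string Topic, int Partition), long>();

    // True when this offset is at or below the last offset recorded
    // as processed for the same topic and partition.
    public bool IsAlreadyProcessed(string topic, int partition, long offset)
    {
        return _lastProcessed.TryGetValue((topic, partition), out var last)
               && offset <= last;
    }

    // Records the offset as processed; never moves the marker backwards.
    public void MarkProcessed(string topic, int partition, long offset)
    {
        var key = (topic, partition);
        if (!_lastProcessed.TryGetValue(key, out var last) || offset > last)
        {
            _lastProcessed[key] = offset;
        }
    }
}
```

In a consume loop, the check would run before any work starts, using the topic, partition, and offset carried by the consumed result, and `MarkProcessed` would be called once the message has been handled.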

You can find an example of a Kafka consumer in my article. If you have any questions, feel free to ask; I'd be glad to help.

Solution 1
0 2021-01-18 15:52:27