How to implement Redis streams with C# Rx

Because I could not find any implementation that does not use a loop to get the stream content, I started to implement one, but I am facing several problems that some of you may be able to help me with.

The implementation uses a combination of Pub/Sub and the stream:

* log -> the stream channel
* log:notify -> the Pub/Sub notification channel
* log:lastReadMessage -> holds the id of the last message read from the stream

Publisher

    using System;
    using System.Threading.Tasks;
    using StackExchange.Redis;

        static async Task Main(string[] args)
        {
            var connectionMultiplexer = await ConnectionMultiplexer.ConnectAsync("localhost");
            var redisDb = connectionMultiplexer.GetDatabase(1);

            // The subscriber is tied to the multiplexer and thread-safe,
            // so create it once instead of on every iteration.
            var publisher = connectionMultiplexer.GetSubscriber();

            while (true)
            {
                var value = new NameValueEntry[]
                {
                    new NameValueEntry("id", Guid.NewGuid().ToString()),
                    new NameValueEntry("timestamp", DateTime.UtcNow.ToString())
                };

                // Append the entry to the stream, then notify subscribers.
                await redisDb.StreamAddAsync("log", value);
                await publisher.PublishAsync("log:notify", string.Empty, CommandFlags.None);
                await Task.Delay(TimeSpan.FromSeconds(1));
            }
        }

Subscriber

    using System;
    using System.Reactive.Disposables;
    using System.Reactive.Linq;
    using System.Threading;
    using System.Threading.Tasks;
    using StackExchange.Redis;

        static async Task Main(string[] args)
        {
            var connectionMultiplexer = await ConnectionMultiplexer.ConnectAsync("localhost");
            var redisDb = connectionMultiplexer.GetDatabase(1);

            // Subscribe returns an IDisposable subscription, not a stream.
            var subscription = CreateTaskFromStream(connectionMultiplexer, redisDb, "log")
                .Subscribe(x =>
                {
                    Console.WriteLine(x);
                });

            Console.ReadLine();
            subscription.Dispose();
        }

        // Guards the stream read so one notification is processed at a time.
        private static SemaphoreSlim taskFromStreamBlocker = new SemaphoreSlim(1);

        private static IObservable<string> CreateTaskFromStream(ConnectionMultiplexer connection, IDatabase redisDb, string channel)
        {
            var lastReadMessage = "0-0";

            // Restore the last processed stream id, or store the initial one.
            var lastReadMessageData = redisDb.StringGet($"{channel}:lastReadMessage", CommandFlags.None);
            if (string.IsNullOrEmpty(lastReadMessageData))
            {
                redisDb.StringSet($"{channel}:lastReadMessage", lastReadMessage);
            }
            else
            {
                lastReadMessage = lastReadMessageData;
            }

            return Observable.Create<string>(obs =>
            {
                var subscriber = connection.GetSubscriber();
                subscriber.Subscribe($"{channel}:notify", async (ch, msg) =>
                {
                    // Skip this notification if a read is already in progress.
                    var acquired = await taskFromStreamBlocker
                        .WaitAsync(0)
                        .ConfigureAwait(false);

                    if (!acquired)
                    {
                        return;
                    }

                    try
                    {
                        var messages = await redisDb.StreamReadAsync(channel, lastReadMessage);

                        foreach (var message in messages)
                        {
                            obs.OnNext($"{message.Id} -> {message.Values[0].Name}: {message.Values[0].Value} / {message.Values[1].Name}: {message.Values[1].Value}");
                            lastReadMessage = message.Id;
                        }

                        // StringSet overwrites the key; no delete/get-set pair needed.
                        redisDb.StringSet($"{channel}:lastReadMessage", lastReadMessage);
                    }
                    finally
                    {
                        // Release in finally so an exception cannot leave the lock held.
                        taskFromStreamBlocker.Release();
                    }
                });

                // Unsubscribe from the notification channel, not the stream key.
                return Disposable.Create(() => subscriber.Unsubscribe($"{channel}:notify"));
            });
        }

Why the semaphore?

Because lots of messages could be added to the stream, and I don't want to have the same message processed twice.

THE PROBLEMS

  1. If we have unprocessed messages in the stream, how can we process them without an event from Pub/Sub? On startup we can check whether there are unprocessed messages and process them, but if a new message is added to the stream during that window, before we have subscribed to Pub/Sub, the subscriber will not process it until the next notification arrives through Pub/Sub.

  2. The semaphore is important so the same message is not processed twice, but at the same time it is a curse. While one message is being processed, another can be added to the stream. When that happens, the subscriber will not process it right away but only the next time it is notified (at which point it will process two messages). A possible mitigation for both problems is sketched after this list.
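One way to mitigate both problems (a sketch, not a drop-in solution): subscribe to the notification channel first, make the handler drain the stream in a loop until StreamReadAsync returns nothing, and then call the same drain method once directly after subscribing. DrainStreamAsync below is a hypothetical helper that reuses the taskFromStreamBlocker semaphore from the code above.

        // A minimal sketch (hypothetical helper, reusing taskFromStreamBlocker
        // from above): keeps reading until the stream is empty, so messages
        // added while a batch is processed are handled in the same pass.
        private static async Task DrainStreamAsync(IDatabase redisDb, string channel, IObserver<string> obs)
        {
            // If another drain is already running it will pick up our messages,
            // because it loops until StreamReadAsync returns nothing.
            if (!await taskFromStreamBlocker.WaitAsync(0).ConfigureAwait(false))
            {
                return;
            }

            try
            {
                string lastReadMessage = await redisDb.StringGetAsync($"{channel}:lastReadMessage");
                if (string.IsNullOrEmpty(lastReadMessage))
                {
                    lastReadMessage = "0-0";
                }

                while (true)
                {
                    var messages = await redisDb.StreamReadAsync(channel, lastReadMessage);
                    if (messages.Length == 0)
                    {
                        break; // stream drained (a tiny race window remains)
                    }

                    foreach (var message in messages)
                    {
                        obs.OnNext($"{message.Id} -> {message.Values[0].Name}: {message.Values[0].Value}");
                        lastReadMessage = message.Id;
                    }

                    await redisDb.StringSetAsync($"{channel}:lastReadMessage", lastReadMessage);
                }
            }
            finally
            {
                taskFromStreamBlocker.Release();
            }
        }

Inside Observable.Create, the notification handler just calls DrainStreamAsync, and a direct call right after subscribing covers problem 1 (messages already in the stream at startup), while the drain loop covers most of problem 2 (messages added during processing are picked up in the same pass).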

How would you implement this? Is there an implementation of Redis streams using Rx only? The solution should not use some kind of loop and should be memory efficient. Is this possible?

Best wishes

Paulo Aboim Pinto

This is the solution with a WHILE loop that I want to avoid:

        private static IObservable<string> CreateTaskFromStream(ConnectionMultiplexer connection, IDatabase redisDb, string channel, CancellationToken cancellationToken)
        {
            var lastReadMessage = "0-0";

            // Restore the last processed stream id, or store the initial one.
            var lastReadMessageData = redisDb.StringGet($"{channel}:lastReadMessage", CommandFlags.None);
            if (string.IsNullOrEmpty(lastReadMessageData))
            {
                redisDb.StringSet($"{channel}:lastReadMessage", lastReadMessage);
            }
            else
            {
                lastReadMessage = lastReadMessageData;
            }

            return Observable.Create<string>(async obs =>
            {
                // Poll the stream until cancellation is requested.
                while (!cancellationToken.IsCancellationRequested)
                {
                    var messages = await redisDb.StreamReadAsync(channel, lastReadMessage);

                    foreach (var message in messages)
                    {
                        obs.OnNext($"{message.Id} -> {message.Values[0].Name}: {message.Values[0].Value} / {message.Values[1].Name}: {message.Values[1].Value}");
                        lastReadMessage = message.Id;
                    }

                    redisDb.StringSet($"{channel}:lastReadMessage", lastReadMessage);

                    await Task.Delay(TimeSpan.FromMilliseconds(500));
                }

                return Disposable.Empty;
            });
        }

And this is another solution, using a timer with a 200 ms interval:


        private static IObservable<string> CreateTaskFromStream(ConnectionMultiplexer connection, IDatabase redisDb, string channel, CancellationToken cancellationToken)
        {
            var lastReadMessage = "0-0";

            // Restore the last processed stream id, or store the initial one.
            var lastReadMessageData = redisDb.StringGet($"{channel}:lastReadMessage", CommandFlags.None);
            if (string.IsNullOrEmpty(lastReadMessageData))
            {
                redisDb.StringSet($"{channel}:lastReadMessage", lastReadMessage);
            }
            else
            {
                lastReadMessage = lastReadMessageData;
            }

            var instance = ThreadPoolScheduler.Instance;

            return Observable.Create<string>(obs =>
            {
                var disposable = Observable
                    .Interval(TimeSpan.FromMilliseconds(200), instance)
                    .Subscribe(async _ =>
                    {
                        var messages = await redisDb.StreamReadAsync(channel, lastReadMessage);

                        foreach (var message in messages)
                        {
                            obs.OnNext($"{message.Id} -> {message.Values[0].Name}: {message.Values[0].Value} / {message.Values[1].Name}: {message.Values[1].Value}");
                            lastReadMessage = message.Id;
                        }

                        redisDb.StringSet($"{channel}:lastReadMessage", lastReadMessage);
                    });
                cancellationToken.Register(() => disposable.Dispose());

                // Return the timer subscription so disposing the outer
                // subscription also stops the polling.
                return disposable;
            });
        }

I use a tight loop that just does an XRange and saves a position - KISS. If there is no work it backs off, so when there is a lot going on it is a tight loop and pretty fast.
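Sketched below is what such a loop can look like; the method name, batch size and backoff timings are illustrative assumptions, and a real version would persist the position as in the question's code:

    // A minimal sketch; ReadLoopAsync, the batch size and the backoff timings
    // are illustrative. Requires System.Linq and StackExchange.Redis.
    private static async Task ReadLoopAsync(IDatabase redisDb, string stream, CancellationToken token)
    {
        var position = "0-0";      // last processed id; persist it in real code
        var delay = TimeSpan.Zero; // current backoff

        while (!token.IsCancellationRequested)
        {
            // XRANGE from the saved position. StreamRange is inclusive of
            // minId, so filter out the entry that was already processed.
            var entries = await redisDb.StreamRangeAsync(stream, minId: position, count: 100);
            var newEntries = entries.Where(e => e.Id != position).ToArray();

            if (newEntries.Length == 0)
            {
                // No work: back off exponentially, capped at one second.
                delay = delay == TimeSpan.Zero
                    ? TimeSpan.FromMilliseconds(10)
                    : TimeSpan.FromMilliseconds(Math.Min(delay.TotalMilliseconds * 2, 1000));
                await Task.Delay(delay);
                continue;
            }

            delay = TimeSpan.Zero; // work found: back to a tight loop

            foreach (var entry in newEntries)
            {
                // ... hand the entry to the consumer here ...
                position = entry.Id;
            }
        }
    }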

If you need higher performance, e.g. reading while processing, I would caution against it in most cases:

  1. It creates a lot of complexity, and this part needs to be rock solid.
  2. Redis is normally fast enough.
  3. "I don't want to have the same message processed twice." Almost every system has at-least-once delivery; eliminating duplicates around crashes is mind-bogglingly difficult and slow. You can partially remove duplicates by using a hashset of ids, but it is pretty trivial for consumers to deal with them when messages are designed to be idempotent; trouble with duplicates is usually a sign of message design issues. If you partition each reader (a separate stream and one worker per stream) you can keep the hashset in memory, avoiding scaling / distributed issues; see the sketch after this list. Note that a Redis stream can preserve order; use this to make simpler idempotent messages.
  4. Exceptions. You don't want to stop processing a stream because a consumer has a logic exception on one message, e.g. you get a call at night because the whole system has stopped, and locks make this worse. Event data can't be changed, it has already happened, so processing it is best effort. Infrastructure / Redis exceptions, however, do need to throw and be retried. Managing this outside a loop is very painful.
  5. Simple back pressure. If you can't process the work fast enough, the loop slows down instead of creating a lot of tasks and blowing up all your memory.
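For points 3 and 4, a minimal sketch of the consumer side (HandleEntryAsync and processedIds are illustrative names, assuming one worker per stream so the hashset can stay in memory):

    // Requires System, System.Collections.Generic, System.Threading.Tasks
    // and StackExchange.Redis.
    private static readonly HashSet<string> processedIds = new HashSet<string>();

    private static async Task HandleEntryAsync(StreamEntry entry, Func<StreamEntry, Task> consumer)
    {
        // Point 3: drop duplicate deliveries using an in-memory hashset of ids.
        if (processedIds.Contains(entry.Id))
        {
            return;
        }

        try
        {
            await consumer(entry);
        }
        catch (RedisException)
        {
            // Point 4: infrastructure errors must bubble up and be retried,
            // so do not mark the entry as processed.
            throw;
        }
        catch (Exception ex)
        {
            // Point 4: a logic error on one message must not stop the stream;
            // the event has already happened, so this is best effort.
            Console.Error.WriteLine($"Failed to process {entry.Id}: {ex}");
        }

        processedIds.Add(entry.Id);
    }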

I don't use distributed locks / semaphores anymore.

If you are dealing with commands, e.g. "do something", rather than events, e.g. "xyz has happened", these can fail. Again, the consumer should deal with the case where it has already happened, not the Redis / stream-reading part.

Some libs with magic callbacks don't solve these issues: the callbacks still have to retry on timeouts, can run on any node, etc. The complexity and the issues are still there; they just move somewhere else.

You may put an observable on top for consumers, but this is basically cosmetic: it does not solve the problem, and if you look under many implementations you will see the same loop. I would not use one; instead, get the consumer to register an action, e.g.:

    public interface IStreamSubscriber
    {
        void RegisterEventCallBack(Func<object, IReadOnlyDictionary<string, string>, Task> callback);
        void RegisterBatchEventCallBack(Func<IEnumerable<(object msg, IReadOnlyDictionary<string, string> metaData)>, Task> batchCallback);
        void Start();
    }    

In your case the callback could feed the observable instead of using the loop directly, but there is still a low-level loop underneath, which can also do the message-to-object conversion for the consumer.
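For illustration, a minimal sketch of such a cosmetic layer (this helper is an assumption, not part of any library): the registered callback pushes each message into a Subject from System.Reactive, so consumers get an IObservable while the low-level loop stays hidden inside the IStreamSubscriber implementation.

    // Requires System.Reactive. A sketch only: error/completion handling omitted.
    public static IObservable<(object Msg, IReadOnlyDictionary<string, string> MetaData)>
        ToObservable(IStreamSubscriber subscriber)
    {
        var subject = new Subject<(object Msg, IReadOnlyDictionary<string, string> MetaData)>();

        // Bridge the callback into Rx; the underlying loop stays hidden.
        subscriber.RegisterEventCallBack((msg, metaData) =>
        {
            subject.OnNext((msg, metaData));
            return Task.CompletedTask;
        });

        subscriber.Start();
        return subject;
    }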
