Is it advisable to use Rx Distinct in a long-running process?

I am using the Rx Distinct operator to filter an external data stream based on a certain key within a long-running process.

Will this cause a memory leak, assuming a lot of different keys will be received? How does the Rx Distinct operator keep track of previously received keys?

Should I use GroupByUntil with a duration selector instead?

Observable.Distinct uses a HashSet internally. Memory usage will be roughly proportional to the number of distinct keys encountered (AFAIK about 30*n bytes).
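To make the memory behaviour concrete, here is a rough approximation of a keyed Distinct built from Where and an explicit HashSet. It is only a sketch of the idea, not the library's actual implementation, and it assumes the System.Reactive NuGet package; the names seenKeys and received are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq; // assumes the System.Reactive NuGet package

// Every key that has ever passed through stays in this set,
// so it grows with the number of distinct keys and never shrinks.
var seenKeys = new HashSet<int>();
var received = new List<int>();

var source = new[] { 1, 2, 1, 3, 2, 4 }.ToObservable();

// HashSet.Add returns false for a key that is already present,
// so only the first occurrence of each key gets through.
source
    .Where(key => seenKeys.Add(key))
    .Subscribe(key => received.Add(key));

Console.WriteLine(string.Join(",", received)); // 1,2,3,4
Console.WriteLine(seenKeys.Count);             // 4
```

The set holds one entry per distinct key for as long as the subscription lives, which is why the footprint depends on how many different keys arrive rather than on how many items arrive.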

GroupByUntil does something quite different from Distinct.
GroupByUntil (well) groups, whereas Distinct filters the elements of a stream.
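That said, if what you actually want is "distinct within a time window", GroupByUntil can be combined with Take(1) to get something along those lines. The following is a sketch under that assumption, not a drop-in replacement for Distinct: DistinctWithExpiry is a hypothetical helper name, the five-minute window is arbitrary, and the System.Reactive NuGet package is assumed.

```csharp
using System;
using System.Reactive.Linq; // assumes the System.Reactive NuGet package

// Hypothetical helper: lets the first item for each key through, then
// forgets the key once 'window' has elapsed since its group was opened.
static IObservable<T> DistinctWithExpiry<T, TKey>(
    IObservable<T> source,
    Func<T, TKey> keySelector,
    TimeSpan window) =>
    source
        // Each group is closed (and its key dropped) when its timer fires.
        .GroupByUntil(keySelector, group => Observable.Timer(window))
        // Only the first element of each group is forwarded.
        .SelectMany(group => group.Take(1));

// Usage: repeated keys inside the window are suppressed.
var keys = new[] { 1, 1, 2, 1 }.ToObservable();
DistinctWithExpiry(keys, key => key, TimeSpan.FromMinutes(5))
    .Subscribe(key => Console.WriteLine(key));
```

Unlike Distinct, a key that reappears after its window has expired will be emitted again, so this only fits if that is acceptable for your scenario.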

Not sure about the intended use, but if you just want to filter out consecutive identical elements, you need Observable.DistinctUntilChanged, which has a memory footprint independent of the number of keys.
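A small comparison of the two operators on the same input may help; this is just an illustrative sketch that assumes the System.Reactive NuGet package, with a made-up input sequence.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq; // assumes the System.Reactive NuGet package

var input = new[] { 1, 1, 2, 2, 1, 3, 3 }.ToObservable();

var untilChanged = new List<int>();
var distinct = new List<int>();

// Only remembers the previous element, so memory use stays constant.
input.DistinctUntilChanged().Subscribe(x => untilChanged.Add(x));

// Remembers every element seen so far.
input.Distinct().Subscribe(x => distinct.Add(x));

Console.WriteLine(string.Join(",", untilChanged)); // 1,2,1,3
Console.WriteLine(string.Join(",", distinct));     // 1,2,3
```

Note that DistinctUntilChanged lets the second 1 through because it is not adjacent to the first one, so the two operators are not interchangeable.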

This may be a controversial tactic, but if you are worried about distinct keys accumulating, and there is a point in time at which the set of keys can safely be reset, you could introduce a reset policy using Observable.Switch. For example, we have a scenario where the "state of the world" is reset on a daily basis, so we could reset the distinct observable daily.

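// pocos, schedulerService and Log come from the surrounding application.
Observable.Create<MyPoco>(
    observer =>
    {
        // Holds the currently active Distinct stream. Each new inner stream
        // starts with an empty set of remembered keys.
        var distinctPocos = new BehaviorSubject<IObservable<MyPoco>>(pocos.Distinct(x => x.Id));

        // First fires at the next UTC midnight, then every 24 hours.
        var timerSubscription =
            Observable.Timer(
                new DateTimeOffset(DateTime.UtcNow.Date.AddDays(1)),
                TimeSpan.FromDays(1),
                schedulerService.Default).Subscribe(
                    t =>
                    {
                        Log.Info("Daily reset - resetting distinct subscription.");
                        distinctPocos.OnNext(pocos.Distinct(x => x.Id));
                    });

        // Switch unsubscribes from the previous Distinct stream when a new one
        // arrives, which releases the keys it had accumulated.
        var pocoSubscription = distinctPocos.Switch().Subscribe(observer);

        return new CompositeDisposable(timerSubscription, pocoSubscription);
    });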

However, I do tend to agree with James World's comment above about testing with a memory profiler to check that memory is indeed an issue before introducing potentially unnecessary complexity. If you're accumulating 32-bit ints as the key, you'd have many millions of unique items before running into memory issues on most platforms. E.g. 262,144 32-bit int keys amount to one megabyte of raw key data (4 bytes each), before any HashSet overhead. It may be that you reset the process long before that point, depending on your scenario.
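Short of a full profiler run, a quick way to get a ballpark figure is to fill a HashSet and compare the managed heap size before and after. This is only a rough sketch: the key count is arbitrary, GC.GetTotalMemory gives an approximation, and the result will vary by runtime and platform.

```csharp
using System;
using System.Collections.Generic;

const int keyCount = 1_000_000; // arbitrary number of distinct keys

// Force a collection first so the baseline is as clean as possible.
long before = GC.GetTotalMemory(forceFullCollection: true);

var keys = new HashSet<int>();
for (int i = 0; i < keyCount; i++)
{
    keys.Add(i);
}

long after = GC.GetTotalMemory(forceFullCollection: true);

double bytesPerKey = (after - before) / (double)keyCount;
Console.WriteLine($"{keyCount} keys: roughly {bytesPerKey:F1} bytes per key");

// Keep the set reachable until after the second measurement.
GC.KeepAlive(keys);
```

Running this with a key type and volume that match your own stream should tell you whether the accumulated keys are worth worrying about at all.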
