How does IEnumerable differ from IObservable under the hood?

I'm curious how IEnumerable differs from IObservable under the hood. I understand the pull and push patterns respectively, but how does C#, in terms of memory etc., notify subscribers (for IObservable) that they should receive the next bit of data in memory to process? How does the observed instance know it has had a change in data to push to the subscribers?

My question comes from a test I was performing, reading in lines from a file. The file was about 6 MB in total.

Standard Time Taken: 4.7s, lines: 36587

Rx Time Taken: 0.68s, lines: 36587

How is Rx able to improve so massively on a normal iteration over each of the lines in the file?

private static void ReadStandardFile()
{
    var timer = Stopwatch.StartNew();
    var linesProcessed = 0;

    foreach (var l in ReadLines(new FileStream(_filePath, FileMode.Open)))
    {
        var s = l.Split(',');
        linesProcessed++;
    }

    timer.Stop();

    _log.DebugFormat("Standard Time Taken: {0}s, lines: {1}",
        timer.Elapsed.ToString(), linesProcessed);
}

private static void ReadRxFile()
{
    var timer = Stopwatch.StartNew();
    var linesProcessed = 0;

    var query = ReadLines(new FileStream(_filePath, FileMode.Open)).ToObservable();

    using (query.Subscribe((line) =>
    {
        var s = line.Split(',');
        linesProcessed++;
    }));

    timer.Stop();

    _log.DebugFormat("Rx Time Taken: {0}s, lines: {1}",
        timer.Elapsed.ToString(), linesProcessed);
}

private static IEnumerable<string> ReadLines(Stream stream)
{
    using (StreamReader reader = new StreamReader(stream))
    {
        while (!reader.EndOfStream)
            yield return reader.ReadLine();
    }
}

My hunch is that the behavior you're seeing reflects the OS caching the file. I would imagine that if you reversed the order of the calls, you would see a similar difference in speeds, just swapped.

You could improve this benchmark by performing a few warm-up runs, or by copying the input file to a temp file using File.Copy prior to testing each one. This way the file would not be "hot" and you would get a fair comparison.
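As a rough illustration of that suggestion, here is a minimal sketch of a harness that does both: it discards a few warm-up runs and gives every run its own fresh copy of the input made with File.Copy. The names `FairBenchmark` and `Measure`, and the default run counts, are made up for this example and do not come from the question. Whether a freshly copied file is really "cold" may depend on how the OS caches writes, so the point is mainly that both approaches get measured under the same conditions.

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class FairBenchmark
{
    // Hypothetical helper: times `readFile` against a fresh temp copy of
    // `sourcePath` on every run and returns the timings of the non-warm-up runs.
    public static TimeSpan[] Measure(string sourcePath, Action<string> readFile,
                                     int warmupRuns = 2, int timedRuns = 5)
    {
        var results = new TimeSpan[timedRuns];

        for (var run = 0; run < warmupRuns + timedRuns; run++)
        {
            // Copy the input to a new temp file so no run reuses a file
            // that an earlier run has already opened and read.
            var tempPath = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
            File.Copy(sourcePath, tempPath);
            try
            {
                var timer = Stopwatch.StartNew();
                readFile(tempPath);
                timer.Stop();

                // Warm-up runs absorb JIT compilation and other first-use
                // costs, so their timings are not recorded.
                if (run >= warmupRuns)
                    results[run - warmupRuns] = timer.Elapsed;
            }
            finally
            {
                // Always clean up the temp copy, even if readFile throws.
                File.Delete(tempPath);
            }
        }

        return results;
    }
}
```

To use it with the code from the question, ReadStandardFile and ReadRxFile would need to take the path as a parameter instead of reading _filePath, so that each can be passed as the `readFile` delegate. Comparing the two resulting arrays (for example, their minimum or median) should then show whether the gap survives once the order of the calls no longer matters.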

I'd suspect that you're seeing some kind of internal optimization of the CLR. It probably caches the content of the file in memory between the two calls so that ToObservable can pull the content much faster...

Edit: Oh, the good colleague with the crazy nickname, eh... @sixlettervariables was faster, and he's probably right: it's the OS doing the optimizing rather than the CLR.
