F# seq behavior

I'm a little baffled by the inner workings of sequence expressions in F#.

Normally, if we build a sequential file reader with seq and no intentional caching of data:

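// `file` is assumed to be a reader created outside the seq block.
seq {
    let mutable current = file.Read()
    while current <> -1 do
        yield current
        // Advance to the next value; without this the loop never ends.
        current <- file.Read()
}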

We end up with weird behavior if we try to re-iterate or backtrack. My understanding was that, since Read() is a function that depends on mutable state, we can't expect the output to be correct if we re-iterate. But then why does this behave nicely, even when reading across a buffer boundary?

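let Read path =
    seq {
        // Opened inside the seq block, so it runs once per enumeration.
        use fp = System.IO.File.OpenRead path
        let buf = Array.zeroCreate<byte> 1024
        let mutable pos = 1        // bytes returned by the last fp.Read call
        let mutable current = 0    // index of the next byte to yield from buf
        while pos <> 0 do
            if current = 0 then
                pos <- fp.Read(buf, 0, 1024)
            if current < pos then
                yield buf.[current]
                current <- (current + 1) % 1024
            else
                // A short read has been fully consumed: refill on the next pass.
                current <- 0
    }

let content = Read "some path"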

We clearly reuse the same buffer to improve performance. But suppose we read byte 1025, which triggers a refill of the buffer. If we then read any byte at a position < 1025, we still get the correct output. How can that be, and what is the difference?

Your question is a bit unclear, so I'll try to guess.

When you create a seq { }, you're essentially creating a state machine which will run only as far as it needs to. When you request the very first element from it, it'll start at the top and run until your first yield instruction. Then, when you request another value, it'll run from that point until the next yield, and so on.
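To make the "runs only as far as it needs to" point concrete, here is a minimal sketch. The `log` list and the `numbers` sequence are names made up for this illustration; they are not part of the question's code.

```fsharp
open System.Collections.Generic

// Hypothetical log used only to observe which parts of the body have run.
let log = List<string>()

let numbers =
    seq {
        log.Add "before first yield"
        yield 1
        log.Add "between yields"
        yield 2
        log.Add "after last yield"
    }

// Defining the sequence runs none of its body.
let countBeforeIterating = log.Count

// Seq.head asks for one element, so the body runs up to the first yield only.
let first = Seq.head numbers
let countAfterHead = log.Count
```

Here `countBeforeIterating` is 0 and `countAfterHead` is 1: only the code before the first `yield` has executed.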

Keep in mind that a seq { } produces an IEnumerable<'T>, which is like a "plan of execution". Each time you start to iterate the sequence (for example by calling Seq.head), a call to GetEnumerator is made behind the scenes, which causes a new IEnumerator<'T> to be created. It is the IEnumerator<'T> which does the actual providing of values. You can think of it in more classical terms as having an array over which you can iterate (an iterable or enumerable) and many pointers into that array, each of which is at a different point in the array (many iterators or enumerators).
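The "one enumerable, many enumerators" picture can be sketched by calling GetEnumerator by hand, which is roughly what functions like Seq.head do for you. The `letters` sequence is an assumption made up for this example.

```fsharp
let letters = seq { yield "a"; yield "b"; yield "c" }

// Two enumerators over the same sequence: each one is a separate cursor.
let e1 = letters.GetEnumerator()
let e2 = letters.GetEnumerator()

e1.MoveNext() |> ignore
e1.MoveNext() |> ignore   // e1 has advanced twice
e2.MoveNext() |> ignore   // e2 has advanced once

let atE1 = e1.Current     // "b"
let atE2 = e2.Current     // "a"

e1.Dispose()
e2.Dispose()
```

Advancing `e1` has no effect on `e2`, because each enumerator runs its own copy of the state machine.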

In your first code, file is most likely external to the seq block. This means that the file you are reading from is baked into the plan of execution; no matter how many times you start to iterate the sequence, you'll always be reading from the same file object, and every enumerator advances the same read position. This is obviously going to cause unpredictable behaviour.
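A small sketch of that failure mode, using a StringReader as a stand-in for the question's `file` (an assumption, since the question does not show how `file` is created):

```fsharp
open System.IO

// Hypothetical stand-in for `file`: created once, outside the seq block.
let file = new StringReader("abc")

let chars =
    seq {
        let mutable current = file.Read()
        while current <> -1 do
            yield char current
            current <- file.Read()
    }

// Each call starts a new enumerator, but all of them share the same reader.
let firstHead = Seq.head chars     // 'a'
let secondHead = Seq.head chars    // 'b', not 'a': the reader has moved on
let rest = Seq.toList chars        // ['c']
```

Asking for the head twice gives two different answers, because the read position lives in the shared reader rather than in the enumerator.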

However, in your second code, the file is opened as part of the seq block's definition. This means that you'll get a new file handle each time you iterate the sequence or, essentially, a new file handle per enumerator. The buffer and the position counters are created inside the block as well, so each enumerator also gets its own copies of those. The reason this code works is that you can't reverse an enumerator or iterate over it multiple times, not with a single thread at least. Reading an "earlier" byte therefore means starting a fresh enumerator, which reopens the file and refills the buffer from the beginning.
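For comparison, a sketch of the same reader with the resource created inside the block. Again a StringReader stands in for the file so the example needs no file on disk, and `readChars` is a hypothetical helper name.

```fsharp
open System.IO

let readChars (text: string) =
    seq {
        // Runs once per enumerator, and `use` disposes the reader when
        // that enumerator finishes or is disposed.
        use reader = new StringReader(text)
        let mutable current = reader.Read()
        while current <> -1 do
            yield char current
            current <- reader.Read()
    }

let chars = readChars "abc"

let firstPass = Seq.toList chars    // ['a'; 'b'; 'c']
let secondPass = Seq.toList chars   // ['a'; 'b'; 'c'] again
let head = Seq.head chars           // 'a'
```

Every iteration starts from the beginning, because every enumerator owns its own reader.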

(Now, if you were to manually get an enumerator and advance it from multiple threads, you'd probably run into problems very quickly. But that is a different topic.)
