Performance of seq<int> vs Lazy<LazyList<int>> in F#
There is a well-known solution for generating an infinite stream of Hamming numbers (i.e. all positive integers n where n = 2^i * 3^j * 5^k). I have implemented this in two different ways in F#. The first method uses seq<int>. The solution is elegant, but the performance is terrible. The second method uses a custom type where the tail is wrapped in Lazy<LazyList<int>>. The solution is clunky, but the performance is amazing.

Can someone explain why the performance using seq<int> is so bad, and whether there is a way to fix it? Thanks.

Method 1 using seq<int>.
// 2-way merge with deduplication
let rec (-|-) (xs: seq<int>) (ys: seq<int>) =
    let x = Seq.head xs
    let y = Seq.head ys
    let xstl = Seq.skip 1 xs
    let ystl = Seq.skip 1 ys
    if x < y then seq { yield x; yield! xstl -|- ys }
    elif x > y then seq { yield y; yield! xs -|- ystl }
    else seq { yield x; yield! xstl -|- ystl }

let rec hamming: seq<int> = seq {
    yield 1
    let xs = Seq.map ((*) 2) hamming
    let ys = Seq.map ((*) 3) hamming
    let zs = Seq.map ((*) 5) hamming
    yield! xs -|- ys -|- zs
}

[<EntryPoint>]
let main argv =
    Seq.iter (printf "%d, ") <| Seq.take 100 hamming
    0
Method 2 using Lazy<LazyList<int>>.
type LazyList<'a> = Cons of 'a * Lazy<LazyList<'a>>

// Map `f` over an infinite lazy list
let rec inf_map f (Cons(x, g)) = Cons(f x, lazy(inf_map f (g.Force())))

// 2-way merge with deduplication
let rec (-|-) (Cons(x, f) as xs) (Cons(y, g) as ys) =
    if x < y then Cons(x, lazy(f.Force() -|- ys))
    elif x > y then Cons(y, lazy(xs -|- g.Force()))
    else Cons(x, lazy(f.Force() -|- g.Force()))

let rec hamming =
    Cons(1, lazy(let xs = inf_map ((*) 2) hamming
                 let ys = inf_map ((*) 3) hamming
                 let zs = inf_map ((*) 5) hamming
                 xs -|- ys -|- zs))

[<EntryPoint>]
let main args =
    let a = ref hamming
    let i = ref 0
    while !i < 100 do
        match !a with
        | Cons (x, f) ->
            printf "%d, " x
            a := f.Force()
            i := !i + 1
    0
Ganesh is right in that you're evaluating the sequence multiple times. Seq.cache will help improve performance, but you get much better performance out of LazyList because the underlying sequence is only ever evaluated once and then cached, so it can be traversed much more rapidly. In fact, this is a good example of where LazyList should be used over a normal seq.
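To see the repeated evaluation in isolation, here is a small sketch that is not part of the original code. countEvaluations is a hypothetical helper that traverses a three-element seq twice and reports how many times the generator body ran, with and without Seq.cache.

```fsharp
// Hypothetical helper (not from the question): counts how many times the
// generator body runs when the same sequence is traversed twice.
let countEvaluations (wrap: seq<int> -> seq<int>) =
    let evaluations = ref 0
    let source =
        seq {
            for i in 1 .. 3 do
                evaluations.Value <- evaluations.Value + 1
                yield i
        }
        |> wrap
    Seq.sum source |> ignore   // first traversal
    Seq.sum source |> ignore   // second traversal
    evaluations.Value

// A plain seq re-runs its generator on every traversal.
let uncached = countEvaluations id
// Seq.cache remembers elements already produced, so the second
// traversal does not run the generator again.
let cached = countEvaluations Seq.cache

printfn "uncached: %d, cached: %d" uncached cached
// prints "uncached: 6, cached: 3"
```

In the recursive hamming definition the effect compounds, because each of the three mapped sequences restarts hamming from the beginning, and each of those restarts does the same again.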
It also looks like there is some significant overhead introduced by your use of Seq.map here. I believe the compiler is allocating a closure each time it's called there. I changed your seq-based code to use seq-expressions there instead, and it's about 1/3 faster than the original for the first 40 numbers in the sequence:
let rec hamming: seq<int> = seq {
    yield 1
    let xs = seq { for x in hamming do yield x * 2 }
    let ys = seq { for x in hamming do yield x * 3 }
    let zs = seq { for x in hamming do yield x * 5 }
    yield! xs -|- ys -|- zs
}
My ExtCore library includes a lazyList computation builder which works just like seq, so you can simplify your code like this:
open System.Diagnostics   // for Stopwatch

// 2-way merge with deduplication
let rec (-|-) (xs: LazyList<'T>) (ys: LazyList<'T>) =
    let x = LazyList.head xs
    let y = LazyList.head ys
    let xstl = LazyList.skip 1 xs
    let ystl = LazyList.skip 1 ys
    if x < y then lazyList { yield x; yield! xstl -|- ys }
    elif x > y then lazyList { yield y; yield! xs -|- ystl }
    else lazyList { yield x; yield! xstl -|- ystl }

let rec hamming : LazyList<uint64> = lazyList {
    yield 1UL
    let xs = LazyList.map ((*) 2UL) hamming
    let ys = LazyList.map ((*) 3UL) hamming
    let zs = LazyList.map ((*) 5UL) hamming
    yield! xs -|- ys -|- zs
}

[<EntryPoint>]
let main argv =
    let watch = Stopwatch.StartNew ()
    hamming
    |> LazyList.take 2000
    |> LazyList.iter (printf "%d, ")
    watch.Stop ()
    printfn ""
    printfn "Elapsed time: %.4fms" watch.Elapsed.TotalMilliseconds
    System.Console.ReadKey () |> ignore
    0 // Return an integer exit code
(NOTE: I also made your (-|-) function generic, and modified hamming to use 64-bit unsigned ints because 32-bit signed ints overflow after a bit.) This code runs through the first 2000 elements of the sequence on my machine in ~450ms; the first 10000 elements take ~3500ms.
Your seq for hamming is re-evaluated from the beginning on each recursive call. Seq.cache is some help:
let rec hamming: seq<int> =
    seq {
        yield 1
        let xs = Seq.map ((*) 2) hamming
        let ys = Seq.map ((*) 3) hamming
        let zs = Seq.map ((*) 5) hamming
        yield! xs -|- ys -|- zs
    } |> Seq.cache
However, as you point out, the LazyList is still much better on large inputs, even if every single sequence is cached.

I'm not entirely certain why they differ by more than a small constant factor, but perhaps it's better to just focus on making the LazyList less ugly. Writing something to convert it to a seq makes processing it much nicer:
module LazyList =
    let rec toSeq l =
        match l with
        | Cons (x, xs) ->
            seq {
                yield x
                yield! toSeq xs.Value
            }
You can then use your simple main directly. It's also not really necessary to use mutation to process the LazyList; you could just do so recursively.
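The recursive traversal mentioned above could look something like this. It's only a sketch: takeList is a made-up helper name, and the type, inf_map, (-|-) and hamming are repeated from the question's Method 2 so that the snippet stands alone.

```fsharp
type LazyList<'a> = Cons of 'a * Lazy<LazyList<'a>>

// Map `f` over an infinite lazy list (as in the question)
let rec inf_map f (Cons(x, g)) = Cons(f x, lazy(inf_map f (g.Force())))

// 2-way merge with deduplication (as in the question)
let rec (-|-) (Cons(x, f) as xs) (Cons(y, g) as ys) =
    if x < y then Cons(x, lazy(f.Force() -|- ys))
    elif x > y then Cons(y, lazy(xs -|- g.Force()))
    else Cons(x, lazy(f.Force() -|- g.Force()))

let rec hamming =
    Cons(1, lazy(let xs = inf_map ((*) 2) hamming
                 let ys = inf_map ((*) 3) hamming
                 let zs = inf_map ((*) 5) hamming
                 xs -|- ys -|- zs))

// Hypothetical helper: collect the first n elements by plain recursion,
// with no ref cells and no while loop. The tail is only forced when
// another element is actually needed.
let rec takeList n (Cons(x, tail)) =
    if n <= 0 then []
    elif n = 1 then [x]
    else x :: takeList (n - 1) (tail.Force())

printfn "%A" (takeList 10 hamming)
// prints [1; 2; 3; 4; 5; 6; 8; 9; 10; 12]
```

takeList is not tail-recursive, so for very long prefixes an accumulator-based loop or the toSeq conversion would be the safer choice.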
The definition doesn't look so bad, though the lazy and Force() do clutter it up a bit. That looks marginally better if you use .Value instead of .Force(). You could also define a computation builder for LazyList similar to the seq one to recover the really nice syntax, though I'm not sure it's worth the effort.
Here is a sequence-based version with better performance.
let hamming =
    let rec loop nextHs =
        seq {
            let h = nextHs |> Set.minElement
            yield h
            yield! nextHs
                |> Set.remove h
                |> Set.add (h*2) |> Set.add (h*3) |> Set.add (h*5)
                |> loop
        }
    Set.empty<int> |> Set.add 1 |> loop