Benchmarking filter and partition

Question

I was testing the performance of the partition function for lists and got some results that look strange to me.

We have that partition p xs == (filter p xs, filter (not . p) xs), but partition is implemented separately because it performs only a single traversal of the list. Yet the results I got suggest that it may be better to use the implementation with two traversals.

Here is the minimal code that shows what I'm seeing:

import Criterion.Main
import System.Random
import Data.List (partition)

mypartition :: (a -> Bool) -> [a] -> ([a],[a])
mypartition p l = (filter p l, filter (not . p) l)



randList :: RandomGen g => g -> Integer -> [Integer]
randList gen 0 = []
randList gen n = x:xs
  where
    (x, gen') = random gen
    xs = randList gen' (n - 1)

main = do
  gen <- getStdGen
  let arg10000000 = randList gen 10000000
  defaultMain [
      bgroup "filters -- split list in half " [
        bench "partition100"         $ nf (partition (>= 50)) arg10000000
      , bench "mypartition100"       $ nf (mypartition (>= 50)) arg10000000
      ]
      ]

I ran the tests both with -O and without it, and both times the double traversal came out better.

I am using ghc-7.10.3 with criterion-1.1.1.0.

My questions are:

Is this expected?
Am I using Criterion correctly? I know that laziness can be tricky, and (filter p xs, filter (not . p) xs) will only do two traversals if both elements of the tuple are used.
Does this have something to do with the way lists are handled in Haskell?
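To illustrate the laziness point, here is a minimal sketch, assuming Debug.Trace is acceptable for a quick experiment. tracedEven is a hypothetical helper that is not part of the benchmark; it logs every predicate call to stderr, so each traversal shows up as one trace line per element.

```haskell
import Debug.Trace (trace)

-- Two-pass definition from the question.
mypartition :: (a -> Bool) -> [a] -> ([a], [a])
mypartition p l = (filter p l, filter (not . p) l)

-- Hypothetical helper: a predicate that logs each call to stderr.
tracedEven :: Int -> Bool
tracedEven x = trace ("testing " ++ show x) (even x)

main :: IO ()
main = do
  let (evens, odds) = mypartition tracedEven [1 .. 5]
  -- Only the first component is demanded here, so only one
  -- traversal runs (five trace lines on stderr).
  print evens   -- prints [2,4]
  -- Demanding the second component triggers the second traversal
  -- (five more trace lines).
  print odds    -- prints [1,3,5]
```

If the second print is removed, the second filter is never run at all, which is why a benchmark has to force both components (as nf does) to measure two traversals.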

Thanks a lot!

Answer 1

There is no black-and-white answer to this question. To dissect the problem, consider the following code:

import Control.DeepSeq
import Data.List (partition)
import System.Environment (getArgs)


mypartition :: (a -> Bool) -> [a] -> ([a],[a])
mypartition p l = (filter p l, filter (not . p) l)


main :: IO ()
main = do
  let cnt = 10000000
      xs = take cnt $ concat $ repeat [1 .. 100 :: Int]
  args <- getArgs
  putStrLn $ unwords $ "Args:" : args
  case args of
    [percent, fun]
      -> let p = (read percent >=)
         in case fun of
           "partition"      ->              print $ rnf $ partition   p xs
           "mypartition"    ->              print $ rnf $ mypartition p xs
           "partition-ds"   -> deepseq xs $ print $ rnf $ partition   p xs
           "mypartition-ds" -> deepseq xs $ print $ rnf $ mypartition p xs
           _ -> err
    _ -> err
  where
    err = putStrLn "Sorry, I do not understand."

I do not use Criterion here, in order to have better control over the order of evaluation. To get timings, I use the +RTS -s runtime option. The different test cases are executed using different command line options. The first command line option defines the percentage of the data for which the predicate holds. The second command line option chooses between the different tests.

The tests distinguish two cases:

The data is generated lazily (second argument partition or mypartition).
The data is already fully evaluated in memory (second argument partition-ds or mypartition-ds).

The result of the partitioning is always evaluated from left to right, i.e. starting with the list that contains all the elements for which the predicate holds.

In case 1, partition has the advantage that elements of the first resulting list can be discarded before all elements of the input list have even been produced. Case 1 works out especially well if the predicate matches many elements, i.e. if the first command line argument is large.

In case 2, partition cannot play out this advantage, since all elements are already in memory.
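For readers who want to stay with Criterion, as in the question, case 2 can be approximated with Criterion's env combinator, which evaluates the input to normal form before the benchmarks in its scope run. The following is only a sketch under assumptions: mkInput is a hypothetical helper that mirrors the cyclic test data above, and the input size and the threshold 50 are arbitrary choices.

```haskell
import Criterion.Main (bench, bgroup, defaultMain, env, nf)
import Data.List (partition)

-- Two-pass definition from the question.
mypartition :: (a -> Bool) -> [a] -> ([a], [a])
mypartition p l = (filter p l, filter (not . p) l)

-- Hypothetical input builder: n elements cycling through 1 .. 100.
mkInput :: Int -> [Int]
mkInput n = take n (cycle [1 .. 100])

main :: IO ()
main = defaultMain
  [ -- env forces the list to normal form up front, so building the
    -- input is not part of what the benchmarks measure.
    env (return (mkInput 1000000)) $ \xs ->
      bgroup "pre-evaluated input"
        [ bench "partition"   $ nf (partition (50 >=)) xs
        , bench "mypartition" $ nf (mypartition (50 >=)) xs
        ]
  ]
```

Because the input stays alive for both benchmarks, this setup corresponds to the partition-ds and mypartition-ds variants rather than to the lazy ones.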

For mypartition, all elements are held in memory in either case after the first resulting list is evaluated, because they are needed again to compute the second resulting list. Therefore there is not much of a difference between the two cases.

It seems that the more memory is used, the harder garbage collection gets. Therefore partition is well suited if the predicate matches many elements and the lazy variant is used.

Conversely, if the predicate does not match many elements or all elements are already in memory, mypartition performs better, since its recursion does not deal with pairs, in contrast to that of partition.
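As a rough sketch of what "dealing with pairs" means, here is a foldr-based partition modelled on the definition in base's Data.List. The names partitionSketch and select are illustrative, and the library's actual source may differ in details such as inlining pragmas.

```haskell
-- Sketch of a single-pass partition: every step of the fold builds
-- a new pair on top of the pair produced by the rest of the list.
partitionSketch :: (a -> Bool) -> [a] -> ([a], [a])
partitionSketch p = foldr select ([], [])
  where
    -- The lazy pattern ~(ts, fs) postpones matching on the pair from
    -- the rest of the fold, so output can be produced incrementally.
    select x ~(ts, fs)
      | p x       = (x : ts, fs)
      | otherwise = (ts, x : fs)

-- Two-pass definition: each filter only conses list cells.
mypartition :: (a -> Bool) -> [a] -> ([a], [a])
mypartition p l = (filter p l, filter (not . p) l)

main :: IO ()
main = do
  print (partitionSketch even [1 .. 10 :: Int])  -- ([2,4,6,8,10],[1,3,5,7,9])
  print (mypartition even [1 .. 10 :: Int])      -- ([2,4,6,8,10],[1,3,5,7,9])
  -- The lazy pattern keeps the fold productive on an infinite list.
  print (take 3 (fst (partitionSketch even [1 :: Int ..])))  -- [2,4,6]
```

The extra pair and the deferred selectors for each element are work that the two filter passes do not have, which is one plausible reason why mypartition wins once the input is already in memory.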

The Stack Overflow question "Irrefutable pattern does not leak memory in recursion, but why?" might give some more insight into the handling of pairs in the recursion of partition.

Answer 1 (5, accepted, 2016-08-04 17:09:04)
