
Graphing Criterion benchmarks taking different orders of magnitude of time

I have a Criterion benchmark where each bgroup corresponds to a test, and within each bgroup there are two bench values for the test with different options. For example:

main = defaultMain
    [bgroup "test1" [bench "v1" test1_1, bench "v2" test1_2]
    ,bgroup "test2" [bench "v1" test2_1, bench "v2" test2_2]
    -- lots more tests
    ]

Within each bgroup the two bench tests are comparable. However, test1 takes 2000 microseconds, while test2 takes 45 microseconds. The overview graph (which is the most useful one for what I want to do) displays all the tests on the same axes, so I can clearly see the differences in test1, but test2 is hard to see.

Is it possible to normalise each bgroup for plotting? Or show them on separate axes? Or should I dump the CSV data and plot what I want myself?
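If the CSV route is taken (criterion supports a `--csv FILE` option whose rows include a benchmark name like `test1/v1` and a mean time), per-bgroup normalisation can be sketched as below. This is only an illustration, not part of criterion's API; the parsing of the CSV into `(name, mean)` pairs is assumed to have happened already, and the sample numbers are the ones from the question.

```haskell
import Data.List (groupBy, sortOn)
import Data.Function (on)

-- A (benchmark name, mean time) pair, as parsed from criterion's --csv output.
type Row = (String, Double)

-- The bgroup prefix of a benchmark name, i.e. the part before the '/'.
groupName :: Row -> String
groupName = takeWhile (/= '/') . fst

-- Divide each mean by the minimum within its bgroup, so the fastest variant
-- in every group is 1.0 and groups of very different magnitudes share axes.
normalise :: [Row] -> [Row]
normalise rows =
  concatMap scale (groupBy ((==) `on` groupName) (sortOn groupName rows))
  where
    scale grp =
      let m = minimum (map snd grp)
      in [(name, t / m) | (name, t) <- grp]

main :: IO ()
main = mapM_ print (normalise
  [ ("test1/v1", 2000e-6), ("test1/v2", 2400e-6)
  , ("test2/v1",   45e-6), ("test2/v2",   90e-6) ])
```

The normalised values can then be fed to whatever plotting tool you prefer; the units become "slowdown relative to the fastest variant in the group" rather than absolute time.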

This issue is definitely among the shortcomings of Criterion. I've been bitten by the same problem multiple times.

The standard approach I take to work around this is to generate an individual executable for each comparison unit. A special benchmark target has been added in recent versions of Cabal, so I declare a benchmark target for each comparison unit in the .cabal file. Then I can run each comparison using cabal bench [target-name]. It is all far from comfortable, but it's the best I could come up with.
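As an illustration of that layout (the target and file names here are invented, not from the answer), the .cabal stanzas might look like:

```cabal
-- One benchmark stanza per comparison unit; each builds its own executable.
benchmark test1-bench
  type:           exitcode-stdio-1.0
  hs-source-dirs: bench
  main-is:        Test1.hs
  build-depends:  base, criterion

benchmark test2-bench
  type:           exitcode-stdio-1.0
  hs-source-dirs: bench
  main-is:        Test2.hs
  build-depends:  base, criterion
```

Each one then runs in isolation via `cabal bench test1-bench`, producing its own report with its own axes.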

I have just released a library, criterion-plus. It is a layer on top of "criterion", which addresses the issue you're experiencing, among others. It allows you to declare multiple "standoffs", which generate independent "criterion" report files. Another important issue it fixes is the ability to exclude "setup/teardown" phases from benchmarking, which "criterion" does not let you do.

Here is an example of how this library is supposed to be used:

import CriterionPlus
import qualified SomeMySQLLib as MySQL
import qualified SomePostgreSQLLib as PostgreSQL

main = 
  benchmark $ do
    standoff "Inserting rows" $ do
      subject "MySQL" $ do
        -- Exclude the "setup" phase from measurement:
        pause
        connection <- liftIO $ MySQL.openConnection
        -- Measure what we want:
        continue
        liftIO $ MySQL.insertAThousandRows connection
        -- Exclude the "teardown" phase from measurement:
        pause
        liftIO $ MySQL.closeConnection connection
      subject "PostgreSQL" $ do
        -- This is how we can exclude the "setup" phase from monad transformers:
        pause
        PostgreSQL.runSession $ do
          lift $ continue
          PostgreSQL.insertAThousandRows
          -- Exclude "teardown":
          lift $ pause
    -- Each standoff generates an independent report file:
    standoff "Querying" $ do
      subject "MySQL" $ error "So on..."
      subject "PostgreSQL" $ error "So on..."
