Graphing criterion benchmarks taking different orders of magnitude of time
I have a Criterion benchmark where each bgroup corresponds to a test, and within each bgroup there are two bench values of the test with different options. For example:
main = defaultMain
  [ bgroup "test1" [bench "v1" test1_1, bench "v2" test1_2]
  , bgroup "test2" [bench "v1" test2_1, bench "v2" test2_2]
  -- lots more tests
  ]
Within each bgroup the two bench tests are comparable. However, test1 takes 2000 microseconds, while test2 takes 45 microseconds. The overview graph (which is most useful for what I want to do) displays different tests on the same axes, so I can clearly see the differences in test1, but test2 is hard to see.

Is it possible to normalise each bgroup for plotting? Or show them on separate axes? Or should I dump the CSV data and plot what I want myself?
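If you do go the CSV route, normalising each bgroup yourself is a small amount of code. A minimal sketch: the group/bench,mean row format assumed by the hypothetical parseRow below is for illustration only, not criterion's exact CSV schema, so adapt the parsing to the file you actually get.

```haskell
import Data.Function (on)
import Data.List (groupBy, sortOn)

-- (group, bench, mean time in seconds)
type Row = (String, String, Double)

-- Assumed input format: "group/bench,mean", e.g. "test1/v1,2.0e-3".
parseRow :: String -> Row
parseRow line =
  let (name, rest)  = break (== ',') line
      (grp, bench') = break (== '/') name
  in (grp, drop 1 bench', read (drop 1 rest))

-- Divide every mean by the fastest mean within its bgroup,
-- so each group's best bench is 1.0 and the rest are relative.
normalise :: [Row] -> [Row]
normalise rows = concatMap norm grouped
  where
    grp3 (g, _, _) = g
    grouped = groupBy ((==) `on` grp3) (sortOn grp3 rows)
    norm grp =
      let best = minimum [m | (_, _, m) <- grp]
      in [ (g, b, m / best) | (g, b, m) <- grp ]
```

The normalised numbers can then be fed to any plotting tool, with every group on a comparable 1.0-based scale.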
This issue is definitely amongst the shortcomings of Criterion; I've been bitten by the same problem multiple times.

The standard approach I take to work around this is to generate an individual executable per comparison unit. A special benchmark target has been added in the latest versions of Cabal, so I declare a benchmark target for each comparison unit in the .cabal file. Then I can run each comparison using cabal bench [target-name]. Yeah, it is all far from comfortable, but it's the best I could come up with.
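For reference, a benchmark stanza in the .cabal file looks roughly like this (the target name, file names, and directory are placeholders to adapt to your project):

```cabal
benchmark bench-test1
  type:           exitcode-stdio-1.0
  main-is:        BenchTest1.hs
  hs-source-dirs: benchmarks
  build-depends:  base, criterion
```

With one such stanza per comparison unit, cabal bench bench-test1 runs just that group, so each produces its own report on its own scale.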
I have just released a library, criterion-plus. It is a library built on top of "criterion", which addresses, amongst others, the issue you're experiencing. It allows you to declare multiple "standoffs", which generate independent "criterion" report files. Another important issue it fixes is the ability to exclude setup/teardown phases from benchmarking, which "criterion" does not let you do.

Here is an example of how this library is supposed to be used:
import CriterionPlus
import qualified SomeMySQLLib as MySQL
import qualified SomePostgreSQLLib as PostgreSQL

main =
  benchmark $ do
    standoff "Inserting rows" $ do
      subject "MySQL" $ do
        -- Exclude the "setup" phase from measurement:
        pause
        connection <- liftIO $ MySQL.openConnection
        -- Measure what we want:
        continue
        liftIO $ MySQL.insertAThousandRows connection
        -- Exclude the "teardown" phase from measurement:
        pause
        liftIO $ MySQL.closeConnection connection
      subject "PostgreSQL" $ do
        -- This is how we can exclude the "setup" phase from under monad transformers:
        pause
        PostgreSQL.runSession $ do
          lift $ continue
          PostgreSQL.insertAThousandRows
          -- Exclude "teardown":
          lift $ pause
    -- Each standoff generates an independent report file:
    standoff "Querying" $ do
      subject "MySQL" $ error "So on..."
      subject "PostgreSQL" $ error "So on..."