What is the easiest way to time a function call for testing purposes?

So I'm still kinda green in Rust, but coming from Python I find this scenario very confusing in general.

I like Python because it's very easy if you want to time a block of code or just a function call:

from timeit import timeit

print(timeit('a = "hee hee la le dah"; my_awesome_fn()', number=1_000, globals=globals()))

Then just call python script.py or, better yet, use the green "run" button in the IDE to run the script. But I'm having trouble finding a functional equivalent in Rust.

I know there is a concept in the Rust ecosystem called benchmarking, and some libs like criterion exist for this purpose. The problem is that I know nothing about advanced math and statistics (you can essentially treat me like a clueless moron), and I doubt I can benefit much from a framework or harness such as this.

So I'm simply curious how I can use tests in cargo to time a block of code in Rust, or better yet a function call.

For example, assume I have a similar function in Rust that I want to call multiple times and then check how the performance changes, etc.:

pub fn my_awesome_fn() {
    trace!("getting ready to do something cool...");
    std::thread::sleep(std::time::Duration::from_millis(500));
    info!("finished!");
}

How can I simply time this function, my_awesome_fn, in Rust? I guess I'm looking for an equivalent of Python's timeit or something similar. Ideally it should be straightforward to use and assume I don't know anything about what I'm doing. I'm curious whether there's an existing library or framework I can leverage for this purpose.

Disclaimer: I've never used timeit

A very quick solution is to write a function like:

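use std::time::Instant;

// Runs `f` once, prints the elapsed wall-clock time, and returns `f`'s result.
fn timeit<F: FnOnce() -> T, T>(f: F) -> T {
  // Instant is monotonic, so unlike SystemTime it cannot jump backwards mid-measurement.
  let start = Instant::now();
  let result = f();
  let duration = start.elapsed();
  // as_secs() would truncate to whole seconds; Debug formatting keeps sub-second precision.
  println!("it took {:?}", duration);
  result
}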

which you can use to "wrap" another function call:

fn main() {
  let x = timeit(|| my_expensive_function());
}
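
Since the question mentions timeit's number argument, here is a minimal sketch of the same wrapper idea extended to repeated calls, using only the standard library. The names time_n and the 1 ms stand-in body of my_awesome_fn are assumptions made for illustration, not anything from an existing crate:

```rust
use std::time::{Duration, Instant};

/// Calls `f` `number` times and returns the total elapsed wall-clock time,
/// loosely mirroring Python's `timeit(..., number=N)`.
/// `time_n` is a hypothetical helper name, not a standard-library function.
fn time_n<F: FnMut()>(number: u32, mut f: F) -> Duration {
    let start = Instant::now();
    for _ in 0..number {
        f();
    }
    start.elapsed()
}

// Stand-in for the function under test (hypothetical; sleeps briefly).
fn my_awesome_fn() {
    std::thread::sleep(Duration::from_millis(1));
}

fn main() {
    let number = 20;
    let total = time_n(number, my_awesome_fn);
    // Dividing a Duration by a u32 gives the average time per call.
    println!("total: {:?}, per call: {:?}", total, total / number);
}
```

Timings from a debug build are usually not representative, so it is worth running this with cargo run --release.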

However, if you're trying to understand the time a function takes for the purpose of performance optimizations, this approach is likely too crude.

The problem is that I know nothing about advanced math and statistics

That's arguably one of the main advantages of criterion: it "abstracts the maths away", in a sense.

It uses statistical approaches to give you a better insight into whether differences between benchmarking runs are a product of "randomness", or whether there is a meaningful difference between the code on each run.

To the end user, it essentially gives you a report saying either "a significant change was observed" or "no significant change was observed". It does far more than that, but to fully grasp its capabilities, it might be worth reading up on "hypothesis testing".
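
To give a rough feel for why repeated measurements matter, here is a small standard-library sketch that collects several timing samples and reports their mean and standard deviation. This is not criterion's actual algorithm; the helper names collect_samples and mean_and_std_dev and the summing workload are hypothetical and exist only for this illustration. The intuition is that if two versions of your code differ by less than the spread between samples, the difference may well be noise:

```rust
use std::time::Instant;

/// Times `f` once per sample and returns each sample in seconds.
fn collect_samples<F: FnMut()>(samples: usize, mut f: F) -> Vec<f64> {
    (0..samples)
        .map(|_| {
            let start = Instant::now();
            f();
            start.elapsed().as_secs_f64()
        })
        .collect()
}

/// Returns (mean, sample standard deviation) of `xs`.
/// Assumes `xs` holds at least two samples.
fn mean_and_std_dev(xs: &[f64]) -> (f64, f64) {
    let n = xs.len() as f64;
    let mean = xs.iter().sum::<f64>() / n;
    let var = xs.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / (n - 1.0);
    (mean, var.sqrt())
}

fn main() {
    let samples = collect_samples(30, || {
        // Hypothetical workload standing in for the code under test.
        let v: Vec<u64> = (0..10_000).collect();
        // black_box keeps the optimizer from discarding the unused sum.
        std::hint::black_box(v.iter().sum::<u64>());
    });
    let (mean, sd) = mean_and_std_dev(&samples);
    println!("mean = {:.3e} s, std dev = {:.3e} s", mean, sd);
}
```

criterion goes well beyond this (warm-up, outlier detection, comparison against a previous run), which is the part it "abstracts away".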

If you're OK using nightly Rust, you can also use #[bench] tests:

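#![feature(test)]
extern crate test;

use test::{black_box, Bencher};

#[bench]
fn bench_my_func(b: &mut Bencher) {
  // black_box keeps the optimizer from treating the argument as a known constant.
  b.iter(|| my_func(black_box(100)));
}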

which you can run with cargo bench. These are a bit easier to set up than criterion, but do less of the interesting stats (i.e. you'll have to do it yourself). Still, they're a very "quick and dirty" way to get a feel for the runtime of your code.

A word of warning: benchmarking code is hard. You may be surprised at what is actually going on under the hood, and you may find yourself benchmarking the wrong thing.

Common "gotchas" are:

  • rustc can generally identify "useless" code and simply skip calculating it. The black_box function can be used to hide the meaning of some data from the optimizer, though it is not without its own overhead.
  • In a similar vein, LLVM does some slightly spooky optimizations, relating to polynomials for example. You might find that your function call is being optimized away into a constant or simple arithmetic. In some cases, this is great: you've written your function in such a way that LLVM can reduce it to something trivial. In other cases, you're now just benchmarking the multiplication instruction on your CPU, which is unlikely to be what you want. Use your best judgement.
  • Benchmarking the wrong thing: some things are significantly more expensive than others, in ways that might seem odd to someone with a Python background. For example, cloning a String (even a very short one) might be 2-3 orders of magnitude slower than finding the first character. Consider the following:
fn str_len(s: String) -> usize {
  s.len()
}

#[bench]
fn bench_str_len(b: &mut Bencher) {
  let s = String::from("hello");  
  b.iter(|| str_len(s.clone()));
}

Because String::clone involves a heap allocation, while s.len() is just a field access, the clone will dominate the results. If str_len took a &str instead, the benchmark would become more representative (though this is a contrived case).
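
As a rough illustration that also runs on stable Rust, the sketch below times both shapes with a plain loop instead of #[bench]. The name str_len_owned, the iteration count, and the use of std::hint::black_box (stable since Rust 1.66) are choices made for this example rather than part of the benchmark above, and the numbers it prints will vary by machine and build profile:

```rust
use std::hint::black_box;
use std::time::Instant;

// Original shape: takes ownership, so callers must clone to reuse the string.
fn str_len_owned(s: String) -> usize {
    s.len()
}

// Borrowing variant: no allocation is needed at the call site.
fn str_len(s: &str) -> usize {
    s.len()
}

fn main() {
    let s = String::from("hello");
    let iterations = 1_000_000;

    let start = Instant::now();
    for _ in 0..iterations {
        // Each iteration pays for a heap allocation inside clone().
        black_box(str_len_owned(black_box(s.clone())));
    }
    println!("owned + clone: {:?}", start.elapsed());

    let start = Instant::now();
    for _ in 0..iterations {
        // Only a borrow is passed, so the loop measures little beyond the length read.
        black_box(str_len(black_box(s.as_str())));
    }
    println!("borrowed:      {:?}", start.elapsed());
}
```

Even here the borrowed loop may be reduced to almost nothing by the optimizer, which is itself a reminder to check what is really being measured.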

TL;DR: Be careful what your benchmark code is doing. The Rust Playground's "view assembly" tool and godbolt.org are your friends. You don't need to be an assembly expert, but they can help give you some idea of what's going on under the hood.
