
Why is the float * int multiplication faster than int * float in CPython?

Basically, the expression 0.4 * a is consistently, and surprisingly, significantly faster than a * 0.4, with a being an integer. And I have no idea why.

I speculated that it is a case of a LOAD_CONST LOAD_FAST bytecode pair being "more specialized" than LOAD_FAST LOAD_CONST, and I would be entirely satisfied with this explanation, except that this quirk seems to apply only to multiplications where the types of the operands differ. (By the way, I can no longer find the link to this "bytecode instruction pair popularity ranking" I once found on GitHub; does anyone have a link?)

Anyway, here are the micro benchmarks:

$ python3.10 -m pyperf timeit -s"a = 9" "a * 0.4"
Mean +- std dev: 34.2 ns +- 0.2 ns
$ python3.10 -m pyperf timeit -s"a = 9" "0.4 * a"
Mean +- std dev: 30.8 ns +- 0.1 ns
$ python3.10 -m pyperf timeit -s"a = 0.4" "a * 9"
Mean +- std dev: 30.3 ns +- 0.3 ns
$ python3.10 -m pyperf timeit -s"a = 0.4" "9 * a"
Mean +- std dev: 33.6 ns +- 0.3 ns

As you can see, the runs where the float comes first (the 2nd and 3rd) are faster.
So my question is: where does this behavior come from? I'm 90% sure that it is an implementation detail of CPython, but I'm not familiar enough with low-level instructions to say so for sure.

It's CPython's implementation of the BINARY_MULTIPLY opcode. It has no idea what the types are at compile time, so everything has to be figured out at run time. Regardless of what a and b may be, BINARY_MULTIPLY ends up invoking a.__mul__(b).
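To see that the compiler emits the same multiply instruction for both operand orders, the bytecode can be inspected with the standard dis module. This is a minimal sketch; the function names int_first and float_first are made up for illustration, and the opcode is named BINARY_MULTIPLY on Python 3.10 but BINARY_OP on 3.11 and later.

```python
import dis


def int_first(a):
    # Variable operand on the left, float constant on the right.
    return a * 0.4


def float_first(a):
    # Float constant on the left, variable operand on the right.
    return 0.4 * a


def opnames(func):
    """Return the opcode names of a function, in order."""
    return [ins.opname for ins in dis.get_instructions(func)]


if __name__ == "__main__":
    # Both listings contain the same instructions; only the order of
    # the two loads differs. No type information appears anywhere.
    print(opnames(int_first))
    print(opnames(float_first))
```

The two listings differ only in the order of the load instructions, so the timing gap has to come from what the multiply instruction does at run time.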

When a is of int type, int.__mul__(a, b) has no idea what to do unless b is also of int type. It returns NotImplemented (via the C macro Py_RETURN_NOTIMPLEMENTED). This is in longobject.c's CHECK_BINOP macro. The interpreter sees that, and effectively says "OK, a.__mul__ has no idea what to do, so let's give b.__rmul__ a shot at it". None of that is free; it all takes time.

float.__mul__(b, a) (same as float.__rmul__) does know what to do with an int (converts it to float first), so that succeeds.
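The two steps can be reproduced from Python by calling the special methods directly. This is only an illustration of the visible behavior, not of the C code path; the variable names are arbitrary.

```python
a = 9    # int operand
b = 0.4  # float operand

# int's multiply does not handle a float right-hand operand,
# so it signals that by returning the NotImplemented singleton.
first_try = int.__mul__(a, b)

# float's reflected multiply accepts the int and converts it.
second_try = float.__rmul__(b, a)

# float's regular multiply also accepts an int on the right.
direct = float.__mul__(b, a)

print(first_try is NotImplemented)
print(second_try == a * b, direct == b * a)
```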

But when a is of float type to begin with, we go to float.__mul__ first, and that's the end of it. No time burned figuring out that the int type doesn't know what to do.
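The fallback described above can be modeled in a few lines of Python. This is a deliberately simplified sketch: sketch_multiply is a hypothetical helper, it assumes both types define __mul__ and __rmul__, and it ignores details of the real C implementation (such as the rule that gives a right-hand subclass priority). It returns the number of methods tried so the extra step is visible.

```python
def sketch_multiply(left, right):
    """Simplified model of the multiply fallback.

    Returns (result, attempts), where attempts counts how many
    special methods had to be tried before one succeeded.
    """
    attempts = 1
    result = type(left).__mul__(left, right)
    if result is NotImplemented:
        # The left operand gave up; try the right operand's
        # reflected method instead.
        attempts += 1
        result = type(right).__rmul__(right, left)
    if result is NotImplemented:
        raise TypeError(
            f"unsupported operand types: "
            f"{type(left).__name__} and {type(right).__name__}"
        )
    return result, attempts


if __name__ == "__main__":
    print(sketch_multiply(9, 0.4)[1])  # int first: two attempts
    print(sketch_multiply(0.4, 9)[1])  # float first: one attempt
```

With the int on the left the model needs two attempts, with the float on the left only one, which matches the direction of the timing difference in the benchmarks.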

The actual code is quite a bit more involved than the above pretends, but that's the gist of it.
