Cost of string comparison vs. int vs. bool

Question

When defining functions with parameters, we sometimes have to make a tradeoff between readability and speed. Here are three examples comparing a str, an int, and a bool:

def f1(mode, x, y):          # the most "explicit" solution, best for readability, ... but uses a str comparison
    if mode == 'mode1_method_foo':
        return 0             # IRL, many lines here
    elif mode == 'mode2_method_bar':
        return x ** 12       # IRL, many lines here too

def f2(mode, x, y):          # with int comparison
    if mode == 1:  
        return 0
    elif mode == 2:
        return x ** 12

def f3(mode, x, y):          # with bool
    if mode:
        return 0
    else:
        return x ** 12

Surprisingly, the cost of this comparison seems non-negligible:

import time, random
start = time.time()
for i in range(1000*1000):
    x = random.random()
    y = random.random()
    #f1('mode2_method_bar', x, y)  # 0.760 sec
    #f2(2, x, y)                   # 0.700 sec
    f3(False, x, y)                # 0.644 sec
print(time.time() - start)

This is not a very clean way to measure the performance cost.

How can I better measure the cost of using a string parameter vs. an int or a bool?

Example: if my program does 10'000 such comparisons per second, how much time do I lose by using f1 instead of f3?

Context: using strings to name methods is common in various APIs / libraries, for example https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html : method='Nelder-Mead', etc. In the case of this SciPy function, there is no performance problem because the cost of the string comparison is many orders of magnitude smaller than the cost of what the function actually does; still, this issue might be interesting for other functions.

Answer 1

Generally, when measuring (and comparing) performance, it's very important to measure exactly the part of interest.

Let's assume we want to measure and compare the performance of 3 things (a, b, and c). Let's assign each a (theoretical) performance score:

a: 1
b: 4
c: 8

As in timing tests, we start from the premise that the lower the score, the better the performance.

In this case, the above scores (1, 4, 8, also called absolute scores) are relevant and meaningful to everyone.

But let's also calculate the relative scores: one way is to take the variant that performs worst (c) and express every score as a fraction of it:

a: 1 / 8 = 0.125
b: 4 / 8 = 0.500
c: 8 / 8 = 1.000

So far so good. The absolute and relative scores are visibly proportional.

But in the real world, measuring exactly one specific item is sometimes hard (almost impossible once factors from outside the program are taken into account). In those cases, the item and some overhead are measured together, and the final score is the sum (not necessarily the arithmetical addition) of the item and overhead scores. The goal here is to keep the overhead as insignificant as possible, to reduce its impact (weight) on the overall score.
For example, let's take a huge overhead (call it d) with the score:

d: 2

The above measurements (including the overhead):

a (a + d): 3
b (b + d): 6
c (c + d): 10

Things look a bit different. Now, the relative scores:

a: (1 + 2) / (8 + 2) = 0.300
b: (4 + 2) / (8 + 2) = 0.600
c: (8 + 2) / (8 + 2) = 1.000

As seen, a's relative score is more than double its previous value without overhead (0.300 / 0.125 = 2.4 times).
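The effect of a constant overhead on relative scores can be reproduced with a few lines of Python. This is only an illustrative sketch: the helper name relative_scores is made up for this example, and the scores are the theoretical ones from above (a = 1, b = 4, c = 8, overhead d = 2).

```python
def relative_scores(scores, overhead=0):
    """Express each score (plus a constant overhead) as a fraction of the worst one."""
    measured = {name: value + overhead for name, value in scores.items()}
    worst = max(measured.values())
    return {name: value / worst for name, value in measured.items()}


scores = {"a": 1, "b": 4, "c": 8}   # theoretical absolute scores
without_overhead = relative_scores(scores)
with_overhead = relative_scores(scores, overhead=2)   # d = 2

for name in sorted(scores):
    print("{0:s}: {1:.3f} -> {2:.3f}".format(name, without_overhead[name], with_overhead[name]))
```

The larger the overhead is compared to the items themselves, the closer all relative scores move towards 1.0, which hides the real differences.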

The same thing is happening in your case. The x ** 12 calculation is overhead. I'm not saying that the measurements are wrong, but they are not accurate either. What is certain is that if one item performs better than another without overhead, it will also perform better with overhead, but, as I said, the relative comparison between them will be distorted.

I modified your code and added 3 more functions, in which I simply got rid of the exponentiation.

code00.py:

#!/usr/bin/env python

import sys
import timeit
import random


def f0(mode, x, y):          # the most "explicit" solution, best for readability, ... but uses a str comparison
    if mode == 'mode1_method_foo':
        return 0             # IRL, many lines here
    elif mode == 'mode2_method_bar':
        return x ** 12       # IRL, many lines here too


def f1(mode, x, y):          # with int comparison
    if mode == 1:  
        return 0
    elif mode == 2:
        return x ** 12


def f2(mode, x, y):          # with bool
    if mode:
        return 0
    else:
        return x ** 12


# The modified versions
def f3(mode, x, y):          # the most "explicit" solution, best for readability, ... but uses a str comparison
    if mode == 'mode1_method_foo':
        return 0             # IRL, many lines here
    elif mode == 'mode2_method_bar':
        return 1       # IRL, many lines here too


def f4(mode, x, y):          # with int comparison
    if mode == 1:  
        return 0
    elif mode == 2:
        return 1


def f5(mode, x, y):          # with bool
    if mode:
        return 0
    else:
        return 1


x = random.random()
y = random.random()
args = ["mode2_method_bar", 2, False]


def main(*argv):
    results = {}
    funcs = [
        f0,
        f1,
        f2,
        f3,
        f4,
        f5,
    ]

    print("Testing functions")
    for i, func in enumerate(funcs):
        t = timeit.timeit(stmt="func(arg, x, y)", setup="from __main__ import {0:s} as func, x, y, args;arg=args[int(func.__name__[-1]) % 3]".format(func.__name__), number=10000000)
        results.setdefault(i // 3, []).append((func.__name__, t))
        print("  Done with {0:s}".format(func.__name__))
    print("\n  Functions absolute scores (seconds)")
    for k in results:
        for result in results[k]:
            print("    {0:s}: {1:.6f}".format(*result))
    print("\n  Functions relative scores (percents - compared to the variant that took longest)")
    for k, v in results.items():
        print("    Batch {0:d}".format(k))
        longest = max(v, key=lambda x: x[-1])[-1]
        for result in v:
            print("    {0:s}: {1:.6f}".format(result[0], result[1] / longest))


if __name__ == "__main__":
    print("Python {0:s} {1:d}bit on {2:s}\n".format(" ".join(item.strip() for item in sys.version.split("\n")), 64 if sys.maxsize > 0x100000000 else 32, sys.platform))
    main(*sys.argv[1:])
    print("\nDone.")

Output:

[cfati@CFATI-5510-0:e:\Work\Dev\StackOverflow\q061250859]> "e:\Work\Dev\VEnvs\py_pc064_03.07.06_test0\Scripts\python.exe" code00.py
Python 3.7.6 (tags/v3.7.6:43364a7ae0, Dec 19 2019, 00:42:30) [MSC v.1916 64 bit (AMD64)] 64bit on win32

Testing functions
  Done with f0
  Done with f1
  Done with f2
  Done with f3
  Done with f4
  Done with f5

  Functions absolute scores (seconds)
    f0: 3.833778
    f1: 3.591715
    f2: 3.083926
    f3: 1.671274
    f4: 1.467826
    f5: 1.118479

  Functions relative scores (percents - compared to the variant that took longest)
    Batch 0
    f0: 1.000000
    f1: 0.936861
    f2: 0.804409
    Batch 1
    f3: 1.000000
    f4: 0.878268
    f5: 0.669238

Done.

[cfati@CFATI-5510-0:e:\Work\Dev\StackOverflow\q061250859]> "e:\Work\Dev\VEnvs\py_pc032_02.07.17_test0\Scripts\python.exe" code00.py
Python 2.7.17 (v2.7.17:c2f86d86e6, Oct 19 2019, 20:49:36) [MSC v.1500 32 bit (Intel)] 32bit on win32

Testing functions
  Done with f0
  Done with f1
  Done with f2
  Done with f3
  Done with f4
  Done with f5

  Functions absolute scores (seconds)
    f0: 3.947840
    f1: 3.613213
    f2: 3.384385
    f3: 1.898074
    f4: 1.604591
    f5: 1.315465

  Functions relative scores (percents - compared to the variant that took longest)
    Batch 0
    f0: 1.000000
    f1: 0.915238
    f2: 0.857275
    Batch 1
    f3: 1.000000
    f4: 0.845379
    f5: 0.693053

Done.
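As a rough answer to the "10'000 comparisons per second" example from the question, the Batch 1 figures of the Python 3.7.6 run can be turned into a per-call estimate. This is a back-of-the-envelope sketch, not a precise measurement: it assumes the whole difference between f3 (str) and f5 (bool) comes from the comparison, and the variable names are made up for this example.

```python
CALLS = 10000000        # the `number` passed to timeit in code00.py
T_STR = 1.671274        # f3 (str), seconds, Python 3.7.6 run
T_BOOL = 1.118479       # f5 (bool), seconds, Python 3.7.6 run

extra_per_call = (T_STR - T_BOOL) / CALLS        # extra seconds per call, str vs. bool
calls_per_second = 10000                         # the rate given in the question
lost_per_second = extra_per_call * calls_per_second

print("str vs. bool: {0:.1f} ns per call".format(extra_per_call * 1e9))
print("at {0:d} calls/s: {1:.3f} ms lost per second ({2:.3f}% of the time)".format(
    calls_per_second, lost_per_second * 1e3, lost_per_second * 100))
```

On these numbers the difference is about 55 ns per call, so at 10'000 calls per second the string variant would cost well under a millisecond per second of run time.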

Notes:

I ran the program with both Python 2 and Python 3 (and, as seen, the latter has some speed improvements)
The key point here is the drop in relative scores from Batch 0 (the original functions) to Batch 1 (the original functions without x ** 12, meaning less overhead)
The results might vary between runs (due to external factors, e.g. the OS scheduling the process to a CPU), so to get an idea that's as close as possible to reality, multiple tests must be run
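One way to run multiple tests is timeit's repeat facility, keeping the fastest run, since slower runs are usually caused by interference from other processes. The sketch below is an assumption about how one might do it, not part of code00.py: f_str and best_time are hypothetical names, and the lambda wrapper adds a small call overhead of its own.

```python
import timeit


def f_str(mode):
    # Same shape as f3 from code00.py, without the unused x and y arguments
    if mode == "mode1_method_foo":
        return 0
    elif mode == "mode2_method_bar":
        return 1


def best_time(func, arg, repeat=5, number=100000):
    """Time `number` calls of func(arg), `repeat` times, and keep the fastest run."""
    timer = timeit.Timer(lambda: func(arg))
    return min(timer.repeat(repeat=repeat, number=number))


best = best_time(f_str, "mode2_method_bar")
print("best of 5 runs: {0:.6f} s for 100000 calls".format(best))
```

Comparing the minimum of several runs for each variant tends to give more stable figures than a single run.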

There's one massive overhead in your code: the fact that the code you want to measure is located in functions. Calling the function, passing the arguments, and returning the value are quite costly compared to the comparison per se, and I'd bet that by timing just the comparisons, the relative (and also absolute) scores would be quite different.
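Timing just the comparisons could look like the sketch below, which passes the bare expressions to timeit as statements so that no function call is involved. The statements and names here are assumptions chosen for illustration ("not mode" stands in for the truth test done by "if mode:"), and the results will depend on interpreter details such as string interning, so treat the numbers as indicative only.

```python
import timeit

NUMBER = 1000000   # executions per run

# (label, statement to time, setup executed once before timing)
cases = [
    ("str", "mode == 'mode2_method_bar'", "mode = 'mode2_method_bar'"),
    ("int", "mode == 2", "mode = 2"),
    ("bool", "not mode", "mode = False"),
]

timings = {}
for label, stmt, setup in cases:
    # Keep the fastest of 5 runs to reduce noise from other processes
    timings[label] = min(timeit.repeat(stmt=stmt, setup=setup, repeat=5, number=NUMBER))

slowest = max(timings.values())
for label, _, _ in cases:
    print("{0:4s}: {1:.6f} s (relative: {2:.3f})".format(label, timings[label], timings[label] / slowest))
```

Even here, the loop that timeit runs around the statement is a small overhead of its own, so the relative scores are still an approximation.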

Solution 1
0 2020-04-16 18:07:57
