Python, faster to use class property, or variable in a function?
In Python, theoretically, which method should be faster, test1 or test2 (assuming the same value of x)? I have tried using %timeit but see very little difference.
import numpy as np

class Tester():
    def __init__(self):
        self.x = np.arange(100000)

    def test1(self):
        return np.sum(self.x * self.x)

    def test2(self, x):
        return np.sum(x * x)
In any implementation of Python, the time will be overwhelmingly dominated by the multiplication of two vectors with 100,000 elements each. Everything else is noise compared to that. Make the vector much smaller if you're really interested in measuring other overheads.
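To make that concrete, here is a minimal sketch of such a measurement. It is not the asker's benchmark: it uses the standard-library timeit module instead of %timeit, gives Tester a size parameter n, and the helper name time_methods is hypothetical. Absolute numbers will differ from machine to machine.

```python
import timeit

import numpy as np


class Tester:
    def __init__(self, n):
        # Assumption: the array size is a parameter so it can be shrunk.
        self.x = np.arange(n)

    def test1(self):
        return np.sum(self.x * self.x)

    def test2(self, x):
        return np.sum(x * x)


def time_methods(n, number=1000):
    """Return best-of-3 seconds per call for test1 and test2 (hypothetical helper)."""
    t = Tester(n)
    best1 = min(timeit.repeat(t.test1, number=number, repeat=3)) / number
    # The lambda and the t.x lookup add a little overhead of their own here.
    best2 = min(timeit.repeat(lambda: t.test2(t.x), number=number, repeat=3)) / number
    return best1, best2


if __name__ == "__main__":
    for n in (10, 100_000):
        t1, t2 = time_methods(n)
        print(f"n={n:>7}: test1 {t1 * 1e6:.2f} µs, test2 {t2 * 1e6:.2f} µs")
```

With the small array, the per-call overhead is a much larger share of the total, so any difference between the two methods has a better chance of showing up.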
In CPython, test2() will most likely be a little faster. It has an "extra" argument, but arguments are unpacked "at C speed" so that doesn't matter much. Arguments are accessed the same way as local variables, via the LOAD_FAST opcode, which is a simple array[index] access.
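One way to check this on your own interpreter is the standard-library dis module. The sketch below is only an illustration: the function square is hypothetical, and the exact opcode names depend on the CPython version (newer releases may show variants whose names also begin with LOAD_FAST).

```python
import dis


def square(x):
    # x is an argument; y is an ordinary local variable.
    y = x
    return x * y


# Collect the opcode names the compiler emitted for square().
opnames = [ins.opname for ins in dis.get_instructions(square)]

# Both the argument and the local are read through the LOAD_FAST family.
fast_loads = [name for name in opnames if name.startswith("LOAD_FAST")]

if __name__ == "__main__":
    dis.dis(square)
    print(fast_loads)
```

No name-based or attribute-based load appears in the listing, because both x and y live in the frame's array of fast locals.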
In test1(), each occurrence of self.x causes the string "x" to be looked up in the dictionary self.__dict__. That's slower than an indexed array access. But compared to the time taken by the long-winded multiplication, it's basically nothing.
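A small sketch of what that means, assuming CPython. The class mirrors the one in the question, but a plain int stands in for the NumPy array so the example needs only the standard library. Recent CPython versions also cache attribute lookups, so the real cost may be smaller than this description suggests.

```python
import dis


class Tester:
    def __init__(self):
        self.x = 3  # a plain int stands in for the NumPy array

    def test1(self):
        return self.x * self.x


t = Tester()

# The instance attribute is stored in the per-instance dictionary under "x".
stored_in_dict = "x" in t.__dict__ and t.__dict__["x"] is t.x

# Each self.x in the source compiles to its own LOAD_ATTR instruction.
attr_loads = [
    ins for ins in dis.get_instructions(Tester.test1) if ins.opname == "LOAD_ATTR"
]

if __name__ == "__main__":
    print(stored_in_dict)   # True
    print(len(attr_loads))  # 2, one per self.x
```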
I know this sort of misses the point of the question, but since you tagged the question with numpy and are looking at speed differences for a large array, I thought I would mention that a faster solution would be something else entirely.
What you're doing is a dot product, so use numpy.dot, which does the multiplying and summing together in one call (for floating-point arrays, typically through an external BLAS library). For convenience I'll use the syntax of test1, despite @Tim's answer, because no extra argument needs to be passed.
def test3(self):
    return np.dot(self.x, self.x)
or possibly even faster (and certainly more general):
def test4(self):
    return np.einsum('i,i->', self.x, self.x)
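As a quick sanity check that these really compute the same thing, here is a small sketch on a five-element array. The variable names are only for illustration.

```python
import numpy as np

x = np.arange(5)  # [0, 1, 2, 3, 4]

# Builds a temporary array of products, then reduces it.
sum_of_products = np.sum(x * x)

# Multiplies and accumulates in a single call.
dot_product = np.dot(x, x)

# The same contraction written in index notation: sum over i of x[i] * x[i].
einsum_product = np.einsum('i,i->', x, x)

if __name__ == "__main__":
    # 0 + 1 + 4 + 9 + 16 = 30 for all three
    print(sum_of_products, dot_product, einsum_product)
```

The first form allocates an intermediate array the size of x, which the other two avoid.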
Here are some tests:
In [363]: paste
class Tester():
    def __init__(self, n):
        self.x = np.arange(n)
    def test1(self):
        return np.sum(self.x * self.x)
    def test2(self, x):
        return np.sum(x*x)
    def test3(self):
        return np.dot(self.x, self.x)
    def test4(self):
        return np.einsum('i,i->', self.x, self.x)
## -- End pasted text --
In [364]: t = Tester(10000)
In [365]: np.allclose(t.test1(), [t.test2(t.x), t.test3(), t.test4()])
Out[365]: True
In [366]: timeit t.test1()
10000 loops, best of 3: 37.4 µs per loop
In [367]: timeit t.test2(t.x)
10000 loops, best of 3: 37.4 µs per loop
In [368]: timeit t.test3()
100000 loops, best of 3: 15.2 µs per loop
In [369]: timeit t.test4()
100000 loops, best of 3: 16.5 µs per loop
In [370]: t = Tester(10)
In [371]: timeit t.test1()
100000 loops, best of 3: 16.6 µs per loop
In [372]: timeit t.test2(t.x)
100000 loops, best of 3: 16.5 µs per loop
In [373]: timeit t.test3()
100000 loops, best of 3: 3.14 µs per loop
In [374]: timeit t.test4()
100000 loops, best of 3: 6.26 µs per loop
And speaking of small, almost syntactic, speed differences, think of using a method rather than a standalone function:
def test1b(self):
    return (self.x*self.x).sum()
gives:
In [385]: t = Tester(10000)
In [386]: timeit t.test1()
10000 loops, best of 3: 40.6 µs per loop
In [387]: timeit t.test1b()
10000 loops, best of 3: 37.3 µs per loop
In [388]: t = Tester(3)
In [389]: timeit t.test1()
100000 loops, best of 3: 16.6 µs per loop
In [390]: timeit t.test1b()
100000 loops, best of 3: 14.2 µs per loop
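To try the function-versus-method comparison outside IPython, something like the following sketch should work. It uses the standard-library timeit module, and the names with_function, with_method and best_time are hypothetical. Timings will vary by machine and NumPy version.

```python
import timeit

import numpy as np

x = np.arange(3)  # [0, 1, 2]


def with_function():
    # Standalone function: np.sum(...)
    return np.sum(x * x)


def with_method():
    # Array method: (...).sum()
    return (x * x).sum()


def best_time(func, number=10_000):
    """Best-of-3 seconds per call (hypothetical helper)."""
    return min(timeit.repeat(func, number=number, repeat=3)) / number


if __name__ == "__main__":
    print(f"np.sum(x*x):   {best_time(with_function) * 1e6:.2f} µs")
    print(f"(x*x).sum():   {best_time(with_method) * 1e6:.2f} µs")
```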