Python, faster to use class property, or variable in a function?

In Python, theoretically, which method should be faster, test1 or test2 (assuming the same value of x)? I have tried using %timeit but see very little difference.

import numpy as np

class Tester():

    def __init__(self):
        self.x = np.arange(100000)

    def test1(self):
        return np.sum(self.x * self.x )

    def test2(self,x):
        return np.sum(x*x)

In any implementation of Python, the time will be overwhelmingly dominated by the multiplication of two vectors with 100,000 elements each. Everything else is noise compared to that. Make the vector much smaller if you're really interested in measuring other overheads.
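To isolate those other overheads, the workload can be shrunk to almost nothing. The sketch below takes that advice to the extreme and is only an illustration: it assumes a plain Python int in place of the NumPy array (so the multiplication is nearly free), drops np.sum, and uses the standard timeit module instead of %timeit. The names n, time_attr and time_arg are made up for this example, and the numbers it prints depend on the machine and the Python version.

```python
import timeit


class Tester:
    def __init__(self):
        # Assumption: a plain int stands in for the array, so the
        # multiplication itself costs almost nothing.
        self.x = 3

    def test1(self):
        # Reads the attribute twice.
        return self.x * self.x

    def test2(self, x):
        # Reads the argument (a local variable) twice.
        return x * x


t = Tester()
x = t.x
n = 100_000  # calls per timing run

# Take the best of several runs to reduce scheduling noise.
time_attr = min(timeit.repeat(lambda: t.test1(), number=n, repeat=5))
time_arg = min(timeit.repeat(lambda: t.test2(x), number=n, repeat=5))

print(f"test1 (self.x):     {time_attr / n * 1e9:.1f} ns per call")
print(f"test2 (argument x): {time_arg / n * 1e9:.1f} ns per call")
```

With a workload this small, any remaining difference comes from call and lookup overhead rather than from arithmetic, and it is typically a matter of nanoseconds.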

In CPython, test2() will most likely be a little faster. It has an "extra" argument, but arguments are unpacked "at C speed", so that doesn't matter much. Arguments are accessed the same way as local variables, via the LOAD_FAST opcode, which is a simple array[index] access.

In test1(), each occurrence of self.x causes the string "x" to be looked up in the dictionary self.__dict__. That's slower than an indexed array access. But compared to the time taken by the long-winded multiplication, it's basically nothing.

I know this sort of misses the point of the question, but since you tagged the question with numpy and are looking at speed differences for a large array, I thought I would mention that the faster solutions are something else entirely.

What you're computing is a dot product, so use numpy.dot, which does the multiplying and the summing together in compiled code, possibly through an external library (BLAS rather than LAPACK, as far as I know, and only for floating-point data). (For convenience I'll use the syntax of test1, despite @Tim's answer, because no extra argument needs to be passed.)

def test3(self):
    return np.dot(self.x, self.x)

or possibly even faster (and certainly more general):

def test4(self):
    return np.einsum('i,i->', self.x, self.x)

Here are some tests:

In [363]: paste
class Tester():
    def __init__(self, n):
        self.x = np.arange(n)
    def test1(self):
        return np.sum(self.x * self.x)
    def test2(self, x):
        return np.sum(x*x)
    def test3(self):
        return np.dot(self.x, self.x)
    def test4(self):
        return np.einsum('i,i->', self.x, self.x)
## -- End pasted text --

In [364]: t = Tester(10000)

In [365]: np.allclose(t.test1(), [t.test2(t.x), t.test3(), t.test4()])
Out[365]: True

In [366]: timeit t.test1()
10000 loops, best of 3: 37.4 µs per loop

In [367]: timeit t.test2(t.x)
10000 loops, best of 3: 37.4 µs per loop

In [368]: timeit t.test3()
100000 loops, best of 3: 15.2 µs per loop

In [369]: timeit t.test4()
100000 loops, best of 3: 16.5 µs per loop

In [370]: t = Tester(10)

In [371]: timeit t.test1()
100000 loops, best of 3: 16.6 µs per loop

In [372]: timeit t.test2(t.x)
100000 loops, best of 3: 16.5 µs per loop

In [373]: timeit t.test3()
100000 loops, best of 3: 3.14 µs per loop

In [374]: timeit t.test4()
100000 loops, best of 3: 6.26 µs per loop

And speaking of small, almost syntactic, speed differences, consider using the array's method rather than the standalone function:

def test1b(self):
    return (self.x*self.x).sum()

gives:

In [385]: t = Tester(10000)

In [386]: timeit t.test1()
10000 loops, best of 3: 40.6 µs per loop

In [387]: timeit t.test1b()
10000 loops, best of 3: 37.3 µs per loop

In [388]: t = Tester(3)

In [389]: timeit t.test1()
100000 loops, best of 3: 16.6 µs per loop

In [390]: timeit t.test1b()
100000 loops, best of 3: 14.2 µs per loop
