
Why is using a Cython list faster than using a Python list?

Here's my Python code:

from time import time

X = [[0] * 1000] * 100
start = time()
for x in xrange(100):
    for i in xrange(len(X)):
        for j in xrange(len(X[i])):
            X[i][j] += 1
print time() - start

My Cython code is the same:

from time import time

X = [[0] * 1000] * 100
start = time()
for x in xrange(100):
    for i in xrange(len(X)):
        for j in xrange(len(X[i])):
            X[i][j] += 1
print time() - start

Output:

  • Python cost: 2.86 sec
  • Cython cost: 0.41 sec

Any other faster way to do the same above in Python or Cython?

Update: Is there any way to create a 2D array X with indexing performance close to that of an int X[][] array in C/C++?

For now I am considering using the Python C API to do the job.

One more thing: a NumPy array doing the same thing is much slower (70 sec) than a list in both pure Python and Cython.

Python:

import numpy as np
from time import time

X = np.zeros((100, 1000), dtype=np.int32)
start = time()
for x in xrange(100):
    for i in xrange(len(X)):
        for j in xrange(len(X[i])):
            X[i][j] += 1

If I am doing a lot of element access on a numeric array, which approach is best?

To answer the question in your title: your Cython code beats your Python code because, even without cdef declarations, C code is generated for the for loops (along with lots of extra C code to handle the Python objects). To speed up your Cython code, use cdef to declare the integers i, j and x so they are no longer Python integers, e.g. cdef int i. You can also declare C-style arrays in Cython, which should improve performance further; see the sketch after this paragraph.
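As a rough illustration (my own sketch, not code from the answer; the function name bump_typed is mine), the typed version of your loop in a .pyx file might look like this:

# compile as a .pyx file, e.g. via cythonize
from time import time

def bump_typed():
    # typed loop counters: the for loops compile to plain C loops
    cdef int x, i, j
    # C-style 2D array, close to int X[100][1000] in C/C++
    cdef int X[100][1000]
    for i in range(100):
        for j in range(1000):
            X[i][j] = 0
    start = time()
    for x in range(100):
        for i in range(100):
            for j in range(1000):
                X[i][j] += 1
    print(time() - start)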

A quick way to get the same result with NumPy:

X = np.zeros((100, 1000), dtype=np.int32)
X += 10000

If you can help it, you should never use Python-level for loops with NumPy arrays: each element access creates a Python object, whereas whole-array operations run in C. NumPy arrays are completely different from lists in terms of memory usage, as the timing sketch below suggests.
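For a sense of the gap, here is a quick timing sketch (my own example; absolute numbers will vary by machine):

import numpy as np
from time import time

X = np.zeros((100, 1000), dtype=np.int32)

start = time()
X += 10000        # one vectorized operation, executed in C
print(time() - start)

# the triple Python-level loop above performs 100 * 100 * 1000
# individual element accesses, each paying Python object overhead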

Any other faster way to do the same above in Python or Cython?

The equivalent, faster code would be:

X = [[100 * 100] * 1000] * 100

In your code, you are creating a 1000-long list of zeroes, then creating a 100-long list of references to that single list. Iterating 100 times over that 100-long list therefore increments each position 100 * 100 = 10000 times. You can confirm that all the rows are the same object:

>>> len(set(map(id, X)))
1

If you would like to end up with a list of 100 lists:

>>> base = [100] * 1000
>>> X = [list(base) for _ in xrange(100)]
>>> len(set(map(id, X)))
100

Note that list(base) makes a shallow copy: references to objects inside the list are still copied, not the objects themselves.
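A quick illustration of that caveat (my own example):

>>> inner = [0]
>>> base = [inner] * 3      # three references to one inner list
>>> X = list(base)          # copies the outer list only
>>> X[0].append(1)          # mutates the shared inner list
>>> base[0]
[0, 1]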

ajcr's answer is probably the fastest and simplest one. You should first explicitly declare the data types of your variables in the Cython code. In addition, I would use prange instead of a plain range for the outer loop. That activates OpenMP multithreading, which might speed up your code further, but I really doubt this solution will beat the NumPy implementation. A rough sketch follows.
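A minimal sketch of that idea (my own code, assuming a .pyx file compiled with OpenMP enabled, e.g. with the -fopenmp compiler flag; the function name bump_parallel is mine):

# cython: boundscheck=False, wraparound=False
from cython.parallel import prange

def bump_parallel(int[:, :] X):
    cdef int x, i, j
    for x in range(100):
        # the row loop is split across OpenMP threads; the body must
        # not touch Python objects, hence nogil=True and typed indices
        for i in prange(X.shape[0], nogil=True):
            for j in range(X.shape[1]):
                X[i, j] += 1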
