Here's my Python code:
from time import time

X = [[0] * 1000] * 100
start = time()
for x in xrange(100):
    for i in xrange(len(X)):
        for j in xrange(len(X[i])):
            X[i][j] += 1
print time() - start
My Cython code is the same:
from time import time

X = [[0] * 1000] * 100
start = time()
for x in xrange(100):
    for i in xrange(len(X)):
        for j in xrange(len(X[i])):
            X[i][j] += 1
print time() - start
Output:
Any other faster way to do the same above in Python or Cython?
Update: Is there any way to create a 2D array X with indexing performance close to a C/C++ int X[][] array?
For now I am considering using the Python C API to do the job.
One more thing: a numpy array doing the same thing is much slower (70 s) than a list, in both pure Python and Cython.
Python:
import numpy as np
from time import time

X = np.zeros((100, 1000), dtype=np.int32)
start = time()
for x in xrange(100):
    for i in xrange(len(X)):
        for j in xrange(len(X[i])):
            X[i][j] += 1
print time() - start
If the code does a lot of element access on a numeric array, which approach is best?
To answer the question in your title: your Cython code beats your Python code because, despite the lack of cdef to declare variables, C code is being generated for the for loops (in addition to lots of extra C code to describe the Python objects). To speed up your Cython code, use cdef to declare the integers i, j and x so they are no longer Python integers, e.g. cdef int i. You can also declare C-type arrays in Cython, which should improve performance further.
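For illustration, a sketch of what the typed version might look like (untested here; the function name is made up, compilation via a setup script or cythonize is omitted, and the typed memoryview over a NumPy buffer stands in for a C-type array):

```cython
import numpy as np
cimport cython

@cython.boundscheck(False)
@cython.wraparound(False)
def increment(int reps):
    # Typed memoryview: indexing compiles to C array access.
    cdef int[:, :] X = np.zeros((100, 1000), dtype=np.int32)
    cdef int x, i, j
    for x in range(reps):
        for i in range(X.shape[0]):
            for j in range(X.shape[1]):
                X[i, j] += 1
    return np.asarray(X)
```

With i, j and x declared as cdef int, the loop counters never touch Python objects, which is where most of the remaining overhead lives.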
A quick way to get the same result with NumPy:
X = np.zeros((100, 1000), dtype=np.int32)
X += 10000
If you can help it, you should never use for loops with NumPy arrays; they are completely different from lists in terms of memory layout and per-element access cost.
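To see that the single vectorized statement really is equivalent to the triple loop, here is a small check (array sizes shrunk for brevity; variable names are illustrative):

```python
import numpy as np

rows, cols, reps = 10, 20, 100

# Triple-loop version: every X[i][j] access crosses the
# Python/NumPy boundary, which is why it is slow.
looped = np.zeros((rows, cols), dtype=np.int32)
for _ in range(reps):
    for i in range(rows):
        for j in range(cols):
            looped[i][j] += 1

# Vectorized version: one broadcast addition, executed in C.
vectorized = np.zeros((rows, cols), dtype=np.int32)
vectorized += reps

print(np.array_equal(looped, vectorized))  # True
```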
Any other faster way to do the same above in Python or Cython?
The equivalent, faster code would be:
X = [[100 * 100] * 1000] * 100
In your code, you are creating a 1000-long list of zeroes, then creating a 100-long list of references to that one list. Iterating 100 times over that 100-long list results in each position being incremented 100 * 100 = 10000 times.
>>> len(set(map(id, X)))
1
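The aliasing is easy to demonstrate on a small grid (sizes reduced here for readability):

```python
# Build a 3x4 grid the same way as in the question: one inner list,
# replicated by reference three times.
X = [[0] * 4] * 3

# All three rows are the same list object, so there is only one id.
print(len(set(map(id, X))))  # 1

# Incrementing "one" cell therefore shows up in every row.
X[0][0] += 1
print(X)  # [[1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
```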
If you would like to end up with a list of 100 distinct lists:
base = [100] * 1000
X = [list(base) for _ in xrange(100)]
>>> len(set(map(id, X)))
100
Note that list(base) makes a shallow copy: the references to the objects inside the list are copied, not the objects themselves.
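A small check (again with reduced sizes) that the list-comprehension version produces independent rows, unlike the replicated one:

```python
base = [0] * 4

# list(base) creates a fresh shallow copy for each row, so the
# rows are distinct list objects.
rows = [list(base) for _ in range(3)]

print(len(set(map(id, rows))))  # 3

# Mutating one row no longer leaks into the others.
rows[0][0] += 1
print(rows)  # [[1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
```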
ajcr's answer is probably the fastest and simplest one. You should first explicitly declare the datatypes of your variables in the Cython code. In addition, I would use prange instead of a plain range for the outer loop. This activates OpenMP multithreading, which might speed up your code further, but I really doubt this solution will beat the numpy implementation.