
NumPy List Comprehension Syntax

I'd like to be able to use list comprehension syntax to work with NumPy arrays easily.

For instance, I would like something like the below obviously wrong code to just reproduce the same array.

>>> X = np.random.randn(8,4)
>>> [[X[i,j] for i in X] for j in X[i]]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: arrays used as indices must be of integer (or boolean) type

What is the easy way to do this while avoiding range(len(X))?

First, you should not be using NumPy arrays as lists of lists.

Second, let's forget about NumPy; your listcomp doesn't make any sense in the first place, even for lists of lists.

In the inner comprehension, for i in X is going to iterate over the rows in X. Those rows aren't numbers, they're lists (or, in NumPy, 1D arrays), so X[i] makes no sense whatsoever. You may have wanted i[j] instead.
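To make the point about the loop variable concrete, here is a minimal sketch. The small array `X` below is a made-up example, not the one from the question; it only shows that iterating over a 2D array hands you rows rather than integer indices.

```python
import numpy as np

# Hypothetical 2x3 array, used only for illustration.
X = np.arange(6).reshape(2, 3)

# Iterating over a 2D array yields its rows as 1D arrays, not row numbers.
rows = [row for row in X]
first = rows[0]

# Since the loop variable is already the row, index into it directly.
second_cell = first[1]
```

In other words, the loop variable plays the role of `X[i]` already, which is why `i[j]` is the closest working reading of the original code.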

In the outer comprehension, for j in X[i] has the same problem, but it has an even bigger problem: there is no i value. The only loop over i is in the inner comprehension, which sits inside this one, so i is not yet defined at the point where X[i] is evaluated.

If you're confused by a comprehension, write it out as an explicit for statement, as explained in the tutorial section on List Comprehensions:

tmp = []
for j in X[i]:
    tmp.append([X[i,j] for i in X])

… which expands to:

tmp = []
for j in X[i]:
    tmp2 = []
    for i in X:
        tmp2.append(X[i,j])
    tmp.append(tmp2)

… which should make it obvious what's wrong here.


I think what you wanted was:

[[cell for cell in row] for row in X]

Again, turn it back into explicit for statements:

tmp = []
for row in X:
    tmp2 = []
    for cell in row:
        tmp2.append(cell)
    tmp.append(tmp2)

That's obviously right.

Or, if you really want to use indexing (but you don't):

[[X[i][j] for j in range(len(X[i]))] for i in range(len(X))]

So, back to NumPy. In NumPy terms, that last version is:

[[X[i,j] for j in range(X.shape[1])] for i in range(X.shape[0])]

… and if you want to go in column-major order instead of row-major, you can (unlike with a list of lists):

[[X[i,j] for i in range(X.shape[0])] for j in range(X.shape[1])]

… but that will of course transpose the array, which isn't what you wanted to do.

The one thing you can't do is mix up column-major and row-major order in the same expression, because you end up with nonsense.
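As a quick check of the two index orders, the sketch below builds both comprehensions on a small made-up array (the 3x4 `X` here is an assumption for illustration) and compares them against `X` and its transpose.

```python
import numpy as np

# Hypothetical 3x4 array, used only for illustration.
X = np.arange(12).reshape(3, 4)

# Row-major: outer loop over rows (i), inner loop over columns (j).
row_major = [[X[i, j] for j in range(X.shape[1])] for i in range(X.shape[0])]

# Column-major: outer loop over columns (j), inner loop over rows (i).
col_major = [[X[i, j] for i in range(X.shape[0])] for j in range(X.shape[1])]

same_as_X = np.array_equal(np.array(row_major), X)
same_as_XT = np.array_equal(np.array(col_major), X.T)
```

Both flags come out true: the row-major version reproduces the array, and the column-major version produces its transpose.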


Of course the right way to make a copy of an array is to use the copy method:

X.copy()

Just as the right way to transpose an array is:

X.T
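One difference between the two is worth keeping in mind, shown here as a small sketch on a made-up array: `copy()` returns independent data, while `.T` returns a view onto the same memory.

```python
import numpy as np

# Hypothetical 2x3 array, used only for illustration.
X = np.arange(6, dtype=float).reshape(2, 3)

# copy() allocates new memory, so edits to Y leave X alone.
Y = X.copy()
Y[0, 0] = 99.0

# .T is a view: writing to T[0, 1] writes to X[1, 0].
T = X.T
T[0, 1] = -1.0

copy_is_independent = not np.shares_memory(X, Y)
transpose_is_view = np.shares_memory(X, T)
```

If you need a transposed array that you can modify without touching the original, `X.T.copy()` combines the two.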

The easy way is to not do this. Use numpy's implicit vectorization instead. For example, if you have arrays A and B as follows:

A = numpy.array([[1, 3, 5],
                 [2, 4, 6],
                 [9, 8, 7]])
B = numpy.array([[5, 3, 5],
                 [3, 5, 3],
                 [5, 3, 5]])

then the following code using list comprehensions:

C = numpy.array([[A[i, j] * B[i, j] for j in range(A.shape[1])]
                 for i in range(A.shape[0])])

can be much more easily written as

C = A * B

It'll also run much faster. Generally, you will produce faster, clearer code if you don't use list comprehensions with numpy than if you do.
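To confirm that the two forms agree, here is a short sketch using the same A and B as above. It only checks that the results are equal; it does not measure speed.

```python
import numpy

A = numpy.array([[1, 3, 5],
                 [2, 4, 6],
                 [9, 8, 7]])
B = numpy.array([[5, 3, 5],
                 [3, 5, 3],
                 [5, 3, 5]])

# Element-by-element product built with nested list comprehensions.
C_listcomp = numpy.array([[A[i, j] * B[i, j] for j in range(A.shape[1])]
                          for i in range(A.shape[0])])

# The same product using NumPy's elementwise multiplication.
C_vectorized = A * B

results_match = numpy.array_equal(C_listcomp, C_vectorized)
```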

If you really want to use list comprehensions, standard Python list-comprehension-writing techniques apply. Iterate over the elements, not the indices:

C = numpy.array([[a*b for a, b in zip(a_row, b_row)]
                 for a_row, b_row in zip(A, B)])

Thus, your example code would become

numpy.array([[elem for elem in x_row] for x_row in X])

Another option (though not necessarily performant) is to rethink your problem as a map instead of a comprehension and write a ufunc:

http://docs.scipy.org/doc/numpy/reference/ufuncs.html

You can call functional-lite routines like:

http://docs.scipy.org/doc/numpy/reference/generated/numpy.apply_over_axes.html

http://docs.scipy.org/doc/numpy/reference/generated/numpy.vectorize.html

Etc.
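As a rough illustration of the map-style approach, here is a sketch using numpy.vectorize. The scalar function `clip_negative` and the sample array are made up for this example. Note that numpy.vectorize is mainly a convenience wrapper and still calls the Python function once per element, so a built-in ufunc such as numpy.maximum is usually the better choice when one exists.

```python
import numpy as np

# Hypothetical scalar function, written for one element at a time.
def clip_negative(x):
    return x if x > 0 else 0.0

# Wrap it so it can be applied across a whole array.
vclip = np.vectorize(clip_negative)

# Hypothetical sample data.
X = np.array([[1.5, -2.0],
              [-0.5, 3.0]])

result = vclip(X)

# The built-in ufunc gives the same answer without a Python-level loop.
builtin_result = np.maximum(X, 0.0)
```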

Do you mean the following?

>>> [[X[i,j] for j in range(X.shape[1])] for i in range(X.shape[0])]
[[0.62757350000000001, -0.64486080999999995, -0.18372566000000001, 0.78470704000000002],
 [1.78209799, -1.3364484599999999, -1.3851422200000001, -0.49668994],
 [-0.84148266000000005, 0.18864597999999999, -1.1135151299999999, -0.40225053999999999],
 [0.93852824999999995, 0.24652238000000001, 1.1481637499999999, -0.70346624999999996],
 [0.83842508000000004, 1.0058697599999999, -0.91267403000000002, 0.97991269000000003],
 [-1.4265273000000001, -0.73465904999999998, 0.66842849999999998, -0.21551155],
 [-1.1115614599999999, -1.0035033200000001, -0.11558254, -0.4339924],
 [1.8771354, -1.0189299199999999, -0.84754008000000003, -0.35387946999999997]]

Using numpy.ndarray.copy:

>>> X.copy()
array([[ 0.6275735 , -0.64486081, -0.18372566,  0.78470704],
       [ 1.78209799, -1.33644846, -1.38514222, -0.49668994],
       [-0.84148266,  0.18864598, -1.11351513, -0.40225054],
       [ 0.93852825,  0.24652238,  1.14816375, -0.70346625],
       [ 0.83842508,  1.00586976, -0.91267403,  0.97991269],
       [-1.4265273 , -0.73465905,  0.6684285 , -0.21551155],
       [-1.11156146, -1.00350332, -0.11558254, -0.4339924 ],
       [ 1.8771354 , -1.01892992, -0.84754008, -0.35387947]])
