How to slice a numpy array by a list of column indices

I have the following (4x8) numpy array:

In [5]: z
Out[5]: 
array([['1A34', 'RBP', 0.0, 1.0, 0.0, 0.0, 0.0, 0.0],
       ['1A9N', 'RBP', 0.0456267, 0.0539268, 0.331932, 0.0464031,
        4.41336e-06, 0.522107],
       ['1AQ3', 'RBP', 0.0444479, 0.201112, 0.268581, 0.0049757,
        1.28505e-12, 0.480883],
       ['1AQ4', 'RBP', 0.0177232, 0.363746, 0.308995, 0.00169861, 0.0,
        0.307837]], dtype=object)

In [6]: z.shape
Out[6]: (4, 8)

What I want to do is extract the 0th, 2nd and 4th columns of the above array, yielding a (4 x 3) array that looks like this:

    array([['1A34', 0.0,  0.0],
           ['1A9N', 0.0456267,  0.331932],
           ['1AQ3', 0.0444479, 0.268581],
           ['1AQ4', 0.0177232,  0.308995]])

What's the way to do it? Note that the above indices are just an example. In practice they can be very irregular, e.g. 0th, 3rd, 4th.

Use slicing:

>>> import numpy as np
>>> arr = np.array([['1A34', 'RBP', 0.0, 1.0, 0.0, 0.0, 0.0, 0.0],
       ['1A9N', 'RBP', 0.0456267, 0.0539268, 0.331932, 0.0464031,
        4.41336e-06, 0.522107],
       ['1AQ3', 'RBP', 0.0444479, 0.201112, 0.268581, 0.0049757,
        1.28505e-12, 0.480883],
       ['1AQ4', 'RBP', 0.0177232, 0.363746, 0.308995, 0.00169861, 0.0,
        0.307837]], dtype=object)
>>> arr[:,:5:2]
array([['1A34', 0.0, 0.0],
       ['1A9N', 0.0456267, 0.331932],
       ['1AQ3', 0.0444479, 0.268581],
       ['1AQ4', 0.0177232, 0.308995]], dtype=object)

If the column indices are irregular then you can do something like this:

>>> indices = [0, 3, 4]
>>> arr[:, indices]
array([['1A34', 1.0, 0.0],
       ['1A9N', 0.0539268, 0.331932],
       ['1AQ3', 0.201112, 0.268581],
       ['1AQ4', 0.363746, 0.308995]], dtype=object)

Note that there's a subtle but substantial difference between slicing (which is basic indexing) and using a sequence for indexing (also known as advanced indexing or fancy indexing). When using a slice such as arr[:, :5:2], no data is copied, and we get a view of the original array. This means that mutating the result of arr[:, :5:2] will affect arr itself. With fancy indexing, the result of arr[:, [0, 3, 4]] is guaranteed to be a copy: it takes up more memory, and mutating it will not affect arr.
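The view-versus-copy difference can be checked directly. The following is a minimal sketch on a small integer array (made up for illustration, not the array from the question), using np.shares_memory to test whether the result still points at the original data:

```python
import numpy as np

# Small illustrative array (hypothetical data, not the one from the question).
arr = np.arange(12).reshape(3, 4)

view = arr[:, :3:2]      # basic slicing: columns 0 and 2, no data copied
copy = arr[:, [0, 2]]    # advanced indexing: same columns, new array

print(np.shares_memory(arr, view))   # True
print(np.shares_memory(arr, copy))   # False

view[0, 0] = -1    # writes through to arr
copy[0, 1] = 99    # only changes the copy

print(arr[0])      # first row of arr now starts with -1; the 99 is absent
```

If a slice is needed but the original must stay untouched, calling .copy() on the slice gives an independent array.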

You can access the columns of a numpy array in the following way:

array[:,column_number]
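One thing to keep in mind with this form is that indexing with a single integer drops the column axis, so the result is 1-D. A short sketch on a made-up (3 x 4) array, showing two ways to keep the result 2-D:

```python
import numpy as np

# Illustrative array (hypothetical data).
a = np.arange(12).reshape(3, 4)

col_1d = a[:, 1]        # integer index: axis is dropped, shape (3,)
col_2d = a[:, [1]]      # one-element list: shape (3, 1), a copy
col_slice = a[:, 1:2]   # length-one slice: shape (3, 1), a view

print(col_1d.shape, col_2d.shape, col_slice.shape)
```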

To get the array of specific columns you can do as follows:

import numpy as np

z = np.array([['1A34', 'RBP', 0.0, 1.0, 0.0, 0.0, 0.0, 0.0],
   ['1A9N', 'RBP', 0.0456267, 0.0539268, 0.331932, 0.0464031,
    4.41336e-06, 0.522107],
   ['1AQ3', 'RBP', 0.0444479, 0.201112, 0.268581, 0.0049757,
    1.28505e-12, 0.480883],
   ['1AQ4', 'RBP', 0.0177232, 0.363746, 0.308995, 0.00169861, 0.0,
    0.307837]], dtype=object)  # your array here

# Each z[:, k] is one column; collecting them in a list stacks them as rows.
op_array = np.array([z[:, 0], z[:, 2], z[:, 4]])

The op_array will have the 0th, 2nd and 4th columns as rows.

So you need to transpose it to get the output array in the desired format. Note that transpose() returns the transposed array rather than changing op_array in place, so assign the result back:

op_array = op_array.transpose()

op_array will now look as below:

array([['1A34', 0.0, 0.0],
       ['1A9N', 0.0456267, 0.331932],
       ['1AQ3', 0.0444479, 0.268581],
       ['1AQ4', 0.0177232, 0.308995]], dtype=object)
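For comparison, stacking individual columns and transposing should give the same result as indexing with a list of column numbers in one step. A minimal sketch, using a trimmed-down (2 x 5) version of the array to keep it short (the variable names cols, stacked and fancy are just for illustration):

```python
import numpy as np

# First two rows and five columns of the array from the question.
z = np.array([['1A34', 'RBP', 0.0, 1.0, 0.0],
              ['1A9N', 'RBP', 0.0456267, 0.0539268, 0.331932]], dtype=object)

cols = [0, 2, 4]

# Approach 1: collect each column as a row, then transpose.
stacked = np.array([z[:, c] for c in cols], dtype=object).transpose()

# Approach 2: index with the list of columns directly.
fancy = z[:, cols]

print(stacked.shape, fancy.shape)
print((stacked == fancy).all())
```

The direct form z[:, cols] is shorter and avoids building the intermediate list of columns.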


 