
numpy returns 1d array and 2d array for same code

I am not sure what rules NumPy follows when deciding whether a 2d array operation returns its result as a 1d or a 2d array. Consider the following piece of code:

idx_cls_samples = sample_data[:, -1] == c
v_feature = sample_data[idx_cls_samples, f]

f_values = sample_data[[sample_data[:, -1] == c], f]

Note that the last line is simply the first two lines combined into one.

The result of the first two lines is a NumPy vector of the form array([1, 2, 3, ...]), while the result of the last line is array([[1, 2, 3, ...]]). I believed the result should have been array([[1], [2], [3], ...]) in both cases. How can I figure out beforehand which format NumPy will choose for the result?

"Note that the last line is simply the first two lines combined into one."

No it's not. You stuck an extra pair of brackets in there:

f_values = sample_data[[sample_data[:, -1] == c], f]
#                      ^                       ^

Take them out.
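A minimal sketch of the fix, assuming a small made-up sample_data whose last column is the class label; the values of c and f are hypothetical and chosen only for illustration. It shows the one-line form without the extra brackets next to the two-step form from the question.

```python
import numpy as np

# Hypothetical data: two feature columns plus a class label in the last column.
sample_data = np.array([[10, 20, 0],
                        [11, 21, 1],
                        [12, 22, 1],
                        [13, 23, 0]])
c, f = 1, 0  # assumed class label and feature column

# Two-step version from the question.
idx_cls_samples = sample_data[:, -1] == c
v_feature = sample_data[idx_cls_samples, f]

# One-line version with the extra pair of brackets removed.
f_values = sample_data[sample_data[:, -1] == c, f]

# Both are 1d and hold the same values.
print(v_feature.shape, f_values.shape)
print(np.array_equal(v_feature, f_values))
```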

As for the indexing rules, those are in the documentation. They're pretty long.

sample_data is 2d. sample_data[:,-1] is 1d, the last column. Indexing with a scalar removes a dimension.

The ... == c comparison produces a boolean array of the same shape (1d).

sample_data[:, f] is also 1d, the fth column.

Indexing that with a boolean array returns a result with the same number of dimensions as the boolean array, but containing only the values where it is True.
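The steps above can be followed one at a time by printing shapes. This is only a sketch on a made-up sample_data; the array contents, c, and f are assumptions for illustration.

```python
import numpy as np

# Hypothetical data: last column is the class label.
sample_data = np.array([[10, 20, 0],
                        [11, 21, 1],
                        [12, 22, 1],
                        [13, 23, 0]])
c, f = 1, 0  # assumed class label and feature column

last_col = sample_data[:, -1]  # scalar index on axis 1 drops that axis
mask = last_col == c           # boolean array, same shape as last_col
col_f = sample_data[:, f]      # the f-th column, also 1d
subset = col_f[mask]           # still 1d, only the entries where mask is True

for name, arr in [("sample_data", sample_data), ("last_col", last_col),
                  ("mask", mask), ("col_f", col_f), ("subset", subset)]:
    print(name, arr.ndim, arr.shape)
```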

sample_data[idx, f] is 1d; sample_data[[idx], f] is 2d (due to the added []).
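The effect of the added brackets can be sketched with integer row indices standing in for the boolean mask. That substitution is an assumption made here so the example does not depend on how a particular NumPy version treats a boolean mask nested in a list; the data, c, and f are likewise hypothetical.

```python
import numpy as np

# Hypothetical data: last column is the class label.
sample_data = np.array([[10, 20, 0],
                        [11, 21, 1],
                        [12, 22, 1],
                        [13, 23, 0]])
c, f = 1, 0  # assumed class label and feature column

# Integer positions of the matching rows (stand-in for the boolean mask).
rows = np.flatnonzero(sample_data[:, -1] == c)

flat = sample_data[rows, f]      # index array is 1d, so the result is 1d
nested = sample_data[[rows], f]  # wrapping in a list makes the index 2d

print(flat.shape)    # one axis
print(nested.shape)  # an extra leading axis of length 1
```

The result takes the shape of the index array, so a 1 x k index yields a 1 x k result.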

You probably wanted sample_data[(sample_data[:, -1] == c), f], where the () just group the expression, sometimes for operator precedence, sometimes just to make it more readable. (But beware of (...,), which makes a tuple.)

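sample_data[idx, [f]] is still 1d, because the boolean mask and the list [f] are broadcast together as index arrays. To get the column 'vector' (2d with 1 column), use sample_data[idx, f][:, None] or sample_data[idx][:, [f]].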
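When a single-column 2d result is wanted, the shapes are worth checking rather than assuming. In the sketch below (hypothetical data, c, and f), pairing the mask with the list [f] comes out 1d on a recent NumPy, while the other three forms each give one column.

```python
import numpy as np

# Hypothetical data: last column is the class label.
sample_data = np.array([[10, 20, 0],
                        [11, 21, 1],
                        [12, 22, 1],
                        [13, 23, 0]])
c, f = 1, 0  # assumed class label and feature column
idx = sample_data[:, -1] == c

paired = sample_data[idx, [f]]              # mask and [f] broadcast together
col_a = sample_data[idx, f][:, None]        # add a trailing axis afterwards
col_b = sample_data[idx][:, [f]]            # select rows first, then the column as a list
col_c = sample_data[idx, f].reshape(-1, 1)  # explicit reshape to one column

for name, arr in [("paired", paired), ("col_a", col_a),
                  ("col_b", col_b), ("col_c", col_c)]:
    print(name, arr.shape)
```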

Another way to look at sample_data[idx,f] is: idx selects a subset of rows, f selects a column from that 2d.

Often 2d (or higher nd) indexing can be studied axis by axis; that's especially true when an index is a scalar or a slice. It's more complicated if an index is a list or array, or worse, if 2 or more of them are.
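A short sketch of that contrast on a made-up 4x3 array: scalars and slices act on their own axis, while two index lists are paired element by element. np.ix_ is shown as one way to get the row-by-column block instead.

```python
import numpy as np

a = np.arange(12).reshape(4, 3)  # hypothetical 4x3 array holding 0..11

# Scalars and slices can be read one axis at a time.
scalar_col = a[1:3, 0]   # slice keeps axis 0, scalar drops axis 1
slice_col = a[1:3, 0:1]  # both slices keep their axis

# Two index lists are broadcast together and paired up.
paired = a[[0, 2], [0, 1]]         # picks a[0, 0] and a[2, 1]
block = a[np.ix_([0, 2], [0, 1])]  # rows 0 and 2 crossed with columns 0 and 1

print(scalar_col.shape, slice_col.shape, paired.shape, block.shape)
```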
