Row vs. Column Vector Return during Numpy Array Slicing

I am currently learning Python and stumbled across a result that confused me a little bit when performing basic array slicing commands.

I created a 4x5 matrix using this command:

>>> import numpy
>>> a = numpy.arange(20).reshape(4,5)

Which gives:

[[ 0, 1, 2, 3, 4],
 [ 5, 6, 7, 8, 9],
 [10,11,12,13,14],
 [15,16,17,18,19]]

If I index the array like this:

>>> a[0:3, 2]

I get a row vector:

[2, 7, 12]

But if I index the array like this:

>>> a[0:3, 2:3]

I get a column vector:

[[ 2],
 [ 7],
 [12]]

When plugging the two commands in I expected the results to be the same, so why am I getting different types of vectors?

Thank you!

tl;dr version

In NumPy, taking a single index along a dimension of an array reduces the dimensionality by 1, so taking an index from a 2D array results in a 1D array (first case). Taking a slice along a dimension maintains the same dimensionality, even if the slice has a length of one, so a length-1 slice of a 2D array is still a 2D array (second case).

Detailed version

The issue is that the first result isn't a row vector; it is a 1D array. When you take a single scalar index along a dimension, it reduces the number of dimensions by 1. So taking a scalar index from a 4D array makes it a 3D array, taking one from a 3D array makes it a 2D array, from a 2D array a 1D array, and from a 1D array a scalar.
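As a minimal sketch of that chain, the snippet below counts dimensions after each extra scalar index. The array `x` and its shape are hypothetical, chosen only for illustration:

```python
import numpy as np

# A hypothetical 4D array, used only to count dimensions.
x = np.zeros((2, 3, 4, 5))

print(x.ndim)              # 4
print(x[0].ndim)           # 3
print(x[0, 0].ndim)        # 2
print(x[0, 0, 0].ndim)     # 1
print(x[0, 0, 0, 0].ndim)  # 0 (a NumPy scalar)
```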

This is for consistency's sake. If taking an item from a 1D array gives a scalar (reduces dimensionality by one), then by extension the equivalent operation on a higher-dimensional array should behave in the same way.

In the second case, you are taking a slice, not a scalar. When you do that, it keeps the number of dimensions. So taking a slice of a 2D array is always a 2D array, even if the slice is empty (or length 1 in your case). This is also for consistency. If a length-3 slice of a 2D array is a 2D array, and a length-2 slice of a 2D array is a 2D array, then a length-1 slice of a 2D array should also be a 2D array.
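A short sketch of this, reusing the 4x5 array from the question; the variable names `wide`, `single`, and `empty` are just illustrative:

```python
import numpy as np

a = np.arange(20).reshape(4, 5)

wide = a[0:3, 1:4]    # length-3 column slice
single = a[0:3, 2:3]  # length-1 column slice
empty = a[0:3, 2:2]   # empty column slice

# All three results are still 2D; only the size of the second axis changes.
print(wide.shape)    # (3, 3)
print(single.shape)  # (3, 1)
print(empty.shape)   # (3, 0)
```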

This is also a convenient convention, since it allows you to explicitly define in just a couple of characters whether you want to reduce dimensionality or not.
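For instance, writing `2` versus `2:3` is all it takes to choose between the two results. The sketch below also shows one way to convert between them afterwards, using `np.newaxis` and `ravel()`; those conversions go beyond what the answer describes and are only one possible approach:

```python
import numpy as np

a = np.arange(20).reshape(4, 5)

flat = a[0:3, 2]      # integer index: shape (3,)
column = a[0:3, 2:3]  # length-1 slice: shape (3, 1)

# Going from 1D to an Nx1 array by adding an axis.
as_column = flat[:, np.newaxis]

# Going from Nx1 back to 1D by flattening.
as_flat = column.ravel()

print(as_column.shape)  # (3, 1)
print(as_flat.shape)    # (3,)
```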

Some languages, like MATLAB, don't have the concept of a 1D array: every array (or, technically, matrix) has at least two dimensions, so even a scalar is a 1x1 matrix. NumPy, on the other hand, allows for true 1D arrays, which can trip up people who aren't used to them.

If you use two slices, you get two dimensions. Although what you got in your second example is sometimes called a "column vector", it is really an Nx1 array of two dimensions. This is different from what you got in the first example, which is not two-dimensional at all but is a 1D array.
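One way to see the difference in practice is to transpose both results. This is a small sketch using the array from the question; the names `one_d` and `n_by_1` are illustrative:

```python
import numpy as np

a = np.arange(20).reshape(4, 5)

one_d = a[0:3, 2]     # shape (3,)
n_by_1 = a[0:3, 2:3]  # shape (3, 1)

# Transposing a 1D array changes nothing: there is only one axis.
print(one_d.T.shape)   # (3,)

# Transposing an Nx1 array swaps its two axes and gives a 1xN array.
print(n_by_1.T.shape)  # (1, 3)
```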

The fact that you used a slice rather than a single value is what causes the extra dimension. Numpy doesn't look at the number of elements that the slice actually grabs; it just looks at whether you used a slice or not. It would be even more confusing if a[0:3, 2:3] returned a 1D vector, but a[0:3, 1:3] returned a 2D array.

Because they are different commands.

In the first case, you select the first three rows and, from each of them, the third element. This returns only the elements themselves.

In the second case, you select the first three rows and a slice of columns that contains only the third column. This returns the column itself, with each row being a vector of one element.
