
How do I maintain row/column orientation of vectors in numpy?

Coming from a background of Matlab/Octave, I have been trying to learn numpy. One thing that has been tripping me up over and over is the distinction between vectors and multi-dimensional arrays. For this question I'll give a specific problem I'm having, but I'd be much obliged if someone could also explain the more general picture behind single-dimensional arrays in numpy: why you would want them in the first place, how to avoid trouble when mixing single and multi-dimensional arrays, etc. Anyway, the question:

I have a 2-D array called X:

X = numpy.arange(10).reshape(2,5)

and I want to take the last column of X and store it as another 2-D array (i.e., a column vector) called Y. The only way I have been able to come up with for this is:

Y = numpy.atleast_2d(X[:,4]).T

but I don't like that for a couple of reasons:

  1. I don't feel like I should have to tell it to transpose the vector when the orientation should be implied in X[:,4].

  2. Using atleast_2d just seems so cumbersome to use over and over again in code where this situation would come up a lot. It feels like I'm doing something wrong.

So, in short, is there a better way?

Thanks.

First, the easy way to do what you want:

Y = X[:,4:]

Now, the reason numpy wasn't doing this when you were trying it before has to do with how arrays work in Python, and actually in most programming languages. When you write something like a[4], that's accessing the fifth element of the array, not giving you a view of some section of the original array. So for instance, if a is an array of numbers, then a[4] will be just a number. If a is a two-dimensional array, i.e. effectively an array of arrays, then a[4] would be a one-dimensional array. Basically, the operation of accessing an array element returns something with a dimensionality of one less than the original array.
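To make the "one less dimension" rule concrete, here is a minimal sketch. The names a, b, elem and row are only illustrative, and the 2-D array has just two rows, so it is indexed with 1 rather than 4:

```python
import numpy as np

a = np.arange(10)                # 1-D array, shape (10,)
b = np.arange(10).reshape(2, 5)  # 2-D array, shape (2, 5)

elem = a[4]  # one integer index on a 1-D array gives a scalar
row = b[1]   # one integer index on a 2-D array gives a 1-D array

print(np.ndim(elem))  # 0
print(row.ndim)       # 1
print(row)            # [5 6 7 8 9]
```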

Now, Python includes this thing called "slice notation," represented using the colon, which is a different way of accessing array elements. Instead of returning an element (something with a dimensionality of one less than the original array), it returns a section of the original array with the same number of dimensions. For a numpy array that section is a view onto the original data; for a plain Python list it is a copy. Essentially, a:b represents the list of all the elements at indices a (inclusive) to b (exclusive). Either a or b or both can be omitted, in which case the slice goes all the way to the corresponding end of the array.
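A short sketch of slice bounds, with illustrative variable names. It also checks one detail worth knowing: a basic slice of a numpy array shares memory with the original (it is a view), whereas slicing a plain Python list produces an independent copy.

```python
import numpy as np

a = np.arange(10)
s = a[2:5]    # indices 2, 3, 4 (the stop index is exclusive)
print(s)      # [2 3 4]
print(a[:3])  # [0 1 2]  (start omitted)
print(a[7:])  # [7 8 9]  (stop omitted)

# A basic numpy slice is a view: writing through s changes a.
s[0] = 99
print(a[2])                    # 99
print(np.shares_memory(a, s))  # True

# A Python list slice is a copy: the original list is unchanged.
lst = list(range(10))
t = lst[2:5]
t[0] = 99
print(lst[2])                  # 2
```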

What this means for your case is that when you write X[:,4], you have one slice notation and one regular index notation. The slice notation represents all indices along the first dimension (just 0 and 1, since the array has two rows), and the 4 represents the fifth element along the second dimension. Each instance of a regular index basically reduces the dimensionality of the returned object by one, so since X is a 2D array, and there is one regular index, you get a 1D result. A 1D array has no row or column orientation of its own; numpy just displays it horizontally, like a row vector. The trick, if you want to get out something with the same number of dimensions you started with, is then to use all slice indices, as I did in the example at the top of this post.
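Using the X from the question, the difference shows up directly in the shapes. This is a small sketch; col_1d and col_2d are illustrative names:

```python
import numpy as np

X = np.arange(10).reshape(2, 5)

col_1d = X[:, 4]   # slice + integer index: 1-D result
col_2d = X[:, 4:]  # slice + slice: stays 2-D

print(col_1d.shape)  # (2,)
print(col_2d.shape)  # (2, 1)
print(col_2d)
# [[4]
#  [9]]

# Same values and shape as the workaround from the question.
Y = np.atleast_2d(X[:, 4]).T
print(np.array_equal(Y, col_2d))  # True
```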

If you wanted to extract the fifth column of something that had more than 5 total columns, you could use X[:,4:5]. If you wanted a view of rows 3-4 and columns 5-7 (counting from zero), you would do X[3:5,5:8]. Hopefully you get the idea.
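The X in the question is too small for those indices, so this sketch assumes a hypothetical 8x10 array Z purely for illustration:

```python
import numpy as np

# Hypothetical larger array, not from the question: 8 rows, 10 columns.
Z = np.arange(80).reshape(8, 10)

fifth_col = Z[:, 4:5]  # only the fifth column, kept 2-D
print(fifth_col.shape) # (8, 1)

block = Z[3:5, 5:8]    # rows 3 and 4, columns 5, 6 and 7
print(block.shape)     # (2, 3)
print(block)
# [[35 36 37]
#  [45 46 47]]
```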

Subsetting

An even simpler way is to subset the matrix with a list of column indices.

>>> X
[[0 1 2 3 4]
 [5 6 7 8 9]]

>>> X[:, [4]]
[[4]
 [9]]

>>> X[:, 4]
[4 9]

It works somewhat similarly to a Pandas dataframe. If you index the dataframe, it gives you a Series. If you subset or slice the dataframe, it gives you a dataframe.
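The Pandas analogy can be checked with a small sketch. The dataframe df and its column labels are made up for illustration and are not part of the question:

```python
import numpy as np
import pandas as pd

# Hypothetical dataframe built from the same values as X.
df = pd.DataFrame(np.arange(10).reshape(2, 5), columns=list("abcde"))

s = df["e"]      # indexing with a single label gives a Series
sub = df[["e"]]  # subsetting with a list of labels gives a DataFrame

print(type(s).__name__)    # Series
print(type(sub).__name__)  # DataFrame
print(sub.shape)           # (2, 1)
```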

