
Extracting specific columns in numpy array by condition

I have a homework assignment to extract a 2-dimensional numpy array from another 2-dimensional numpy array by selecting specific columns by condition (not by range).

So I have an array A with shape (3, 50000). I am trying to get a new array with shape (3, x), for some x < 50000, containing the original columns of A whose third cell z satisfies -0.4 < z < 0.1.

For example, if:

A = [[1,2,3],[2,0.5,0],[9,-2,-0.2],[0,0,0.5]]

I wish to get back:

B = [[2,0.5,0],[9,-2,-0.2]]

I have tried to make a rank-1 boolean array that is True for the columns I want, and to somehow combine the two. The problem is that the output is a rank-1 array, which is not what I am looking for. I also got some ValueErrors.

bool_idx = (-0.4 < x_y_z[2] < 0.1)

This code raises an error:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

I can do it with some loops, but NumPy has so many beautiful functions that I am sure I am missing something here.

In Python, the expression -0.4 < x_y_z[2] < 0.1 is roughly equivalent to -0.4 < x_y_z[2] and x_y_z[2] < 0.1. The and operator decides the truth value of each part of the expression by converting it into a bool. Unlike Python lists and tuples, numpy arrays with more than one element do not support this conversion, which is why the ValueError is raised.
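To make the failure concrete, here is a minimal sketch that reproduces it. The small array is made up for illustration (it only stands in for the real x_y_z), and the names lower, upper and error_message are hypothetical helpers, not part of the original code.

```python
import numpy as np

# Hypothetical stand-in for x_y_z: shape (3, 4), third row holds the z values.
x_y_z = np.array([[1.0, 2.0, 9.0, 0.0],
                  [2.0, 0.5, -2.0, 0.0],
                  [3.0, 0.0, -0.2, 0.5]])

z = x_y_z[2]

# Each half of the chained comparison is an element-wise boolean array.
lower = -0.4 < z
upper = z < 0.1

# The chained form implicitly applies `and`, which needs a single bool
# from `lower`; numpy refuses to pick one for a multi-element array.
try:
    chained = -0.4 < z < 0.1
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

Each comparison on its own works fine and yields a boolean array; only the implicit and between the two halves fails.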

The correct way to specify the condition is with bitwise & (which is unambiguous and non-short-circuiting), rather than the implicit and (which short-circuits and is ambiguous in this case):

condition = ((x_y_z[2, :] > -0.4) & (x_y_z[2, :] < 0.1))

condition is a boolean mask that selects the columns you want. You can then select those columns by indexing with the mask along the second axis:

selection = x_y_z[:, condition] 
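Putting the pieces together, the following sketch runs the approach on the example data from the question. It assumes the four triples listed there are (x, y, z) columns, so they are transposed to get the (3, N) layout; the name points is introduced here only for illustration.

```python
import numpy as np

# The four (x, y, z) triples from the question, one per row here.
points = np.array([[1, 2, 3],
                   [2, 0.5, 0],
                   [9, -2, -0.2],
                   [0, 0, 0.5]])

# Transpose so that each triple becomes a column: shape (3, 4).
x_y_z = points.T

# Boolean mask over columns: True where the third cell lies in (-0.4, 0.1).
z = x_y_z[2, :]
condition = (z > -0.4) & (z < 0.1)

# Keep every row, but only the columns where the mask is True.
selection = x_y_z[:, condition]

print(selection.shape)  # (3, 2)
print(selection.T)      # the two matching triples
```

The parentheses around each comparison are required, because & binds more tightly than < and >. np.logical_and(z > -0.4, z < 0.1) builds the same mask if you prefer a function call.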


 