
Selecting rows of a dataframe based on two conditions in Pandas python

I have a df, and I want to run something like:

subsetdf= df.loc[(df['Item_Desc'].str.contains('X')==True) or \
                 (df['Item_Desc'].str.contains('Y')==True ),:]

that selects all rows whose Item_Desc column contains the substring "X" or "Y".

The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all(). 

I get the error above when I run that. Any help?

Use | instead of or. So:

df.loc[(cond1) | (cond2), :]

The or operator expects two boolean values (or two expressions that evaluate to True or False). But a Series (or numpy array) does not simply evaluate to True or False, and in this case we want to compare both Series element-wise. For this you can use |, which is called 'bitwise or'.
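To make the difference concrete, here is a minimal sketch. The sample rows are made up for illustration; only the Item_Desc column name comes from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column name Item_Desc is from the question.
df = pd.DataFrame({"Item_Desc": ["X widget", "Y gadget", "Z gizmo", "XY combo"]})

# Each str.contains call returns a boolean Series, one value per row.
has_x = df["Item_Desc"].str.contains("X")
has_y = df["Item_Desc"].str.contains("Y")

# `or` asks each Series for a single truth value, which raises ValueError.
try:
    has_x or has_y
    or_failed = False
except ValueError as err:
    or_failed = True
    print(err)

# `|` combines the two Series element-wise: a row is kept if either is True.
subsetdf = df.loc[has_x | has_y, :]
print(subsetdf)
```

Naming the two masks first is just a readability choice; writing both conditions inline inside df.loc[...] should behave the same way.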

Pandas follows the numpy conventions here. See the pandas docs for an explanation of it.

The condition should be as follows:

df.loc[(cond1) | (cond2)]

Each condition has to be enclosed in parentheses as well. The bitwise | operator binds more tightly than comparison operators such as ==, so the parentheses are needed to make each condition evaluate first. When a condition uses such a comparison and the parentheses are left out, it can give the same error.
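A small sketch of why the parentheses matter, assuming conditions built with ==. The column names a and b and their values are hypothetical, not from the question.

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"a": [1, 2, 3], "b": [2, 2, 5]})

# With parentheses: each comparison is evaluated first, then the two
# boolean Series are combined element-wise.
ok = df.loc[(df["a"] == 1) | (df["b"] == 2)]
print(ok)

# Without parentheses Python parses the expression as
#   df["a"] == (1 | df["b"]) == 2
# which is a chained comparison, so a Series gets asked for a single
# truth value and the "ambiguous" ValueError comes back.
try:
    df.loc[df["a"] == 1 | df["b"] == 2]
    raised = False
except ValueError as err:
    raised = True
    print(err)
```

Conditions that are plain method calls, such as str.contains(...), do not hit this precedence problem, but wrapping every condition in parentheses keeps the expression safe either way.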
