In a pandas dataframe, how do I check if two strings exist on the same row but in different columns?
I have been trying to figure out the simplest if statement that checks whether any row has the string "A" in rootID and the string "B" in parentID. I then want to remove that row.
In the following dataframe, that would mean removing row 0.
  rootID parentID jobID                       time
0      A        B     D 2019-01-30 14:33:21.339469
1      E        F     G 2019-01-30 14:33:21.812381
2      A        C     D 2019-01-30 15:33:21.812381
3      E        E     F 2019-01-30 15:33:21.812381
4      E        F     G 2019-01-30 16:33:21.812381
I know how to check whether one element exists, for example:
if df['rootID'].str.contains("A").any():
but how do I do it when I need to check for two different strings in two columns?
Use boolean indexing with masks chained by | (bitwise OR), and use ~ to invert a boolean mask. A row is kept when rootID is not "A" or parentID is not "B", so only rows where both strings match are dropped.
If you need to check for substrings:
m1 = ~df['rootID'].str.contains("A")
m2 = ~df['parentID'].str.contains("B")
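One thing to watch with the substring version: Series.str.contains treats its pattern as a regular expression by default and returns missing values for missing inputs, which can break the ~ inversion. The sketch below is only an illustration on made-up data (the values "A1", "B2", "xA" and the None entry are assumptions, not from the question); it passes regex=False and na=False so the pattern is matched literally and missing values count as "does not contain".

```python
import pandas as pd

# Hypothetical data with substrings and a missing value
df = pd.DataFrame({
    "rootID": ["A1", "E", None, "xA"],
    "parentID": ["B2", "F", "B", "C"],
})

# regex=False: match "A" / "B" literally; na=False: missing values give False
m1 = ~df["rootID"].str.contains("A", regex=False, na=False)
m2 = ~df["parentID"].str.contains("B", regex=False, na=False)

# Keep rows where at least one of the two substrings is absent
result = df[m1 | m2]
print(result)
```

Only row 0 has "A" in rootID and "B" in parentID at the same time, so it is the only row dropped here.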
If you need to compare whole strings, use Series.ne:
m1 = df['rootID'].ne("A")
m2 = df['parentID'].ne("B")
#alternatives
#m1 = df['rootID'] != "A"
#m2 = df['parentID'] != "B"
df = df[m1 | m2]
print(df)
  rootID parentID jobID                       time
1      E        F     G 2019-01-30 14:33:21.812381
2      A        C     D 2019-01-30 15:33:21.812381
3      E        E     F 2019-01-30 15:33:21.812381
4      E        F     G 2019-01-30 16:33:21.812381
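For the original "does such a row exist?" check, it may be easier to build the positive mask first and invert it afterwards. The following is a minimal, self-contained sketch that rebuilds the question's dataframe (the time column is left out for brevity, and the variable names both, has_match, kept and kept_alt are only illustrative). It also shows that inverting the combined mask gives the same result as chaining the two ne masks with |.

```python
import pandas as pd

# The question's data, without the time column
df = pd.DataFrame({
    "rootID": ["A", "E", "A", "E", "E"],
    "parentID": ["B", "F", "C", "E", "F"],
    "jobID": ["D", "G", "D", "F", "G"],
})

# True only where rootID is "A" AND parentID is "B" on the same row
both = df["rootID"].eq("A") & df["parentID"].eq("B")

# The existence check from the question
has_match = both.any()

# Drop the matching rows by inverting the mask
kept = df[~both]

# Equivalent form: "not (x and y)" is the same as "(not x) or (not y)"
kept_alt = df[df["rootID"].ne("A") | df["parentID"].ne("B")]

print(has_match)
print(kept)
```

The parentheses-free form works here because eq and ne are method calls; with the == and != operators each comparison needs its own parentheses, since & and | bind more tightly than comparisons in Python.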
Another solution:
df = df.query('rootID != "A" | parentID != "B"')
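If the two strings come from variables rather than literals, DataFrame.query can reference them with the @ prefix. This is a small sketch on the question's data (time column omitted; the variable names root and parent are assumptions made for illustration).

```python
import pandas as pd

df = pd.DataFrame({
    "rootID": ["A", "E", "A", "E", "E"],
    "parentID": ["B", "F", "C", "E", "F"],
    "jobID": ["D", "G", "D", "F", "G"],
})

# Hypothetical variables holding the strings to look for
root = "A"
parent = "B"

# @name refers to a Python variable in the calling scope
result = df.query("rootID != @root or parentID != @parent")
print(result)
```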