Filter DataFrame by regex and match condition
I have the following DataFrame:
Periodicity | Answer |
---|---|
M | Yes |
M | YeS |
Y | yeS |
M | No |
Y | NO |
M | nO |
I need to filter the DataFrame to get the rows that have a Monthly (M) periodicity and a positive (YES, YEs, Yes, YeS, and so on) answer.
I have tried to filter it with the following code:
import pandas as pd
import re
data = {'Periodicity': ['M', 'Y', 'M', 'M', 'M', 'Y', 'M', 'M'],
'Answer': ['YES', 'Yes', 'YEs', 'NO', 'no', 'No', 'yeS', 'yeS']}
df = pd.DataFrame(data)
pat=r'^[Yy].*'
df_filter=df[df.Answer.str.contains(pat)]
But I don't know how to add another condition to filter the DataFrame to match the desired Periodicity. Every time I add another filter condition, I get the following error message:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Try this:
df_filter=df[df.Answer.str.contains(pat) & df.Periodicity.str.contains('M')]
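The question does not show the failing line, but this ValueError commonly appears when two boolean Series are combined with Python's `and` instead of the element-wise `&`. A minimal sketch of that likely cause and the fix, reusing the question's data (the names `is_yes`, `is_monthly`, and `raised` are introduced here for illustration):

```python
import pandas as pd

data = {'Periodicity': ['M', 'Y', 'M', 'M', 'M', 'Y', 'M', 'M'],
        'Answer': ['YES', 'Yes', 'YEs', 'NO', 'no', 'No', 'yeS', 'yeS']}
df = pd.DataFrame(data)
pat = r'^[Yy].*'

# Build each condition as its own boolean mask.
is_yes = df['Answer'].str.contains(pat)
is_monthly = df['Periodicity'] == 'M'

# `and` asks for a single True/False from a whole Series, which pandas refuses.
try:
    df[is_yes and is_monthly]
    raised = False
except ValueError:
    raised = True

# `&` combines the masks element-wise.
df_filter = df[is_yes & is_monthly]
print(raised)
print(df_filter)
```

When the conditions are written inline rather than stored in variables, each comparison should be wrapped in parentheses, because `&` binds more tightly than `==`.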
You can do this without regex, by using the string `lower` method and DataFrame filtering:
import pandas as pd
data = {'Periodicity': ['M', 'Y', 'M', 'M', 'M', 'Y', 'M', 'M'],
'Answer': ['YES', 'Yes', 'YEs', 'NO', 'no', 'No', 'yeS', 'yeS']}
df = pd.DataFrame(data)
# Keep rows whose Answer is 'yes' in any casing and whose Periodicity is 'M'.
df = df[(df['Answer'].str.lower() == 'yes') & (df['Periodicity'] == 'M')]
print(df)
Output:
Periodicity Answer
0 M YES
2 M YEs
6 M yeS
7 M yeS
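If a regex is still preferred, a case-insensitive match is one alternative to lowercasing. The sketch below assumes the answer should be exactly "yes" in any casing, so the pattern is anchored at both ends (the question's `^[Yy].*` would also accept any other word starting with "y"). The names `is_yes`, `is_monthly`, and `result` are illustrative.

```python
import pandas as pd

data = {'Periodicity': ['M', 'Y', 'M', 'M', 'M', 'Y', 'M', 'M'],
        'Answer': ['YES', 'Yes', 'YEs', 'NO', 'no', 'No', 'yeS', 'yeS']}
df = pd.DataFrame(data)

# case=False makes the regex ignore letter case;
# na=False treats missing answers as non-matches instead of propagating NaN.
is_yes = df['Answer'].str.contains(r'^yes$', case=False, na=False)
is_monthly = df['Periodicity'].eq('M')

result = df[is_yes & is_monthly]
print(result)
```

With the sample data this selects the same rows as the `str.lower()` version.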