How to filter a column with list of elements in Python dataframe
How to filter a column in dataframe starting with integers in Python?
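The problem here is the expression you pass to contains(). By default, '15+' is treated as a regular expression rather than as a literal character sequence. As a regex it means a '1' followed by one or more '5's, so it matches both '15+' and '<15'.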
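To see why both values match, here is a minimal sketch using Python's `re` module, which follows the same regex syntax. The two sample values come from the question; the variable names are only illustrative.

```python
import re

values = ['15+', '<15']

# As a regex, '15+' means "a '1' followed by one or more '5'",
# so the trailing '+' is a quantifier, not a literal plus sign.
regex_hits = [bool(re.search('15+', v)) for v in values]

# As a plain substring, '15+' must appear literally in the value.
literal_hits = ['15+' in v for v in values]

print(regex_hits)    # [True, True]
print(literal_hits)  # [True, False]
```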
Function definition: Series.str.contains(pat, case=True, flags=0, na=nan, regex=True)
Parameters:
pat : Character sequence or regular expression.
case : If True, case sensitive.
flags : Flags to pass through to the re module, e.g. re.IGNORECASE.
na : Fill value for missing values.
regex : If True, assumes the pat is a regular expression.
Returns : Series or Index of boolean values
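As a rough illustration of the `na` and `case` parameters listed above, the sketch below uses two small made-up Series (not the data from the question). Passing `na=False` turns missing entries into `False`, which keeps the result usable as a boolean mask.

```python
import pandas as pd

# Hypothetical Series with a missing value, for illustration only.
ages = pd.Series(['15+', None, '<15'])

# regex=False matches '15+' literally; na=False fills the missing entry with False.
mask = ages.str.contains('15+', regex=False, na=False)
print(mask.tolist())  # [True, False, False]

# case=False makes the match case-insensitive.
labels = pd.Series(['Age', 'age', 'AGE'])
ci_mask = labels.str.contains('age', case=False, regex=False)
print(ci_mask.tolist())  # [True, True, True]
```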
Here is what you can do:
import pandas as pd

# Make a toy dataset.
data = {'Category': ['Age', 'Age', 'Age', 'Age', 'Age'],
        'Age': ['15+', '<15', '15+', '<15', '15+']}
df = pd.DataFrame(data=data)
print(df)
# Output:
#   Category  Age
# 0      Age  15+
# 1      Age  <15
# 2      Age  15+
# 3      Age  <15
# 4      Age  15+
This is the important part:
# regex=False tells contains() to treat '15+' as a literal string
# instead of a regex (regex=True is the default).
df_new = df[df['Age'].str.contains('15+', na=False, regex=False)]
print(df_new)
# Output:
#   Category  Age
# 0      Age  15+
# 2      Age  15+
# 4      Age  15+
Or:
# This regex matches two digits followed by a literal '+'.
# The '+' is escaped with a backslash; no capture group is needed for filtering.
df_new = df[df['Age'].str.contains(r'\d{2}\+', na=False)]
print(df_new)
# Output:
#   Category  Age
# 0      Age  15+
# 2      Age  15+
# 4      Age  15+
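If the text to search for is held in a variable and you still want regex matching, one possible variation (not part of the original answer) is to escape it with the standard library's `re.escape()`. When the whole cell should equal the value, a plain equality comparison may be simpler. The names `target`, `pattern`, and `exact` are illustrative.

```python
import re
import pandas as pd

df = pd.DataFrame({'Category': ['Age'] * 5,
                   'Age': ['15+', '<15', '15+', '<15', '15+']})

target = '15+'               # literal text to look for
pattern = re.escape(target)  # escapes '+' so the regex engine matches it literally

df_new = df[df['Age'].str.contains(pattern, na=False)]
print(df_new.index.tolist())  # [0, 2, 4]

# Exact match on the whole cell, with no substring or regex logic involved.
exact = df[df['Age'] == target]
print(exact.index.tolist())   # [0, 2, 4]
```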
Hope this helps, cheers!