
Filter a column for rows that contain all substrings

I am trying to select all crispy chicken sandwiches in a dataset. I have tried using this regex, but it still returns some grilled chicken sandwiches. Here is the code:

data_sandwich_crispy = data[data['Item'].str.contains(r'^(?=.*crispy)(?=.*sandwich)(?=.*chicken)', regex=True)]

And here is what the dataset looks like:

Any revision or link to an answer is really appreciated. I'm really sorry if there is a mistake, and thank you for all your help!

This would be my solution. It looks for strings in which the word Crispy is followed by the word Chicken, which is in turn followed by the word Sandwich. Because the words are joined with .+, at least one character (a space or anything else) must sit between them, and the words must appear in exactly this order.

import pandas as pd

# some data
l = ["Crispy Chicken Sandwich", 
     "Grilled Chicken Sandwich", 
     "crispy Chicken Sandwich"]
data = pd.DataFrame(l, columns=["A"])
data
#       A
# 0     Crispy Chicken Sandwich
# 1     Grilled Chicken Sandwich
# 2     crispy Chicken Sandwich


# case=False makes the match case-insensitive
data[data['A'].str.contains(r'Crispy.+Chicken.+Sandwich', regex=True, case=False)]
#       A
# 0     Crispy Chicken Sandwich
# 2     crispy Chicken Sandwich
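
If the order of the words should not matter, one possible alternative is to skip regex entirely and combine one plain substring test per word. This is only a sketch, not part of the answer above: the sample rows and the words list are made up for illustration, and the column name A is taken from the example above.

```python
import pandas as pd

# Hypothetical sample data; the third row has the words in a different order.
data = pd.DataFrame(
    {"A": ["Crispy Chicken Sandwich",
           "Grilled Chicken Sandwich",
           "Sandwich, chicken, crispy"]}
)

# Words that must all be present, in any order (illustrative list).
words = ["crispy", "chicken", "sandwich"]

# Start from an all-True mask and AND in one literal, case-insensitive
# substring test per word.
mask = pd.Series(True, index=data.index)
for word in words:
    mask &= data["A"].str.contains(word, case=False, regex=False)

result = data[mask]
print(result)
```

Because regex=False treats each word as a literal substring, this also matches words embedded in longer words; the regex approach with word boundaries is the better fit when whole-word matching is needed.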

If you meant collecting only the rows that contain crispy chicken sandwich, have a look at the alternative solution below. It returns a row only when all three words (crispy, chicken and classic) are present, in any order:

data_sandwich_crispy = df[df['item'].str.contains(r'^(?=.*?\bcrispy\b)(?=.*?\bchicken\b)(?=.*?\bclassic\b).*$', regex=True)]

I created a simple dataframe as shown below:

item    id
premium crispy chicken classic sandwhich    10
premium grilled chicken classic sandwhich   15
premium club chicken classic sandwhich      14

Running the command given above gives the following output:

item    id
premium crispy chicken classic sandwhich    10
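
For anyone who wants to reproduce this, here is a self-contained sketch that rebuilds the small dataframe and applies the same lookahead pattern. The case=False argument is my own addition (an assumption that real item names may be capitalised, as in the question); the rest follows the answer as written, including the spelling used in its sample rows.

```python
import pandas as pd

# Rebuild the sample dataframe from the answer.
df = pd.DataFrame({
    "item": [
        "premium crispy chicken classic sandwhich",
        "premium grilled chicken classic sandwhich",
        "premium club chicken classic sandwhich",
    ],
    "id": [10, 15, 14],
})

# One lookahead per required word; \b keeps each match to whole words,
# and the lookaheads allow the words to appear in any order.
pattern = r"^(?=.*?\bcrispy\b)(?=.*?\bchicken\b)(?=.*?\bclassic\b).*$"

# case=False is an assumption added here, not part of the original answer.
data_sandwich_crispy = df[df["item"].str.contains(pattern, regex=True, case=False)]
print(data_sandwich_crispy)
```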

