Filter a Pandas DataFrame based on a list of substrings

Question

I have a Pandas DataFrame containing multiple columns of strings. I would now like to check a certain column against a list of allowed substrings and then get a new subset with the result.

import pandas as pd

substr = ['A', 'C', 'D']
df = pd.read_excel('output.xlsx')
df = df.dropna()
# now keep only the rows where the string in the 2nd column contains one of the substrings

The only approach I found was creating a list of the corresponding column and then doing a list comprehension, but then I lose the other columns. Can I use a list comprehension as part of e.g. df.str.contains()?

year  type     value   price
2000  ty-A     500     10000
2002  ty-Q     200     84600
2003  ty-R     500     56000
2003  ty-B     500     18000
2006  ty-C     500     12500
2012  ty-A     500     65000
2018  ty-F     500     86000
2019  ty-D     500     51900

Expected output:

year  type     value   price
2000  ty-A     500     10000
2006  ty-C     500     12500
2012  ty-A     500     65000
2019  ty-D     500     51900

Answer 1

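You could use pandas.Series.isin. Note that isin tests each value for exact membership in the list, so it only works when the column holds the bare values 'A', 'C' and 'D', as in the output below, rather than strings such as 'ty-A' that merely contain them.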

>>> df.loc[df['type'].isin(substr)]
   year type  value  price
0  2000    A    500  10000
4  2006    C    500  12500
5  2012    A    500  65000
7  2019    D    500  51900
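
For the substring case in the question, where the column holds values such as ty-A, one option is Series.str.contains with an alternation pattern. The following is a minimal sketch and not part of the original answer: the DataFrame is rebuilt inline from the question's sample table instead of being read from output.xlsx, and the names pattern, mask and filtered are illustrative.

```python
import re

import pandas as pd

# Sample data copied from the question (assumption: the real data is read from output.xlsx).
df = pd.DataFrame({
    "year": [2000, 2002, 2003, 2003, 2006, 2012, 2018, 2019],
    "type": ["ty-A", "ty-Q", "ty-R", "ty-B", "ty-C", "ty-A", "ty-F", "ty-D"],
    "value": [500, 200, 500, 500, 500, 500, 500, 500],
    "price": [10000, 84600, 56000, 18000, 12500, 65000, 86000, 51900],
})
substr = ["A", "C", "D"]

# Join the substrings into the regex "A|C|D"; re.escape protects any
# substring that happens to contain regex metacharacters.
pattern = "|".join(re.escape(s) for s in substr)

# Boolean mask: True where 'type' contains at least one of the substrings.
mask = df["type"].str.contains(pattern, regex=True)

# Indexing with the mask keeps all columns of the matching rows.
filtered = df.loc[mask]
print(filtered)
```

Because the mask is applied to the whole DataFrame, the other columns are kept, which addresses the problem with the plain list comprehension described in the question.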

Answer 2

You could use Python's built-in any or all together with pandas.Series.apply.

If you want the rows where every substring matches:

df.loc[df['type'].apply(lambda x: all(word in x for word in substr))]

Or, if you want the rows where any of the substrings matches:

df.loc[df['type'].apply(lambda x: any(word in x for word in substr))]

Printing or returning the result should then give you the filtered DataFrame.
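
To make the difference between the two variants concrete, here is a small self-contained sketch. It assumes a shortened version of the question's table built inline, and the names any_match and all_match are illustrative. It also assumes the column contains only strings, as it does after the dropna() call in the question; a NaN value would make `word in x` raise a TypeError.

```python
import pandas as pd

# Shortened sample based on the question's table (assumption: illustrative data only).
df = pd.DataFrame({
    "year": [2000, 2002, 2006, 2019],
    "type": ["ty-A", "ty-Q", "ty-C", "ty-D"],
})
substr = ["A", "C", "D"]

# Keep rows whose 'type' contains at least one of the substrings.
any_match = df.loc[df["type"].apply(lambda x: any(word in x for word in substr))]

# Keep rows whose 'type' contains every substring at once.
all_match = df.loc[df["type"].apply(lambda x: all(word in x for word in substr))]

print(any_match)
print(all_match)
```

With this data the any variant keeps the ty-A, ty-C and ty-D rows, while the all variant returns an empty DataFrame because no single value contains 'A', 'C' and 'D' together. For the expected output in the question, the any variant is therefore the one that applies.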
