Filter data by str.contains

I'm trying to filter my large DataFrame down to the columns whose names contain the strings 'io' or 'ir'.

df1

index  aio   bir   ckk
1      2     3     4
2      3     4     5

I want to create a new df with only the columns whose names contain 'io' or 'ir'. The new df should look like this:

index  aio   bir  
1      2     3    
2      3     4     

I tried:

df = df[:, str.contains('io','ir')] 

but I got an error saying: type object 'str' has no attribute 'contains'

With pd.DataFrame.filter:

df.filter(regex='i(o|r)')

       aio  bir
index          
1        2    3
2        3    4
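
For anyone who wants to reproduce this, here is a minimal sketch. The frame is rebuilt by hand from the table in the question (the values and the index name "index" are taken from there; how the asker actually loaded the data is unknown). With regex=, filter keeps every column label in which the pattern is found anywhere, not only at the start.

```python
import pandas as pd

# Rebuild the example frame from the question; the index is named "index".
df = pd.DataFrame(
    {"aio": [2, 3], "bir": [3, 4], "ckk": [4, 5]},
    index=pd.Index([1, 2], name="index"),
)

# Keep only the columns whose label contains "io" or "ir".
result = df.filter(regex="i(o|r)")
print(result)
```

The original df is left unchanged; filter returns a new DataFrame.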

If you have a list of things to match:

things = ['io', 'ir']
df.filter(regex='|'.join(things))

       aio  bir
index          
1        2    3
2        3    4
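
One caveat with joining a list like this: each item is treated as a regular expression. If the items might contain characters such as '.' or '+', escaping them with re.escape first is probably what you want. A small sketch, using made-up column labels that are not from the question:

```python
import re
import pandas as pd

# Hypothetical labels containing regex metacharacters (not from the question).
df = pd.DataFrame({"a.o": [1], "abo": [2], "b+r": [3]})
things = ["a.o", "b+r"]

# Escaped: each item is matched as literal text.
escaped = df.filter(regex="|".join(re.escape(t) for t in things))

# Unescaped: "." matches any character and "+" acts as a quantifier.
unescaped = df.filter(regex="|".join(things))

print(list(escaped.columns))
print(list(unescaped.columns))
```

For plain alphabetic items like 'io' and 'ir', escaping makes no difference.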

Alternatives:

df.filter(regex='io|ir')

df.loc[:, df.columns.str.contains('io|ir')]

Since you mention str.contains:

df.loc[:,df.columns.str.contains('io|ir')]
Out[354]: 
       aio  bir
index          
1        2    3
2        3    4
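
As for why the original attempt failed: a bare str is Python's built-in string type, which has no contains method. In pandas, contains lives on the .str accessor of a Series or Index, so it has to be called on df.columns. A rough sketch of what happens step by step, with the frame rebuilt by hand from the question:

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame(
    {"aio": [2, 3], "bir": [3, 4], "ckk": [4, 5]},
    index=pd.Index([1, 2], name="index"),
)

# One boolean per column label: True where the label contains "io" or "ir".
mask = df.columns.str.contains("io|ir")
print(mask)

# .loc[:, mask] keeps all rows and only the columns where mask is True.
new_df = df.loc[:, mask]
print(new_df)
```

Note that 'io|ir' is a single regex pattern. Passing two separate arguments, as in contains('io', 'ir'), would not mean "either one", because the second positional argument of contains is the case flag.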
