Apply a list of filters to a dataframe coming from a list using pandas

I have a list of columns to filter on in a dataframe. The filter values come from another list.

Earlier, when the lists had a fixed length, I used the following statement to get the job done:

df_result = df[(df[filterfieldList[0]] == filterValuesList[0])
               & (df[filterfieldList[1]] == filterValuesList[1])
               & (df[filterfieldList[2]] == filterValuesList[2])]

But as the weeks progressed I got a new requirement: the filter list has to be dynamic, and I can't figure out how to handle that. Sometimes the filter list will have only 2 fields to filter on, sometimes 3 or 5. How do I do the filtering in this situation?

Sample Data:

A            B        C                   D       E
Project 1    Org_1    Directory           MSTR    Configuration
Project 1    Org_1    Directory           MSTR    Unable to Login
Project 1    Org_1    Desktop Software    MSTR    Configuration
Project 1    Org_1    Desktop Software    MSTR    Configuration]
Project 1    Org_1    Directory           MSTR    Unable to Login

I think you need a list comprehension to create the masks, then np.logical_and.reduce to combine them into one, and finally filter by boolean indexing:

import numpy as np

filterfieldList = ['A','B','E']
filterValuesList = ['Project 1', 'Org_1', 'Unable to Login']

# One boolean mask per (column, value) pair, combined with a logical AND
tups = zip(filterfieldList, filterValuesList)
df_result = df[np.logical_and.reduce([(df[i] == j) for i, j in tups])]
print(df_result)
           A      B          C     D                E
1  Project 1  Org_1  Directory  MSTR  Unable to Login
4  Project 1  Org_1  Directory  MSTR  Unable to Login
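
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample data from the question and wraps the same idea in a function. The name filter_equal and its handling of an empty filter list are my own assumptions, not part of the original answer; the sample rows are copied as posted, including the trailing "]" in row 3.

```python
import numpy as np
import pandas as pd

# Sample data as posted in the question (row 3 keeps the trailing "]").
df = pd.DataFrame({
    "A": ["Project 1"] * 5,
    "B": ["Org_1"] * 5,
    "C": ["Directory", "Directory", "Desktop Software",
          "Desktop Software", "Directory"],
    "D": ["MSTR"] * 5,
    "E": ["Configuration", "Unable to Login", "Configuration",
          "Configuration]", "Unable to Login"],
})


def filter_equal(frame, fields, values):
    """Hypothetical helper: keep rows where every field equals its paired value."""
    if len(fields) != len(values):
        raise ValueError("fields and values must have the same length")
    if not fields:
        # Assumption: no filters means "keep every row".
        return frame
    # One boolean Series per (column, value) pair.
    masks = [frame[col] == val for col, val in zip(fields, values)]
    # AND all masks together, whatever their number.
    return frame[np.logical_and.reduce(masks)]


# The same function works for 2, 3 or more filters.
two = filter_equal(df, ["A", "E"], ["Project 1", "Unable to Login"])
three = filter_equal(df, ["A", "B", "E"],
                     ["Project 1", "Org_1", "Unable to Login"])
print(three)
```

The explicit check for an empty list is there because reducing an empty list of masks gives a single scalar rather than a per-row mask, so it is safer to decide up front what "no filters" should mean.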

EDIT:

If you need to combine multiple values for the same column, so that a row matches when the column equals any of them:

filterfieldList = ['A','B','E', 'E']
filterValuesList = ['Project 1', 'Org_1', 'Unable to Login', 'Configuration']

# Group the values by column, so each column maps to a list of allowed values
f = pd.DataFrame({'field': filterfieldList, 'val': filterValuesList})
f = f.groupby('field')['val'].apply(list)
print(f)
field
A                         [Project 1]
B                             [Org_1]
E    [Unable to Login, Configuration]
Name: val, dtype: object

# isin matches any of a column's values; the per-column masks are then ANDed
df_result = df[np.logical_and.reduce([df[i].isin(j) for i, j in f.items()])]
print(df_result)
           A      B                 C     D                E
0  Project 1  Org_1         Directory  MSTR    Configuration
1  Project 1  Org_1         Directory  MSTR  Unable to Login
2  Project 1  Org_1  Desktop Software  MSTR    Configuration
4  Project 1  Org_1         Directory  MSTR  Unable to Login
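
As a self-contained variant of the EDIT, the grouping step can also be done with a plain dict instead of groupby. This is only a sketch under that assumption; the names allowed, filter_fields and filter_values are mine, and the sample rows are copied from the question as posted (row 3 ends in "]", which is why it does not match 'Configuration').

```python
import numpy as np
import pandas as pd

# Sample data as posted in the question (row 3 keeps the trailing "]").
df = pd.DataFrame({
    "A": ["Project 1"] * 5,
    "B": ["Org_1"] * 5,
    "C": ["Directory", "Directory", "Desktop Software",
          "Desktop Software", "Directory"],
    "D": ["MSTR"] * 5,
    "E": ["Configuration", "Unable to Login", "Configuration",
          "Configuration]", "Unable to Login"],
})

filter_fields = ["A", "B", "E", "E"]
filter_values = ["Project 1", "Org_1", "Unable to Login", "Configuration"]

# Collect the allowed values per column: repeated columns accumulate values.
allowed = {}
for col, val in zip(filter_fields, filter_values):
    allowed.setdefault(col, []).append(val)

# OR within a column (isin), AND across columns (logical_and.reduce).
masks = [df[col].isin(vals) for col, vals in allowed.items()]
result = df[np.logical_and.reduce(masks)]
print(result)
```

Whether the dict or the groupby version reads better is mostly a matter of taste; both end up with one isin mask per column.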
