
How to do a complicated filter in python

I am trying to filter a DataFrame based on one column. For example, I want to remove a transaction entirely (every row with that Trans value) whenever any of its rows has account_no 1111.

Input

Date        Trans   account_no
2017-12-11  10000   1111
2017-12-11  10000   1112
2017-12-11  10000   1113
2017-12-11  10001   1111
2017-12-11  10002   1113

Desired Output

Date        Trans   account_no
2017-12-11  10002   1113

Edit:

This is different from operator chaining, because here you are dealing with a duplication/conditional filter.

You can do this by grouping on Trans and using issubset + transform:

df[~df.groupby('Trans').account_no.transform(lambda x : set([1111]).issubset(x))]
Out[1658]: 
         Date  Trans  account_no
4  2017-12-11  10002        1113
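
For reference, here is a self-contained sketch of the same groupby + transform idea on the question's sample data. It swaps the issubset check for (s == 1111).any(), on the assumption that the two are equivalent when testing for a single value; the names has_1111 and result are illustrative only.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Date": ["2017-12-11"] * 5,
    "Trans": [10000, 10000, 10000, 10001, 10002],
    "account_no": [1111, 1112, 1113, 1111, 1113],
})

# For every row: True if any row in the same Trans group has account_no 1111.
has_1111 = (
    df.groupby("Trans")["account_no"]
      .transform(lambda s: (s == 1111).any())
      .astype(bool)  # make sure the mask is boolean before negating it
)

# Keep only the rows whose transaction never touches account 1111.
result = df[~has_1111]
print(result)
```

The scalar returned by the lambda is broadcast to every row of its group, which is what lets the mask line up with df row by row.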

You could do this in two steps. First, use .loc to find all Trans values that ever appear with account_no equal to 1111. Then use isin() to select all the other transactions:

df[~df.Trans.isin(df.loc[df.account_no == 1111,'Trans'])]

         Date  Trans  account_no
4  2017-12-11  10002        1113
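
The two steps can also be written out separately, which may make the intermediate result easier to inspect. This is only a sketch on the question's sample data; flagged_trans and result are illustrative names.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Date": ["2017-12-11"] * 5,
    "Trans": [10000, 10000, 10000, 10001, 10002],
    "account_no": [1111, 1112, 1113, 1111, 1113],
})

# Step 1: Trans values that appear with account_no 1111 at least once.
flagged_trans = df.loc[df["account_no"] == 1111, "Trans"]

# Step 2: keep the rows whose Trans is not among the flagged values.
result = df[~df["Trans"].isin(flagged_trans)]
print(result)
```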

You can use .loc to filter based on a boolean Series.

def complex_filter_criteria(x):
    return x != 1111
df.loc[df['account_no'].apply(complex_filter_criteria)]

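df['account_no'].apply(complex_filter_criteria) returns a Series of True/False values, one for each entry in the account_no column. When you pass that Series to df.loc, it returns a DataFrame consisting only of the rows where the Series is True. Note that this filters row by row: it drops the rows whose account_no is 1111, but other rows belonging to the same Trans are kept, so on its own it does not produce the desired output above.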
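
To see what the mask looks like, here is a small sketch on the question's sample data; mask and row_filtered are illustrative names. For a simple comparison like this one, df['account_no'] != 1111 gives the same mask without apply.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Date": ["2017-12-11"] * 5,
    "Trans": [10000, 10000, 10000, 10001, 10002],
    "account_no": [1111, 1112, 1113, 1111, 1113],
})

def complex_filter_criteria(x):
    # Row-level test: True for any account number other than 1111.
    return x != 1111

# One True/False value per row of df.
mask = df["account_no"].apply(complex_filter_criteria)
print(mask.tolist())

# Rows where the mask is True; rows 1, 2 and 4 remain for this data.
row_filtered = df.loc[mask]
print(row_filtered)
```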

