I am trying to filter a DataFrame based on one column. For example, I want to remove a transaction entirely (every row that shares its Trans value) whenever any of its rows has account_no 1111.
Input
Date Trans account_no
2017-12-11 10000 1111
2017-12-11 10000 1112
2017-12-11 10000 1113
2017-12-11 10001 1111
2017-12-11 10002 1113
Desired Output
Date Trans account_no
2017-12-11 10002 1113
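To reproduce the example, the input table can be rebuilt roughly like this. It is only a sketch: the dtypes (datetime for Date, integers for the other two columns) are assumptions, since the question shows only the printed frame.

```python
import pandas as pd

# Sample input from the question; dtypes are assumed, not stated in the post.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2017-12-11"] * 5),
    "Trans": [10000, 10000, 10000, 10001, 10002],
    "account_no": [1111, 1112, 1113, 1111, 1113],
})

print(df)
```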
Edit:
This is different from operator chaining because you are dealing with a duplication/conditional filter.
By using issubset + transform:
df[~df.groupby('Trans').account_no.transform(lambda x : set([1111]).issubset(x))]
Out[1658]:
Date Trans account_no
4 2017-12-11 10002 1113
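A closely related variation, not the exact code above, builds the per-group flag with transform("any") on a boolean series instead of a lambda. The sketch below assumes the sample frame from the question; the name has_1111 is made up for illustration.

```python
import pandas as pd

# Sample frame from the question (dtypes assumed).
df = pd.DataFrame({
    "Date": pd.to_datetime(["2017-12-11"] * 5),
    "Trans": [10000, 10000, 10000, 10001, 10002],
    "account_no": [1111, 1112, 1113, 1111, 1113],
})

# True for every row whose Trans group contains account_no 1111 at least once.
has_1111 = df["account_no"].eq(1111).groupby(df["Trans"]).transform("any")

# Keep only the rows of transactions that never touch account 1111.
result = df[~has_1111]
print(result)
```

Because the flag is broadcast back to the original row index, it can be used directly as a boolean mask on df.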
You could do this in two steps. First find all Trans values that have an account_no that is ever equal to 1111 using .loc. Then select all other transactions with isin():
df[~df.Trans.isin(df.loc[df.account_no == 1111,'Trans'])]
Date Trans account_no
4 2017-12-11 10002 1113
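The same two steps can be written with an intermediate variable, which makes it easier to inspect what is being excluded. This is a sketch on the question's sample data; bad_trans is a hypothetical name.

```python
import pandas as pd

# Sample frame from the question (dtypes assumed).
df = pd.DataFrame({
    "Date": pd.to_datetime(["2017-12-11"] * 5),
    "Trans": [10000, 10000, 10000, 10001, 10002],
    "account_no": [1111, 1112, 1113, 1111, 1113],
})

# Step 1: Trans values that appear with account_no 1111 in at least one row.
bad_trans = df.loc[df["account_no"] == 1111, "Trans"].unique()

# Step 2: keep the rows whose Trans is not among those values.
result = df[~df["Trans"].isin(bad_trans)]
print(result)
```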
You can use .loc to filter based on a series.
def complex_filter_criteria(x):
    return x != 1111
df.loc[df['account_no'].apply(complex_filter_criteria)]
df['account_no'].apply(complex_filter_criteria) will return a series of True/False evaluations for each entry in the column account_no. Then, when you pass that into df.loc, it returns a dataframe consisting only of the rows corresponding to the True evaluations from the series.
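As a sketch of what that mask looks like on the question's sample data (dtypes assumed), the snippet below builds it with apply and also with a plain vectorized comparison, which gives the same result for this simple criterion. Note that the mask is evaluated row by row, so only the rows with account_no 1111 are dropped; the other rows of Trans 10000 remain.

```python
import pandas as pd

# Sample frame from the question (dtypes assumed).
df = pd.DataFrame({
    "Date": pd.to_datetime(["2017-12-11"] * 5),
    "Trans": [10000, 10000, 10000, 10001, 10002],
    "account_no": [1111, 1112, 1113, 1111, 1113],
})


def complex_filter_criteria(x):
    # Per-value test: True when the account number is not 1111.
    return x != 1111


# Boolean series with one True/False entry per row.
mask = df["account_no"].apply(complex_filter_criteria)

# Vectorized comparison that yields the same mask for this criterion.
mask_vectorized = df["account_no"] != 1111

# Rows where the mask is True.
row_filtered = df.loc[mask]
print(row_filtered)
```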