I have a pandas (multi-index) dataframe as follows:
date Volume
Account ID
10001 2 02-03-2017 0
3 02-03-2017 0
3 09-03-2017 0
3 16-03-2017 50
3 21-03-2017 65
3 28-03-2017 0
3 04-04-2017 0
3 11-04-2017 60
10002 5 02-03-2017 14.5
6 09-03-2017 14.5
3 09-03-2017 0
3 16-03-2017 0
3 21-03-2017 20
3 28-03-2017 33
10003 8 21-03-2017 14.5
9 28-03-2017 15.0
Now I want to delete all rows at the beginning of a series (dates of an account-product combination) with volume 0. So I want to keep the rows with volume 0 in case they are in the middle or at the end of a series.
So in the above example, I'd want the following output:
date Volume
Account ID
10001 3 16-03-2017 50
3 21-03-2017 65
3 28-03-2017 0
3 04-04-2017 0
3 11-04-2017 60
10002 5 02-03-2017 14.5
6 09-03-2017 14.5
3 21-03-2017 20
3 28-03-2017 33
10003 8 21-03-2017 14.5
9 28-03-2017 15.0
Currently, I've been removing complete series with a filter, eg
df = data.groupby(level = acc_prod).filter(lambda x: len(x) > 26)
And I've seen examples of removing only the first row; Python: Pandas - Delete the first row by group . Yet I do not know how to only delete rows of zero at the beginning of an account-product series.
Would be great if someone could help me out on this!
You can use boolean indexing
with mask
created by groupby
with cumsum
and find values which are not 0
:
print (df.groupby(level=['Account','ID'])['Volume'].cumsum())
Account ID
10001 2 0.0
3 0.0
3 0.0
3 50.0
3 115.0
3 115.0
3 115.0
3 175.0
10002 5 14.5
6 14.5
3 0.0
3 0.0
3 20.0
3 53.0
10003 8 14.5
9 15.0
Name: Volume, dtype: float64
mask = df.groupby(level=['Account','ID'])['Volume'].cumsum() != 0
#!= is same as ne function
#mask = df.groupby(level=['Account','ID'])['Volume'].cumsum().ne(0)
print (mask)
Account ID
10001 2 False
3 False
3 False
3 True
3 True
3 True
3 True
3 True
10002 5 True
6 True
3 False
3 False
3 True
3 True
10003 8 True
9 True
Name: Volume, dtype: bool
print (df[mask])
date Volume
Account ID
10001 3 16-03-2017 50.0
3 21-03-2017 65.0
3 28-03-2017 0.0
3 04-04-2017 0.0
3 11-04-2017 60.0
10002 5 02-03-2017 14.5
6 09-03-2017 14.5
3 21-03-2017 20.0
3 28-03-2017 33.0
10003 8 21-03-2017 14.5
9 28-03-2017 15.0
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.