
Python Pandas Get a Cumulative Sum (cumsum) which excludes the current row

I am trying to get a cumulative count of a given column that excludes the current row in the dataframe.

My code is shown below. The problem with using cumsum() alone is that it includes the current row in the count.

I want df['ExAnte Good Year Count'] to hold the cumulative sum on an ex-ante basis, i.e. excluding the current row from the count.

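import pandas as pd

d = {
    'Year': [2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008],
    'Good Year': [1, 0, 1, 0, 0, 1, 1, 1, 0],
    'Year Type': ['X', 'Y', 'Z', 'Z', 'Z', 'X', 'Y', 'Z', 'Z'],
}

# Keep 'Year Type' as a column; the grouped count further down needs it.
df = pd.DataFrame(d, columns=['Year', 'Good Year', 'Year Type'])
# cumsum() includes the current row, which is the problem described above.
df['ExAnte Good Year Count'] = df['Good Year'].cumsum()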

UPDATED QUERY: I would also like to compute the cumulative sum of 'Good Year' grouped by 'Year Type', again excluding the current row. I have tried...

df['Good Year'].groupby(['Year Type']).shift().cumsum()

...but I get the error KeyError: 'Year Type'.

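What about this one? shift() moves each value down one row, so the cumulative sum at a given row only covers the rows before it.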

df['ExAnte Good Year Count'] = df['Good Year'].shift().cumsum()

The result should be the following:

   Year  Good Year  ExAnte Good Year Count
0  2000          1                     NaN
1  2001          0                     1.0
2  2002          1                     1.0
3  2003          0                     2.0
4  2004          0                     2.0
5  2005          1                     2.0
6  2006          1                     3.0
7  2007          1                     4.0
8  2008          0                     5.0
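
If the NaN in the first row is unwanted, the same ex-ante count can be written so that it starts at 0. The following is only a sketch of two common variants, not part of the answer above; the variable names shifted, exclusive and exclusive_alt are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Year': [2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008],
    'Good Year': [1, 0, 1, 0, 0, 1, 1, 1, 0],
})

# Variant from the answer: shift first, then accumulate.
# The first row is NaN, so the column becomes float.
shifted = df['Good Year'].shift().cumsum()

# Inclusive cumulative sum minus the current row's own value.
# The first row is 0 and the dtype stays integer.
exclusive = df['Good Year'].cumsum() - df['Good Year']

# Same result by filling the shifted-in gap with 0 before accumulating.
exclusive_alt = df['Good Year'].shift(fill_value=0).cumsum()

print(exclusive.tolist())
```

Which variant fits depends on whether "no prior rows" should read as missing (NaN) or as a count of zero.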

For the count grouped by Year Type, apply the same shift-then-cumsum within each group:

df['Yourcol'] = df.groupby('Year Type', sort=False)['Good Year'].transform(lambda x: x.shift().cumsum())
df
Out[283]: 
   Good Year  Year Year Type  Yourcol
0          1  2000         X      NaN
1          0  2001         Y      NaN
2          1  2002         Z      NaN
3          0  2003         Z      1.0
4          0  2004         Z      1.0
5          1  2005         X      1.0
6          1  2006         Y      0.0
7          1  2007         Z      1.0
8          0  2008         Z      2.0
