I'm downloading data from FRED. I'm summing to get annual numbers, but don't want incomplete years. So I need a sum condition if count the number of obs is 12 because the series is monthly.
import pandas_datareader.data as web
mnemonic = 'RSFSXMV'
df = web.DataReader(mnemonic, 'fred', 2000, 2020)
df['year'] = df.index.year
new_df = df.groupby(["year"])[mnemonic].sum().reset_index()
print(new_df)
I don't want 2019 to show up.
In your case we using transform
with nunique
to make sure each year should have 12 unique month , if not we drop it before do the groupby
sum
df['Month']=df.index.month
m=df.groupby('year').Month.transform('nunique')==12
new_df = df.loc[m].groupby(["year"])[mnemonic].sum().reset_index()
isin
df['Month']=df.index.month
m=df.groupby('year').Month.nunique()
new_df = df.loc[df.year.isin(m.index[m==12)].groupby(["year"])[mnemonic].sum().reset_index()
You could use a aggreate function count
while groupby
:
df['year'] = df.index.year
df = df.groupby('year').agg({'RSFSXMV': 'sum', 'year': 'count'})
which will give you:
RSFSXMV year
year
2000 2487790 12
2001 2563218 12
2002 2641870 12
2003 2770397 12
2004 2969282 12
2005 3196141 12
2006 3397323 12
2007 3531906 12
2008 3601512 12
2009 3393753 12
2010 3541327 12
2011 3784014 12
2012 3934506 12
2013 4043037 12
2014 4191342 12
2015 4252113 12
2016 4357528 12
2017 4561833 12
2018 4810502 12
2019 2042147 5
Then simply drop those rows with a year count less than 12
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.