
Calculate average of every n rows in pandas and assign new labels to rows

I have the following dataframe:

import pandas as pd

d = {
    'quarter': ['1996q1', '1996q2', '1996q3', '1996q4', '1997q1', '1997q2', '1997q3', '1997q4',
                '1996q1', '1996q2', '1996q3', '1996q4', '1997q1', '1997q2', '1997q3', '1997q4'],
    'country': ['USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA',
                'ITA', 'ITA', 'ITA', 'ITA', 'ITA', 'ITA', 'ITA', 'ITA'],
    'value': [0, 0, 0, 0, 1, 1, 1, 1, 3, 3, 5, 0, 0, 0, 2, 2],
}
df = pd.DataFrame(data=d)
df
    quarter country value
0   1996q1  USA      0
1   1996q2  USA      0
2   1996q3  USA      0
3   1996q4  USA      0
4   1997q1  USA      1
5   1997q2  USA      1
6   1997q3  USA      1
7   1997q4  USA      1
8   1996q1  ITA      3
9   1996q2  ITA      3
10  1996q3  ITA      5
11  1996q4  ITA      0
12  1997q1  ITA      0
13  1997q2  ITA      0
14  1997q3  ITA      2
15  1997q4  ITA      2

I wish to compute the annual mean for every 4 quarters' values, while altering the country and quarter columns accordingly, to give:

d2 = {'year': ['1996', '1997', '1996', '1997'], 'country': ['USA', 'USA', 'ITA', 'ITA'], 'value': [0, 1, 2.75, 1]}
df2 = pd.DataFrame(data=d2)
df2

    year    country value
0   1996    USA    0.00
1   1997    USA    1.00
2   1996    ITA    2.75
3   1997    ITA    1.00

First, convert your quarter column to a pd.PeriodIndex. To get the year from each period, use the .year accessor (Period.year). Then group by year and country.

df.assign(year=pd.PeriodIndex(df.quarter, freq="Q").year).groupby(
    ["year", "country"], as_index=False, sort=False
)["value"].mean()

   year country  value
0  1996     USA   0.00
1  1997     USA   1.00
2  1996     ITA   2.75
3  1997     ITA   1.00
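
If every quarter label is known to start with a four-digit year (an assumption about the data, not something pandas checks), the year can also be taken by plain string slicing instead of period parsing. The following is only a minimal sketch of that variant; it rebuilds the sample frame from the question so that it runs on its own, and the name result is illustrative.

```python
import pandas as pd

# Rebuild the sample data from the question.
quarters = [f"{year}q{q}" for year in (1996, 1997) for q in range(1, 5)] * 2
df = pd.DataFrame({
    "quarter": quarters,
    "country": ["USA"] * 8 + ["ITA"] * 8,
    "value": [0, 0, 0, 0, 1, 1, 1, 1, 3, 3, 5, 0, 0, 0, 2, 2],
})

# Assumption: the first four characters of each label are the year.
result = (
    df.assign(year=df["quarter"].str[:4])
      # sort=False keeps groups in order of first appearance (USA before ITA).
      .groupby(["year", "country"], as_index=False, sort=False)["value"]
      .mean()
)
print(result)
```

Note that year is a string column here, whereas the PeriodIndex version above yields integers.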

You can group by multiple Series at once. In your case you need two: one for the year, which can be derived from quarter, and another for country.

grp = [pd.to_datetime(df.quarter.rename("year")).dt.strftime("%Y"), df.country.str.strip()]
print(df.groupby(grp).value.mean().reset_index())
#    year country  value
# 0  1996     ITA   2.75
# 1  1996     USA   0.00
# 2  1997     ITA   1.00
# 3  1997     USA   1.00
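
For the literal "every n rows" reading of the title, one possible approach is to group on the integer row position divided by n. This is a sketch only, and it assumes the rows are already ordered so that each consecutive block of 4 rows is one country-year; if that does not hold, grouping on year and country as in the answers above is safer. The names n, block and out are illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the sample data from the question.
quarters = [f"{year}q{q}" for year in (1996, 1997) for q in range(1, 5)] * 2
df = pd.DataFrame({
    "quarter": quarters,
    "country": ["USA"] * 8 + ["ITA"] * 8,
    "value": [0, 0, 0, 0, 1, 1, 1, 1, 3, 3, 5, 0, 0, 0, 2, 2],
})

n = 4  # rows per block; assumes len(df) is a multiple of n
block = np.arange(len(df)) // n  # 0,0,0,0,1,1,1,1,2,...

out = (
    df.groupby(block)
      .agg(
          # Label each block from its first row.
          year=("quarter", lambda s: s.iloc[0][:4]),
          country=("country", "first"),
          value=("value", "mean"),
      )
      .reset_index(drop=True)
)
print(out)
```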
