
Pandas: merge dataframe rows and take an average of the second column values

I have a two-column DataFrame: the first column holds a date (yyyy-mm-dd) and the second a rating out of five, e.g. '1' or '2'. The df is sorted by date in descending order, starting from the first row.

I am looking for a way to merge the rows that share the same date (for example, all of the 2021-05-05 rows) and take the average of their rating values, so that each date appears once with its average rating.

For instance, if my df looks like this:

    Date        Rating
0  2021-05-05   1
1  2021-05-05   3
2  2021-05-05   2
3  2021-05-04   4
4  2021-05-04   6
5  2021-05-04   5

I want to merge it so it becomes like so:

    Date        Rating
0  2021-05-05   2
1  2021-05-04   5

You can use df.groupby() with sort=False (to keep the groups in their original order of appearance) and as_index=False (to keep Date as a regular column instead of the row index). Then use the built-in aggregation function .mean() to get the average.

df.groupby('Date', as_index=False, sort=False)['Rating'].mean()

Output:

         Date  Rating
0  2021-05-05     2.0
1  2021-05-04     5.0
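
For anyone who wants to reproduce this, here is a minimal self-contained sketch. The DataFrame is built by hand from the sample in the question, and the name `result` is only illustrative. It assumes the ratings are stored as integers.

```python
import pandas as pd

# Sample data copied from the question; ratings assumed to be integers.
df = pd.DataFrame({
    "Date": ["2021-05-05", "2021-05-05", "2021-05-05",
             "2021-05-04", "2021-05-04", "2021-05-04"],
    "Rating": [1, 3, 2, 4, 6, 5],
})

# sort=False keeps the dates in their order of first appearance (descending here);
# as_index=False leaves Date as an ordinary column.
result = df.groupby("Date", as_index=False, sort=False)["Rating"].mean()
print(result)
```

Note that .mean() returns floats even when the inputs are integers, so the averages are 2.0 and 5.0 rather than 2 and 5.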

You can try pandas .groupby() and .mean(), then call .reset_index() to turn Date back into a regular column:

result = df.groupby('Date', sort=False)['Rating'].mean().reset_index()
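
The question writes the ratings in quotes ('1', '2'), so they may actually be stored as strings. That is only a guess, but if it is the case, .mean() will not average them as numbers and they need converting first. The sketch below assumes string ratings, converts them with pd.to_numeric, and then runs both groupby variants so they can be compared. The names `a` and `b` are illustrative.

```python
import pandas as pd

# Assumption: ratings arrive as text, as the quoting in the question hints.
df = pd.DataFrame({
    "Date": ["2021-05-05", "2021-05-05", "2021-05-05",
             "2021-05-04", "2021-05-04", "2021-05-04"],
    "Rating": ["1", "3", "2", "4", "6", "5"],
})

# Convert the text ratings to numbers before averaging.
df["Rating"] = pd.to_numeric(df["Rating"])

# Variant 1: keep Date as a column via as_index=False.
a = df.groupby("Date", as_index=False, sort=False)["Rating"].mean()

# Variant 2: group with Date as the index, then move it back with reset_index().
b = df.groupby("Date", sort=False)["Rating"].mean().reset_index()

print(a)
print(b)
```

Either form should work here; as_index=False just saves the extra reset_index() call.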
