In pandas (Python), I have a DataFrame that looks something like this:
> df
            count
date
2021-04-03   23.0
2021-04-04   12.0
2021-04-04   10.0
2021-04-05   42.0
2021-04-06   39.0
...
Some of the dates are repeated, each time with a different count value. I would like to merge these rows into one row per date, summing the counts, like this:
> df
            count
date
2021-04-03   23.0
2021-04-04   22.0
2021-04-05   42.0
2021-04-06   39.0
...
If it's any help, the data source is a CSV file. There is likely a way to do this with a for loop, but I was wondering whether it can be done with a built-in pandas function. Thanks.
In this case you can group by the index and sum the values:
>>> result = df.groupby(df.index)['count'].sum()
>>> result
date
2021-04-03    23.0
2021-04-04    22.0
2021-04-05    42.0
2021-04-06    39.0
Name: count, dtype: float64
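For anyone who wants to try this without the original file, here is a minimal sketch that rebuilds the sample frame by hand and applies the same idea. The values are copied from the question; keeping the dates as plain strings and grouping with level="date" are my own choices for the illustration, not something the question specifies.

```python
import pandas as pd

# Rebuild the sample frame from the question; 'date' is the index, not a column.
# The dates are kept as plain strings here (an assumption for this sketch).
df = pd.DataFrame(
    {"count": [23.0, 12.0, 10.0, 42.0, 39.0]},
    index=pd.Index(
        ["2021-04-03", "2021-04-04", "2021-04-04", "2021-04-05", "2021-04-06"],
        name="date",
    ),
)

# Group rows that share an index label and add up their counts.
result = df.groupby(level="date")["count"].sum()
print(result)
```

Grouping with level="date" should be equivalent to df.groupby(df.index) for this frame; it just names the index level explicitly.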
You can use groupby():
new_df = df.groupby(['date']).sum()
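As your sample data shows, date is the row index rather than a data column; groupby() also accepts the name of an index level, so the line above works as written and already returns a DataFrame. If you instead select the count column before summing, the result is a pandas Series, and you need the extra step .to_frame() to convert it back to a DataFrame, as follows: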
df.groupby('date')['count'].sum().to_frame(name='count')
Output:
            count
date
2021-04-03   23.0
2021-04-04   22.0
2021-04-05   42.0
2021-04-06   39.0
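Since the question mentions that the data comes from a CSV file, here is a rough sketch of the whole path from file to merged result. The CSV text and its column names (date, count) are assumptions made up for the example, and io.StringIO stands in for the real file so the snippet runs on its own.

```python
import io

import pandas as pd

# Hypothetical CSV contents standing in for the real file.
csv_text = """date,count
2021-04-03,23.0
2021-04-04,12.0
2021-04-04,10.0
2021-04-05,42.0
2021-04-06,39.0
"""

# parse_dates converts the 'date' column to datetime64 values.
raw = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])

# 'date' is an ordinary column here, so group by its name. Selecting
# [["count"]] (a list) keeps the result a DataFrame indexed by date.
merged = raw.groupby("date")[["count"]].sum()
print(merged)
```

Selecting the column with a list, [["count"]], avoids the need for .to_frame(), because the summed result stays a DataFrame. With a real file you would pass its path to read_csv in place of the StringIO object.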