
Cumulatively merge rows with the same index

In pandas (Python), I have a DataFrame that looks something like this:

> df
            count
date
2021-04-03   23.0
2021-04-04   12.0
2021-04-04   10.0
2021-04-05   42.0
2021-04-06   39.0
...

Some of the dates are repeated, with a different count value. I would like to merge these values into one row like this:

> df
            count
date
2021-04-03   23.0
2021-04-04   22.0
2021-04-05   42.0
2021-04-06   39.0
...

If it helps, the data source is a CSV file. This could probably be done with a for loop, but is there a built-in pandas function for it? Thanks.

In this case you can group by the index and sum the values:

>>> result = df.groupby(df.index)['count'].sum()
>>> result
date
2021-04-03    23.0
2021-04-04    22.0
2021-04-05    42.0
2021-04-06    39.0
Name: count, dtype: float64
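
For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the sample frame from the question and groups on the index. The frame construction and the variable name `merged` are my own assumptions for illustration; `groupby(level=0)` is used as an equivalent way of grouping on the index that keeps the result as a DataFrame rather than a Series.

```python
import pandas as pd

# Rebuild the sample data from the question (dates kept as plain strings here).
df = pd.DataFrame(
    {"count": [23.0, 12.0, 10.0, 42.0, 39.0]},
    index=pd.Index(
        ["2021-04-03", "2021-04-04", "2021-04-04", "2021-04-05", "2021-04-06"],
        name="date",
    ),
)

# level=0 groups on the first (and only) index level; summing the whole
# frame, rather than a single column, returns a DataFrame.
merged = df.groupby(level=0).sum()
print(merged)
```

Rows that share a date collapse into one row holding the sum of their count values, so the two 2021-04-04 rows become a single row.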

You can use groupby():

new_df = df.groupby(['date']).sum()

In your sample data, date is the row index rather than a data column; groupby() also accepts the name of an index level, so grouping by 'date' still works. If you select the count column before summing, the result is a pandas Series, so you need an extra step to convert it back to a DataFrame with .to_frame(), as follows:

df.groupby('date')['count'].sum().to_frame(name='count')

Output:

            count
date             
2021-04-03   23.0
2021-04-04   22.0
2021-04-05   42.0
2021-04-06   39.0
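
Since the question mentions that the data comes from a CSV file, the same merge can be done right after loading. This is only a sketch: the inline CSV text stands in for the real file, and the column names `date` and `count` are assumed from the sample output, so adjust them to match the actual file.

```python
import io

import pandas as pd

# Stand-in for the real CSV file; the column names are assumptions.
csv_text = """date,count
2021-04-03,23
2021-04-04,12
2021-04-04,10
2021-04-05,42
2021-04-06,39
"""

# Parse the date column as datetimes while reading.
raw = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])

# Selecting [["count"]] (a list) keeps the result as a DataFrame,
# so no .to_frame() call is needed; date becomes the index.
merged_csv = raw.groupby("date")[["count"]].sum()
print(merged_csv)
```

With a real file, replace `io.StringIO(csv_text)` with the file path.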
