
Aggregate data with two conditions

I have a data frame that looks something like this:

df =

date        name      val1     val2
-----------------------------------
14:55:00    name1     1        2
14:55:00    name1     2        4
15:00:00    name2     3        6
15:00:00    name3     4        8
15:05:00    name4     5        10
15:05:00    name5     6        12

I would like to aggregate rows that share the same date, but only when their names differ. The data frame above should therefore become:

df_new =

date        name            val1     val2
-----------------------------------------
15:00:00    name2+name3     7        14
15:05:00    name4+name5     11       22

My current attempt almost works:

df_new = df.groupby("date", as_index=False).agg({"name" : "+".join, "val1" : "sum", "val2" : "sum"})

However, this also aggregates the rows where the name is the same, which it shouldn't.

EDIT: There are only a few distinct names, and they repeat across the date intervals. The only restriction is that rows aggregated under a single date must not share a name.

Can this be fixed?

Find the rows that share both date and name, drop all of them, and then aggregate on the date column:

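# keep=False drops every copy of a repeated (date, name) pair
(df.drop_duplicates(subset=['date', 'name'],
                    keep=False)
   # sum only the numeric columns so 'name' is left out of the result
   .groupby('date')[['val1', 'val2']]
   .sum()
)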
 
          val1  val2
date                
15:00:00     7    14
15:05:00    11    22
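
The `keep=False` argument is what makes this work. As a minimal sketch (the DataFrame construction below is an assumption that simply rebuilds the table from the question), compare it with the default `keep='first'`, which would leave one of the two `name1` rows in place:

```python
import pandas as pd

# Rebuild the example table from the question (assumed to hold plain strings and ints).
df = pd.DataFrame({
    "date": ["14:55:00", "14:55:00", "15:00:00", "15:00:00", "15:05:00", "15:05:00"],
    "name": ["name1", "name1", "name2", "name3", "name4", "name5"],
    "val1": [1, 2, 3, 4, 5, 6],
    "val2": [2, 4, 6, 8, 10, 12],
})

# Default behaviour: the first row of each repeated (date, name) pair survives.
kept_first = df.drop_duplicates(subset=["date", "name"])

# keep=False: every row belonging to a repeated (date, name) pair is removed.
kept_none = df.drop_duplicates(subset=["date", "name"], keep=False)

print(len(kept_first), len(kept_none))
```

With the default, one `name1` row at 14:55:00 would remain and end up as its own group, which is not what the question asks for.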

You can use:

(df.loc[~df.duplicated(subset=['date', 'name'], keep=False)]
   .groupby('date', as_index=False)
   .agg({"name" : "+".join, "val1" : "sum", "val2" : "sum"})
)



       date         name  val1  val2
0  15:00:00  name2+name3     7    14
1  15:05:00  name4+name5    11    22

Here, we first drop the rows we don't want to aggregate: those that share both date and name. Rows that repeat a name under a different date are kept.
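
For anyone who wants to try this end to end, here is a self-contained sketch. The helper name `aggregate_distinct_names` is hypothetical and only wraps the expression above; the DataFrame is rebuilt from the question's table, assuming string dates and integer values.

```python
import pandas as pd


def aggregate_distinct_names(df: pd.DataFrame) -> pd.DataFrame:
    """Sum val1/val2 per date, ignoring rows whose (date, name) pair repeats."""
    # True for every row whose (date, name) pair occurs more than once.
    repeated = df.duplicated(subset=["date", "name"], keep=False)
    return (
        df.loc[~repeated]
          .groupby("date", as_index=False)
          .agg({"name": "+".join, "val1": "sum", "val2": "sum"})
    )


# Example table from the question.
df = pd.DataFrame({
    "date": ["14:55:00", "14:55:00", "15:00:00", "15:00:00", "15:05:00", "15:05:00"],
    "name": ["name1", "name1", "name2", "name3", "name4", "name5"],
    "val1": [1, 2, 3, 4, 5, 6],
    "val2": [2, 4, 6, 8, 10, 12],
})

df_new = aggregate_distinct_names(df)
print(df_new)
```

Note that a date whose rows are all removed (14:55:00 here) does not appear in the result at all, and a date left with a single row after filtering would still show up as a group of one.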
