简体   繁体   中英

Merge Rows in pandas

I want to merge rows in my df so I have one unique row per ID/Name with other values either summed (revenue) or concatenated (subject and product).

My df is similar to this:

ID   Name   Revenue   Subject   Product
123  John   125       Maths     A
123  John   75        English   B
246  Mary   32        History   B
312  Peter  67        Maths     A
312  Peter  39        Science   C

I would like to merge the rows so the output looks like this:

ID   Name   Revenue   Subject        Product
123 John    200       Maths English  A B
246 Mary    32        History        B
312 Peter   106       Maths Science  A C

Try this:

df.groupby(['ID','Name']).agg(Revenue=('Revenue', 'sum'),
                              Subject=('Subject', " ".join),
                              Product=('Product', " ".join))\
  .reset_index()

Output:

|    |   ID | Name   |   Revenue | Subject       | Product   |
|----|------|--------|-----------|---------------|-----------|
|  0 |  123 | John   |       200 | Maths English | A B       |
|  1 |  246 | Mary   |        32 | History       | B         |
|  2 |  312 | Peter  |       106 | Maths Science | A C       |

Define a utility function as & use agg.

def f(x): return ' '.join(list(x))

df.groupby(['ID', 'Name']).agg( 
  {'Revenue': 'sum', 'Subject': f, 'Product': f} 
)

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM