I want to merge rows in my DataFrame so that there is exactly one row per ID/Name pair, with the other columns either summed (Revenue) or concatenated (Subject and Product).
My df is similar to this:
ID   Name   Revenue  Subject  Product
123  John   125      Maths    A
123  John   75       English  B
246  Mary   32       History  B
312  Peter  67       Maths    A
312  Peter  39       Science  C
I would like to merge the rows so the output looks like this:
ID   Name   Revenue  Subject        Product
123  John   200      Maths English  A B
246  Mary   32       History        B
312  Peter  106      Maths Science  A C
Try this:
(df.groupby(['ID', 'Name'])
   .agg(Revenue=('Revenue', 'sum'),
        Subject=('Subject', ' '.join),
        Product=('Product', ' '.join))
   .reset_index())
Output:
| | ID | Name | Revenue | Subject | Product |
|----|------|--------|-----------|---------------|-----------|
| 0 | 123 | John | 200 | Maths English | A B |
| 1 | 246 | Mary | 32 | History | B |
| 2 | 312 | Peter | 106 | Maths Science | A C |
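For anyone who wants to reproduce this end to end, here is a minimal sketch that rebuilds the sample data from the question and applies the same named aggregation. The variable name `merged` is only illustrative, and the integer dtype for ID and Revenue is an assumption, since the question does not show dtypes.

```python
import pandas as pd

# Sample data copied from the question (dtypes are assumed).
df = pd.DataFrame({
    "ID": [123, 123, 246, 312, 312],
    "Name": ["John", "John", "Mary", "Peter", "Peter"],
    "Revenue": [125, 75, 32, 67, 39],
    "Subject": ["Maths", "English", "History", "Maths", "Science"],
    "Product": ["A", "B", "B", "A", "C"],
})

# One row per (ID, Name): sum Revenue, space-join the text columns.
merged = (
    df.groupby(["ID", "Name"])
      .agg(
          Revenue=("Revenue", "sum"),
          Subject=("Subject", " ".join),
          Product=("Product", " ".join),
      )
      .reset_index()  # move ID and Name from the index back into columns
)
print(merged)
```

Note that `" ".join` only works if every value in the column is a string; if Subject or Product can contain missing values, they would need to be dropped or filled first.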
Define a utility function and use it with agg:
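def f(x):
    # Join all values in the group into one space-separated string
    return ' '.join(x)

df.groupby(['ID', 'Name']).agg(
    {'Revenue': 'sum', 'Subject': f, 'Product': f}
).reset_index()  # turn ID and Name back into regular columns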
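As a variation on the dict-based approach, `groupby` also accepts `as_index=False`, which keeps ID and Name as ordinary columns so no `reset_index()` call is needed. The sketch below assumes the same sample data as the question; `merged_flat` is just an illustrative name, and `' '.join` is passed directly in place of the helper function.

```python
import pandas as pd

# Sample data copied from the question (dtypes are assumed).
df = pd.DataFrame({
    "ID": [123, 123, 246, 312, 312],
    "Name": ["John", "John", "Mary", "Peter", "Peter"],
    "Revenue": [125, 75, 32, 67, 39],
    "Subject": ["Maths", "English", "History", "Maths", "Science"],
    "Product": ["A", "B", "B", "A", "C"],
})

# as_index=False keeps the grouping keys as regular columns.
merged_flat = df.groupby(["ID", "Name"], as_index=False).agg(
    {"Revenue": "sum", "Subject": " ".join, "Product": " ".join}
)
print(merged_flat)
```

Which form to use is mostly a matter of taste: the dict form is compact when the output columns keep their original names, while named aggregation is handier when they should be renamed.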