
How to apply multiple custom functions on multiple columns in grouped DataFrame in pandas?

I have a pandas DataFrame that I group by p_id. The goal is to get a DataFrame like the one shown under 'Output I'm looking for'. I've tried a few things, but I am struggling to apply two custom aggregation functions:

  • apply(list) for x_id
  • '||'.join for x_name

How can I solve this problem?

Input

| p_id | x_id | x_name |
|------|------|--------|
| 1    | 4    | Text   |
| 2    | 4    | Text   |
| 2    | 5    | Text2  |
| 2    | 6    | Text3  |
| 3    | 4    | Text   |
| 3    | 7    | Text4  |

Output I'm looking for

| p_id | x_ids   | x_names            |
|------|---------|--------------------|
| 1    | [4]     | Text               |
| 2    | [4,5,6] | Text||Text2||Text3 |
| 3    | [4,7]   | Text||Text4        |

You can certainly do:

df.groupby('p_id').agg({'x_id': list, 'x_name': '||'.join})

Or, a little more advanced, with named aggregation, which also renames the output columns:

df.groupby('p_id').agg(x_ids=('x_id', list),
                       x_names=('x_name', '||'.join))
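
For reference, here is a minimal end-to-end sketch of the named-aggregation approach on the sample data from the question. The DataFrame is rebuilt by hand from the input table, and the `result` variable and the trailing `reset_index()` (to turn p_id back into a regular column) are my own additions rather than part of the original answer.

```python
import pandas as pd

# Sample data copied from the "Input" table in the question.
df = pd.DataFrame({
    "p_id": [1, 2, 2, 2, 3, 3],
    "x_id": [4, 4, 5, 6, 4, 7],
    "x_name": ["Text", "Text", "Text2", "Text3", "Text", "Text4"],
})

# Named aggregation: output_column=(input_column, function).
# `list` collects each group's x_id values; "||".join concatenates the names.
result = (
    df.groupby("p_id")
      .agg(x_ids=("x_id", list), x_names=("x_name", "||".join))
      .reset_index()  # assumption: p_id is wanted as a column, not the index
)

print(result)
```

Note that `'||'.join` only works when x_name holds strings; if the column could contain missing values or numbers, it would need to be cleaned or cast with `astype(str)` first.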

