Pandas dataframe, groupBy aggregate multiple columns and rows

Question

I have a pandas DataFrame that looks like this:

  supply_area transaction_date     price
0       54.98       2006-03-31   48500.0
0       54.98       2006-04-30   48500.0
0       54.98       2006-05-31   48500.0
1       67.28       2006-01-31   54500.0
1       67.28       2006-02-28   54500.0
1       67.28       2006-03-31   54500.0

I would like to group by supply_area and get a single column that joins each transaction_date with its price, like this:

  supply_area transaction_date_price
0       54.98       2006-03-31,48500.0,2006-04-30,48500.0,2006-05-31,48500.0
1       67.28       2006-01-31,54500.0,2006-02-28,54500.0,2006-03-31,54500.0

I have tried this and a few other things, but it does not work:

df = df.groupby('supply_area').agg(
                {'supply_area': 'first', 'transaction_date': ','.join, 'price': ','.join})

I'm pretty new to Python and the pandas library, so I'm not sure if what I want is even possible.

Thanks in advance!

Answer 1

You can create a new column (here called "joined", but any name is fine) that holds the per-row concatenation of date and price, and then join those strings within each group:

df['joined'] = (df['transaction_date'] + ',' + df['price'].astype(str))
df.groupby('supply_area', as_index=False)['joined'].apply(','.join)

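Output (the prices print as 48500 here, apparently because this run used integer prices; with float prices as in the question they would appear as 48500.0):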

   supply_area                                              joined
0        54.98  2006-03-31,48500,2006-04-30,48500,2006-05-31,48500
1        67.28  2006-01-31,54500,2006-02-28,54500,2006-03-31,54500
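
For reference, here is a minimal self-contained sketch of the same idea that does not add a helper column to df. It assumes the dates are stored as plain strings and the prices as floats, as the question's sample suggests; the names pairs and result are only illustrative. The original attempt most likely fails because ','.join accepts only strings, so joining the float price column raises a TypeError.

```python
import pandas as pd

# Sample data copied from the question (assumption: dates are plain strings).
df = pd.DataFrame({
    "supply_area": [54.98, 54.98, 54.98, 67.28, 67.28, 67.28],
    "transaction_date": ["2006-03-31", "2006-04-30", "2006-05-31",
                         "2006-01-31", "2006-02-28", "2006-03-31"],
    "price": [48500.0, 48500.0, 48500.0, 54500.0, 54500.0, 54500.0],
})

# Build one "date,price" string per row. str.join only accepts strings,
# so the float prices are converted with astype(str) first.
pairs = df["transaction_date"].astype(str) + "," + df["price"].astype(str)

# Group the per-row strings by supply_area and join them with commas.
result = (
    pairs.groupby(df["supply_area"])
    .agg(",".join)
    .rename("transaction_date_price")
    .reset_index()
)
print(result)
```

If transaction_date is a datetime column rather than strings, it has to be converted to text before concatenating, for example with df["transaction_date"].dt.strftime("%Y-%m-%d").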

solution1
0 ACCEPTED 2021-07-16 09:11:40
