Pandas merge several rows with different columns into one row

Question

I have a DataFrame df with the following structure:

store_id	city_id	sales_A	sales_B	sales_C
STORE01	CITY99	100 Item	None	None
STORE01	CITY99	None	200 Order	None
STORE01	CITY99	None	None	300 Client
STORE01	CITY99	150 Order	None	300 Client
...

All rows have the same characteristics: each store_id and city_id pair has one or more rows:

row 1: sales A has a value, the others are None
row 2: sales B has a value, the others are None
row 3: sales C has a value, the others are None
row 4: sales A has a value (but different from row 1), the others are None

Note that the values are not numbers; they are strings and must be kept as strings.

The order of the rows may differ, but each store has one or more rows, depending on its sales.

In pandas, how can I merge them into one row, so that the resulting dataset looks something like this:

store_id	city_id	sales_A	sales_B	sales_C
STORE01	CITY99	100 Item, 150 Order	200 Order	300 Client

Thanks

Answer 1

Use a custom lambda function that removes None values and duplicates, then joins the remaining values with ", ", and pass it to GroupBy.agg:

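# If the missing values are the string 'None', convert them to real None first
# df = df.mask(df == 'None', None)

# Per group and column: drop missing values, keep unique strings, join with ', '
f = lambda x: ', '.join(x.dropna().unique())
df = df.groupby(['store_id', 'city_id'], as_index=False).agg(f)
print(df)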
  store_id city_id              sales_A    sales_B     sales_C
0  STORE01  CITY99  100 Item, 150 Order  200 Order  300 Client
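
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch of the same approach. The sample frame is rebuilt by hand from the question's table, and the helper names join_unique and merge_rows are made up for illustration (they are not pandas functions). It assumes missing cells are either real None values or the literal string 'None'.

```python
import pandas as pd

# Sample data rebuilt from the question's table; missing cells are real None values.
df = pd.DataFrame(
    {
        "store_id": ["STORE01"] * 4,
        "city_id": ["CITY99"] * 4,
        "sales_A": ["100 Item", None, None, "150 Order"],
        "sales_B": [None, "200 Order", None, None],
        "sales_C": [None, None, "300 Client", "300 Client"],
    }
)


def join_unique(values: pd.Series) -> str:
    """Drop missing entries, keep each distinct string once, join with ', '."""
    return ", ".join(values.dropna().unique())


def merge_rows(frame: pd.DataFrame, keys: list) -> pd.DataFrame:
    """Hypothetical helper: collapse all rows sharing `keys` into one row."""
    # Treat the literal string 'None' as missing (replaced by NaN) before aggregating.
    cleaned = frame.mask(frame == "None")
    return cleaned.groupby(keys, as_index=False).agg(join_unique)


result = merge_rows(df, ["store_id", "city_id"])
print(result)
```

The values stay strings throughout, since the aggregation only joins them. One thing to be aware of: if a group has no value at all in some column, this sketch yields an empty string there rather than None.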

solution1
2 ACCEPTED 2021-08-16 06:55:58
