I have a DataFrame df
with the following structure:
store_id | city_id | sales_A | sales_B | sales_C |
---|---|---|---|---|
STORE01 | CITY99 | 100 Item | None | None |
STORE01 | CITY99 | None | 200 Order | None |
STORE01 | CITY99 | None | None | 300 Client |
STORE01 | CITY99 | 150 Order | None | 300 Client |
... |
All rows follow the same pattern: each combination of store ID
and city ID
has one or more rows.
Note that the values are not numbers; they are strings and must be kept as strings.
The order of the rows may vary, but each store has one or more rows, depending on its sales.
In pandas, how can I merge them into one row, so that the resulting dataset looks like this:
store_id | city_id | sales_A | sales_B | sales_C |
---|---|---|---|---|
STORE01 | CITY99 | 100 Item, 150 Order | 200 Order | 300 Client |
Thanks
Use a custom lambda function that removes None
values and duplicates, then joins the remaining values with ", ",
and pass it to GroupBy.agg:
# If the None values are the string 'None', convert them to real missing values first
# df = df.mask(df == 'None', None)
f = lambda x: ', '.join(x.dropna().unique())
df = df.groupby(['store_id','city_id'], as_index=False).agg(f)
print(df)
  store_id city_id              sales_A    sales_B     sales_C
0  STORE01  CITY99  100 Item, 150 Order  200 Order  300 Client
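
For anyone who wants to try this end to end, here is a minimal, self-contained sketch of the same approach. The sample data is rebuilt from the table in the question, assuming the empty cells are real None values rather than the string 'None'; the helper name join_unique and the variable result are only illustrative and not part of the original answer.

```python
import pandas as pd

# Sample data mirroring the question (assumption: missing cells are real None values).
df = pd.DataFrame({
    "store_id": ["STORE01"] * 4,
    "city_id": ["CITY99"] * 4,
    "sales_A": ["100 Item", None, None, "150 Order"],
    "sales_B": [None, "200 Order", None, None],
    "sales_C": [None, None, "300 Client", "300 Client"],
})


def join_unique(values: pd.Series) -> str:
    """Drop missing values, keep the first occurrence of each value, join with ', '."""
    return ", ".join(values.dropna().unique())


# One output row per (store_id, city_id) pair; every other column is aggregated.
result = df.groupby(["store_id", "city_id"], as_index=False).agg(join_unique)
print(result.to_string(index=False))
```

Because unique() keeps values in order of first appearance, the joined strings follow the row order of the input. If a group has no value at all in some column, the join produces an empty string for that cell.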