
Pandas - Merging rows based on certain columns and combine certain columns

I have a dataframe similar to:

State  Organization  Date        Tag
MD     ABC           01/10/2021  901
MD     ABC           01/10/2021  801
NJ     DEF           02/10/2021  701
NJ     DEF           02/10/2021  601
NJ     DEF           02/10/2021  701

I want to combine all rows where State, Organization, and Date are the same, and collect the Tag values from the original rows into a single list in the merged row. For example:

State  Organization  Date        Tag
MD     ABC           01/10/2021  901, 801
NJ     DEF           02/10/2021  701, 601, 701

I suspect there is an easy way to do this. Right now I'm doing a lot of work with iterrows and some other workarounds to get this result. Suggestions?

Try this:

df.groupby(['State','Organization']).agg({'Date':'first','Tag':lambda x: ','.join(x.astype(str))})

Thanks to rhug123. With a slight modification (adding 'Date' to the grouping keys, so rows are merged only when all three columns match), I get the desired effect:

df.groupby(['State','Organization', 'Date']).agg({'Tag':lambda x: ','.join(x.astype(str))})
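For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the sample data from the question and applies the same groupby. The ", " separator, the `sort=False` argument, the `reset_index()` call, and the variable names `merged` and `as_lists` are my own choices, not part of the answers above. Without `reset_index()`, the three grouping columns end up in the index rather than as regular columns.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "State": ["MD", "MD", "NJ", "NJ", "NJ"],
    "Organization": ["ABC", "ABC", "DEF", "DEF", "DEF"],
    "Date": ["01/10/2021", "01/10/2021", "02/10/2021", "02/10/2021", "02/10/2021"],
    "Tag": [901, 801, 701, 601, 701],
})

keys = ["State", "Organization", "Date"]

# Join the tags of each group into one comma-separated string.
# sort=False keeps groups in order of first appearance;
# reset_index() turns the group keys back into ordinary columns.
merged = (
    df.groupby(keys, sort=False)
      .agg({"Tag": lambda tags: ", ".join(tags.astype(str))})
      .reset_index()
)
print(merged)

# Variant (an assumption about what "a list" might mean):
# keep the tags as real Python lists instead of a joined string.
as_lists = df.groupby(keys, sort=False)["Tag"].agg(list).reset_index()
print(as_lists)
```

Duplicate tags (701 appears twice for NJ) are kept, matching the desired output in the question. If they should be dropped, calling `drop_duplicates()` on the tags inside the lambda before joining would be one option.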
