Combine rows into one using multiple columns in Python

Solved - Issue with the original dataset I was using

I have a large CSV file of prescription data. The first column contains the year issued, the second the name of the chemical substance, the third the practice, and the fourth the number of items.

Year       Chemical substance     Practice   Items     
2019       Bisoprolol             Practice A 10         
2019       Bisoprolol             Practice B 12
2020       Bisoprolol             Practice A 13
2020       Bisoprolol             Practice B 15
2019       Omeprazole             Practice A 12
2019       Omeprazole             Practice B 12
2020       Omeprazole             Practice A 13
2020       Omeprazole             Practice B 15
2019       Tolteridone            Practice A 13
2019       Tolteridone            Practice B 14
2020       Tolteridone            Practice A 12
2020       Tolteridone            Practice B 12

I want to combine the rows across years so that I get the total number of items issued per chemical substance and practice, similar to this output:

Chemical substance    Practice    Items
Bisoprolol            Practice A  23
Bisoprolol            Practice B  27

I have tried groupby:

merged_df = prescribingdata_df.groupby(['Chemical substance', 'Practice']).agg('sum')

but the output looks the same as the original data. Is there a way to combine rows based on two columns, so that the items from every year are summed for each chemical substance and practice?

Try this; you forgot the quotes (''):

merged_df = prescribingdata_df.groupby(['Chemical substance', 'Practice']).agg('sum')
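For reference, here is a minimal sketch of that groupby on a few of the sample rows from the question. The DataFrame is built by hand rather than read from the CSV, and selecting only the Items column (so that Year is not summed as well) is an assumption about what is wanted, not something stated in the post.

```python
import pandas as pd

# Sample rows copied from the question (Bisoprolol and Omeprazole only).
prescribingdata_df = pd.DataFrame({
    "Year": [2019, 2019, 2020, 2020, 2019, 2019, 2020, 2020],
    "Chemical substance": ["Bisoprolol"] * 4 + ["Omeprazole"] * 4,
    "Practice": ["Practice A", "Practice B"] * 4,
    "Items": [10, 12, 13, 15, 12, 12, 13, 15],
})

# Group on both key columns and sum Items across years.
# as_index=False keeps the keys as ordinary columns instead of a MultiIndex.
merged_df = (
    prescribingdata_df
    .groupby(["Chemical substance", "Practice"], as_index=False)["Items"]
    .sum()
)
print(merged_df)
```

On clean data this gives one row per substance and practice, with 23 and 27 for Bisoprolol, matching the output asked for in the question.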

There appeared to be an error in the data: when I ran this again in another notebook, it worked fine.
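The post does not say what the data error was. As an illustration only, two common problems that can make grouped output look unaggregated are stray whitespace in the key columns and Items being read as text. The sketch below uses hypothetical messy rows (raw_df is invented for this example, not taken from the original file) to show one way to check for and clean both before grouping.

```python
import pandas as pd

# Hypothetical messy rows: trailing spaces in the keys and Items stored as text.
raw_df = pd.DataFrame({
    "Year": [2019, 2020, 2019, 2020],
    "Chemical substance": ["Bisoprolol", "Bisoprolol ", "Bisoprolol", "Bisoprolol"],
    "Practice": ["Practice A", "Practice A", "Practice B", "Practice B "],
    "Items": ["10", "13", "12", "15"],
})

keys = ["Chemical substance", "Practice"]

# Before cleaning, every row lands in its own group because the keys differ.
groups_before = raw_df.groupby(keys).ngroups

clean_df = raw_df.copy()
for col in keys:
    # Remove leading and trailing whitespace so equal names compare equal.
    clean_df[col] = clean_df[col].str.strip()
# Convert Items to numbers; values that cannot be parsed become NaN.
clean_df["Items"] = pd.to_numeric(clean_df["Items"], errors="coerce")

totals = clean_df.groupby(keys, as_index=False)["Items"].sum()
print(groups_before)
print(totals)
```

Checking `prescribingdata_df.dtypes` and the unique values of each key column is a quick way to see whether either problem applies to a given file.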
