Solved - Issue with the original dataset I was using
I have a large CSV file of prescription data. The first column contains the year issued, the second the name of the chemical substance, the third the practice, and the fourth the number of items.
Year  Chemical substance  Practice    Items
2019  Bisoprolol          Practice A  10
2019  Bisoprolol          Practice B  12
2020  Bisoprolol          Practice A  13
2020  Bisoprolol          Practice B  15
2019  Omeprazole          Practice A  12
2019  Omeprazole          Practice B  12
2020  Omeprazole          Practice A  13
2020  Omeprazole          Practice B  15
2019  Tolteridone         Practice A  13
2019  Tolteridone         Practice B  14
2020  Tolteridone         Practice A  12
2020  Tolteridone         Practice B  12
I want to combine the rows across years so that I get the total items issued per chemical substance and practice, similar to this output:
Chemical substance  Practice    Items
Bisoprolol          Practice A  23
Bisoprolol          Practice B  27
I have tried groupby:
merged_df = prescribingdata_df.groupby(['Chemical substance', 'Practice']).agg('sum')
but the output I get is the same as the original data. Is there a way to combine rows based on two columns, so that the items from each year are added together for every chemical substance and practice?
Try this, brother, you forgot the quotes (''):
merged_df = prescribingdata_df.groupby(['Chemical substance', 'Practice']).agg('sum')
There appeared to be an error in the data: when I ran this again in another notebook, it worked fine.
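For anyone landing here later, this is a minimal sketch of the grouping on the sample rows from the question. The DataFrame is built by hand here as an assumption; the real data would come from the CSV file. It selects only the Items column before summing, because summing the whole group would also add up the Year values when Year is numeric. The as_index=False argument is used so the grouping keys stay as ordinary columns, matching the desired output.

```python
import pandas as pd

# Sample rows copied from the question (assumption: the real CSV has the same columns).
rows = [
    (2019, "Bisoprolol", "Practice A", 10),
    (2019, "Bisoprolol", "Practice B", 12),
    (2020, "Bisoprolol", "Practice A", 13),
    (2020, "Bisoprolol", "Practice B", 15),
    (2019, "Omeprazole", "Practice A", 12),
    (2019, "Omeprazole", "Practice B", 12),
    (2020, "Omeprazole", "Practice A", 13),
    (2020, "Omeprazole", "Practice B", 15),
    (2019, "Tolteridone", "Practice A", 13),
    (2019, "Tolteridone", "Practice B", 14),
    (2020, "Tolteridone", "Practice A", 12),
    (2020, "Tolteridone", "Practice B", 12),
]
prescribingdata_df = pd.DataFrame(
    rows, columns=["Year", "Chemical substance", "Practice", "Items"]
)

# Group by substance and practice, then sum only the Items column
# so that each pair gets one row with its total across years.
merged_df = (
    prescribingdata_df
    .groupby(["Chemical substance", "Practice"], as_index=False)["Items"]
    .sum()
)

print(merged_df)
```

On this sample the first two rows come out as Bisoprolol / Practice A / 23 and Bisoprolol / Practice B / 27, which matches the output asked for in the question.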