
Python: Sum multiple columns in a Data Frame by each unique row entry in another column

How do I sum multiple columns (e.g. columns C4, C5 and C6) for each unique entry in another column (e.g. column C2)?

For example, I would like to create a new dataframe that collapses column C2 into Import and Export and shows the sums of C4, C5 and C6 (preferably dropping the other columns, C1 and C3).

Sample Table

You can do this with the pandas groupby method:

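import pandas as pd

df = pd.DataFrame([['A', 'Import', 'Argentina', 1, 1, 1],
                   ['B', 'Import', 'Brazil', 2, 2, 2],
                   ['C', 'Export', 'UJ', 3, 3, 3],
                   ['D', 'Export', 'US', 4, 4, 4],
                   ['A', 'Export', 'Canada', 5, 5, 5],
                   ['B', 'Export', 'Russia', 6, 6, 6],
                   ['C', 'Import', 'China', 7, 7, 7],
                   ['D', 'Import', 'India', 8, 8, 8]],
                  columns=['C1', 'C2', 'C3', 'C4', 'C5', 'C6'])

# Group by C2 and sum only the numeric columns. Selecting them explicitly
# keeps C1 and C3 out of the result; in pandas 2.0+ a bare .sum() would
# otherwise also concatenate those string columns.
results = df.groupby("C2")[["C4", "C5", "C6"]].sum()

print(results)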

Which will give you:

        C4  C5  C6
C2                
Export  18  18  18
Import  18  18  18
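
If you would rather have C2 as an ordinary column than as the index, one possible variation is to pass as_index=False to groupby. The following is only a sketch using the same sample data; the names value_cols and flat are illustrative and not part of the original answer.

```python
import pandas as pd

# Same sample data as in the answer above.
df = pd.DataFrame([['A', 'Import', 'Argentina', 1, 1, 1],
                   ['B', 'Import', 'Brazil', 2, 2, 2],
                   ['C', 'Export', 'UJ', 3, 3, 3],
                   ['D', 'Export', 'US', 4, 4, 4],
                   ['A', 'Export', 'Canada', 5, 5, 5],
                   ['B', 'Export', 'Russia', 6, 6, 6],
                   ['C', 'Import', 'China', 7, 7, 7],
                   ['D', 'Import', 'India', 8, 8, 8]],
                  columns=['C1', 'C2', 'C3', 'C4', 'C5', 'C6'])

# Columns to total (illustrative name, chosen for this sketch).
value_cols = ['C4', 'C5', 'C6']

# as_index=False keeps C2 as a regular column instead of the index,
# so the result is a flat dataframe with columns C2, C4, C5, C6.
flat = df.groupby('C2', as_index=False)[value_cols].sum()

print(flat)
```

Calling results.reset_index() on the grouped result from the answer should produce the same flat layout.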
