简体   繁体   中英

Adding Multiple Columns in Single groupby in Pandas

Dataset image

Please help, I have a dataset in which I have columns Country, Gas and Year from 2019 to 1991. Also attaching the snapshot of the dataset. I want to answer a question that I want to add all the values of a country column wise? For example, for Afghanistan, value should come 56.4 under 2019 (adding 28.79 + 6.23 + 16.37 + 5.01 = 56.4). Now I want it should calculate the result for every year. I have used below code for achieving 2019 data.

df.groupby(by='Country')['2019'].sum() 

This is the output of that code:

Country
---------------------
Afghanistan     56.40
Albania         17.31
Algeria        558.67
Andorra          1.18
Angola         256.10
                ...  
Venezuela      588.72
Vietnam        868.40
Yemen           50.05
Zambia         182.08
Zimbabwe       235.06

I have group the data country wise and adding the 2019 column values, but how should I add values of other years in single line of code?

Please help.

I can do the code shown here, to add rows and show multiple columns like this but this will be tedious task to do so write each column name.

df.groupby(by='Country')[['2019','2018','2017']].sum() 

If you don't specify the column, it will sum all the numeric column.

df.groupby(by='Country').sum() 

                 2019   2020   ...
Country
Afghanistan     56.40   32.4   ...
Albania         17.31   12.5   ...
Algeria        558.67  241.5   ...
Andorra          1.18    1.5   ...
Angola         256.10   32.1   ...
                ...      ...   ...
Venezuela      588.72  247.3   ...
Vietnam        868.40  323.5   ...
Yemen           50.05   55.7   ...
Zambia         182.08   23.4   ...
Zimbabwe       235.06  199.4   ...

Do a reset_index() to flatten the columns

df.groupby(by='Country').sum().reset_index()

Country          2020   2019   ...
Afghanistan     56.40   32.4   ...
Albania         17.31   12.5   ...
Algeria        558.67  241.5   ...
Andorra          1.18    1.5   ...
Angola         256.10   32.1   ...
                ...      ...   ...
Venezuela      588.72  247.3   ...
Vietnam        868.40  323.5   ...
Yemen           50.05   55.7   ...
Zambia         182.08   23.4   ...
Zimbabwe       235.06  199.4   ...

You can select columns keys in your dataframe starting from column 2019 till the last column key in this way:

df.groupby(by='Country')[df.keys()[2:]].sum() 

Method df.keys will return all dataframe columns keys in a list then you can slice it from the index of 2019 key which is 2 till end of columns keys.

Suppose you want to select columns from 2016 till 1992 column:

df.groupby(by='Country')[df.keys()[5:-1]].sum() 

you just need to slice the list of columns keys in correct index order.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM