Use Sum() and Groupby on multiple columns
The code below works and returns the results I need, but I believe there is an easier way.
dfg = df.groupby('County')[['Total N, 1985 (Kg.)', 'Total N, 2007 (Kg.)', 'Total N, 2009 (Kg.)',
                            'Total N, 2010 (Kg.)', 'Total N, 2011 (Kg.)', 'Total N, 2012 (Kg.)',
                            'Total N, 2013 (Kg.)', 'Total N, 2014 (Kg.)', 'Total N, 2015 (Kg.)',
                            'Total N, 2016 (Kg.)', 'Total N, 2017 (Kg.)',
                            'Total N target, 2025 (Kg.)']].agg('sum')
The columns from 'Total N, 2011 (Kg.)' to 'Total N target, 2025 (Kg.)' could perhaps be sliced using loc, but I have been stuck on this for hours.
(The column numbers in the data set run from 6 to 12.)
This can be done by restricting the columns with loc (or by other means) after grouping and aggregating. The sample data is taken from the official plotly website.
import plotly.express as px
import pandas as pd
df = px.data.gapminder()
df['year'] = df['year'].apply(lambda x:str(x)+'y')
df = df.pivot(index=['continent','country'], columns='year', values='gdpPercap')
df.columns
Index(['1952y', '1957y', '1962y', '1967y', '1972y', '1977y', '1982y', '1987y',
'1992y', '1997y', '2002y', '2007y'],
dtype='object', name='year')
dfg = df.groupby('continent').sum().loc[:,slice('1997y','2007y')]
dfg
year 1997y 2002y 2007y
continent
Africa 123695.496865 135168.028262 160629.695446
Americas 222232.521564 232191.927683 275075.790634
Asia 324525.078743 335744.983087 411609.886714
Europe 572303.454048 651351.972673 751634.449078
Oceania 48048.350340 53877.556080 59620.376550
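The same pattern should carry over to a frame shaped like the one in the question. The sketch below assumes a small made-up data set that reuses a few of the question's column names; the values and the two counties are hypothetical. It relies on the fact that a label slice with loc includes both endpoints.

```python
import pandas as pd

# Hypothetical miniature of the question's data; all values are made up.
df = pd.DataFrame({
    'County': ['A', 'A', 'B'],
    'Total N, 2010 (Kg.)': [1.0, 2.0, 3.0],
    'Total N, 2011 (Kg.)': [10.0, 20.0, 30.0],
    'Total N, 2012 (Kg.)': [100.0, 200.0, 300.0],
    'Total N target, 2025 (Kg.)': [5.0, 5.0, 5.0],
})

# Sum every column per county, then keep a label-based range of columns.
# A .loc label slice includes both the start and the end label.
dfg = df.groupby('County').sum().loc[
    :, 'Total N, 2011 (Kg.)':'Total N target, 2025 (Kg.)'
]
print(dfg)
```

If the real data also contains non-numeric columns, passing numeric_only=True to sum() is one way to leave them out of the aggregation.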
df.iloc[:, 6:13].groupby(df['County']).sum()
I believe this is what you are looking for.
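A minimal sketch of the position-based approach, assuming a hypothetical frame with made-up column names and values. One thing to watch: once iloc has selected only the value columns, 'County' is no longer part of the result, so the grouping key is passed as a Series taken from the original frame.

```python
import pandas as pd

# Hypothetical frame: 'County' at position 0, value columns at positions 1-3.
df = pd.DataFrame({
    'County': ['A', 'A', 'B'],
    'n_2015': [1, 2, 3],
    'n_2016': [10, 20, 30],
    'n_2017': [100, 200, 300],
})

# Select columns by position (the end of an iloc slice is exclusive), then
# group by the County Series, which pandas aligns on the row index.
by_pos = df.iloc[:, 2:4].groupby(df['County']).sum()
print(by_pos)
```

Positional slicing is shorter to write, but it breaks silently if the column order changes, so the label-based loc slice may be the safer choice when the column names are stable.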