Python (pandas) - sum multiple columns based on one column

I have a dataframe of Covid-19 deaths by country. Countries are identified in the Country column. Sub-national classification is based on the Province column.

I want to generate a dataframe that sums all columns, grouped by the value in the Country column (except the first 2, which are geographical data). In short, for each date, I want to collapse the observations for all provinces of a country so that I get a single number per country.

Right now, I am able to do that for a single date:

import pandas as pd

url = 'https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_deaths_global.csv'
raw = pd.read_csv(url)
del raw['Lat']
del raw['Long']
raw.rename({'Country/Region': 'Country', 'Province/State': 'Province'}, axis=1, inplace=True)

raw2 = raw.groupby('Country')['6/29/20'].sum()

How can I achieve this for all dates?

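You can select the date columns with iloc and group by the Country column. Because Lat and Long were deleted above, the date columns start at position 2 (after Province and Country); slicing from 4 would silently drop the first two dates:

raw2 = raw.iloc[:, 2:].groupby(raw.Country).sum()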
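
As a possible alternative to positional selection, the sketch below drops the non-numeric Province column by name and then groups on Country, so the result does not depend on how many leading columns the frame has. The small frame here is made-up toy data (countries A, B, C and provinces A1, A2, and so on are hypothetical) that only mimics the shape of the question's `raw` after Lat and Long are removed.

```python
import pandas as pd

# Toy stand-in for `raw` after Lat/Long are dropped (values are invented)
raw = pd.DataFrame({
    'Province': ['A1', 'A2', None, 'C1', 'C2'],
    'Country': ['A', 'A', 'B', 'C', 'C'],
    '6/28/20': [10, 1, 5, 2, 3],
    '6/29/20': [11, 1, 6, 2, 4],
})

# Remove the text column by name, then sum every remaining column per country
by_country = raw.drop(columns='Province').groupby('Country').sum()
print(by_country)
```

The result is indexed by Country with one column per date; calling `by_country.reset_index()` turns Country back into an ordinary column if that is more convenient.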
