Pandas: groupby index and then fill the dataframe using function
I have a dataframe which looks like this:
import pandas as pd

df = pd.DataFrame({'pop1': ['1000', '2000', '3000', '4000'],
                   'pop2': ['2000', '3000', '2000', '2000']},
                  index=['female', 'female', 'male', 'male'])
How can I create a 2x2 DataFrame that gives the percentage of the total population in a given year (the two columns) for a given gender (the two rows)?
You need to first cast the strings to int with astype, then use groupby with an aggregating sum, and divide the result by the column totals (sum) using div. Finally, multiply by 100:
df = df.astype(int)
a = df.groupby(level=0).sum()
print(a)
        pop1  pop2
female  3000  5000
male    7000  4000

b = df.sum()
print(b)
pop1    10000
pop2     9000
dtype: int64

print(a.div(b).mul(100))
        pop1       pop2
female  30.0  55.555556
male    70.0  44.444444
This is the same as:
df = df.astype(int)
print(df.groupby(level=0).sum().div(df.sum()).mul(100))
        pop1       pop2
female  30.0  55.555556
male    70.0  44.444444
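The division works because div matches the labels of the Series returned by sum() against the DataFrame's column labels, so each column is divided by its own total. A minimal sketch of that alignment, starting from the already aggregated values; the names by_label and explicit are illustrative only:

```python
import pandas as pd

# Aggregated totals per gender, as produced by groupby(level=0).sum() above
a = pd.DataFrame({'pop1': [3000, 7000], 'pop2': [5000, 4000]},
                 index=['female', 'male'])

# Column totals: a Series indexed by the column names 'pop1' and 'pop2'
b = a.sum()

# div aligns the Series index with the DataFrame columns (axis='columns' is the default)
by_label = a.div(b, axis='columns') * 100

# The same computation written out column by column, for comparison
explicit = pd.DataFrame({col: a[col] / b[col] * 100 for col in a.columns})

print(by_label)
print(explicit)
```

Both frames should hold the same percentages, which shows that no manual looping over columns is needed.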
Here is a one-liner:
(df.astype(int) / df.astype(int).sum()).groupby(level=0).sum() * 100
It is a little prettier if you are already dealing with integers:
df = df.astype(int)
(df / df.sum()).groupby(level=0).sum() * 100
Put into words: after converting the data to integers, you divide each number by the total population of its column, sum those weights for each gender, and multiply by 100 so the result reads as a percentage.
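The steps described above can be sketched end to end as follows. This is a self-contained illustration that rebuilds the question's data; the names weights, pct, and alt are illustrative and do not appear in the original answers.

```python
import pandas as pd

df = pd.DataFrame({'pop1': ['1000', '2000', '3000', '4000'],
                   'pop2': ['2000', '3000', '2000', '2000']},
                  index=['female', 'female', 'male', 'male'])

# Step 1: convert the string values to integers
df = df.astype(int)

# Step 2: divide each value by its column total to get per-row weights
weights = df / df.sum()

# Step 3: sum the weights for each gender (index level 0), then scale to percent
pct = weights.groupby(level=0).sum() * 100

# Alternative order from the first answer: aggregate first, then divide
alt = df.groupby(level=0).sum().div(df.sum()).mul(100)

print(pct)
print(alt)
```

Dividing before or after the groupby gives the same result up to floating-point rounding, since a sum of shares equals the share of the sum.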