
Pandas: groupby index and then fill the dataframe using function

I have a DataFrame that looks like this:

import pandas as pd

df = pd.DataFrame({'pop1': ['1000', '2000', '3000', '4000'],
                   'pop2': ['2000', '3000', '2000', '2000']},
                  index=['female', 'female', 'male', 'male'])

How can I create a 2x2 DataFrame that gives the percentage of the total population in a given year (the two columns) for a given gender (the two rows)?

First cast the strings to integers with astype, then group by the index with groupby and aggregate with sum, and divide the result by the column totals (df.sum()) using div. Finally, multiply by 100 with mul:

df = df.astype(int)

a = df.groupby(level=0).sum()
print (a)
        pop1  pop2
female  3000  5000
male    7000  4000

b = df.sum()
print (b)
pop1    10000
pop2     9000
dtype: int64

print (a.div(b).mul(100))
        pop1       pop2
female  30.0  55.555556
male    70.0  44.444444

This is the same as:

df = df.astype(int)
print (df.groupby(level=0).sum().div(df.sum()).mul(100))
        pop1       pop2
female  30.0  55.555556
male    70.0  44.444444
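
The same steps can be wrapped in a small helper so they can be reused on other frames. This is only a sketch: the name percent_of_total is hypothetical (it is not part of pandas), and it assumes every column holds values that astype(int) can convert.

```python
import pandas as pd


def percent_of_total(frame: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: each index label's share of every column total, in percent."""
    numeric = frame.astype(int)                    # strings such as '1000' become integers
    per_group = numeric.groupby(level=0).sum()     # one row per distinct index label
    return per_group.div(numeric.sum()).mul(100)   # divide by column totals, scale to percent


df = pd.DataFrame({'pop1': ['1000', '2000', '3000', '4000'],
                   'pop2': ['2000', '3000', '2000', '2000']},
                  index=['female', 'female', 'male', 'male'])

result = percent_of_total(df)
print(result)
```

A quick sanity check on the result is that every column should add up to 100, for example with result.sum().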

Here is a one-liner:

(df.astype(int) / df.astype(int).sum()).groupby(level=0).sum() * 100

It is a little prettier if you are already dealing with integers:

df = df.astype(int)
(df / df.sum()).groupby(level=0).sum() * 100

In words: after converting the data to integers, divide each number by the total of its column (the total population for that year), sum those weights for each gender, and multiply by 100 so the result reads as a percentage.
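
To make the intermediate values visible, the description above can be split into one statement per step. This is a minimal sketch that assumes the data is already stored as integers; the names weights, by_gender and percent are illustrative only.

```python
import pandas as pd

# Same data as in the question, but stored as integers from the start.
df = pd.DataFrame({'pop1': [1000, 2000, 3000, 4000],
                   'pop2': [2000, 3000, 2000, 2000]},
                  index=['female', 'female', 'male', 'male'])

weights = df / df.sum()                      # each row's share of its column total
by_gender = weights.groupby(level=0).sum()   # add the shares up for each gender
percent = by_gender * 100                    # express the shares as percentages

print(weights)
print(percent)
```

Dividing before grouping gives the same percentages as grouping first and then dividing, because every row in a column is divided by the same column total.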

v = df.values.astype(int)
pd.DataFrame(
    v / v.sum(0) * 100, df.index, df.columns
).groupby(level=0).sum()
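
In the NumPy variant above, v.sum(0) sums down each column (axis 0), and broadcasting then divides every row by those column totals. The sketch below is one way to check that it agrees with the groupby-then-divide approach; the names col_totals, via_numpy and via_pandas are illustrative, and np.allclose is used because the two routes perform the floating-point operations in a different order.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'pop1': ['1000', '2000', '3000', '4000'],
                   'pop2': ['2000', '3000', '2000', '2000']},
                  index=['female', 'female', 'male', 'male'])

v = df.values.astype(int)    # 4x2 integer array built from the string values
col_totals = v.sum(0)        # sum along axis 0, i.e. one total per column

# NumPy route: percentages per row first, then sum the rows of each gender.
via_numpy = pd.DataFrame(v / col_totals * 100,
                         index=df.index, columns=df.columns).groupby(level=0).sum()

# pandas route: sum per gender first, then divide by the column totals.
numeric = df.astype(int)
via_pandas = numeric.groupby(level=0).sum().div(numeric.sum()).mul(100)

print(np.allclose(via_numpy.to_numpy(), via_pandas.to_numpy()))
```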

