How to find the median for a dataframe of population with columns of age and count?
df looks like this:
   age  population
0   20           2
1   21           3
2   22           2
3   23           5
4   24           7
import pandas as pd
df = pd.DataFrame({'age': [20, 21, 22, 23, 24], 'population': [2, 3, 2, 5, 7]})
and I'd like to calculate the median age of the total population. Is there a simple way to do this?
I got the average like this, but I need the median:
df['years'] = df['age'] * df['population']
average_age = df['years'].sum() / df['population'].sum()
Multiplying two pandas Series is different from multiplying lists: you're not copying each value N times, you're performing element-wise multiplication.
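To make the difference concrete, here is a small sketch. The names `ages`, `counts`, and `product` are made up for illustration and use only the first three rows of the question's data.

```python
import pandas as pd

# Illustrative data: the first three rows of the question's dataframe.
ages = pd.Series([20, 21, 22])
counts = pd.Series([2, 3, 2])

# Series * Series is element-wise: one product per row (20*2, 21*3, 22*2).
product = ages * counts
print(product.tolist())  # [40, 63, 44]

# list * int copies the list's contents instead of multiplying values.
print([20] * 2)  # [20, 20]
```

The element-wise product is what the average needs (a sum of age times count), but the median needs the actual repeated values.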
Use pd.Series.repeat to repeat each element N times, and then use the .median method to calculate the median of the resulting pandas Series:
import pandas as pd
df = pd.DataFrame({'age': [20, 21, 22, 23, 24], 'population': [2, 3, 2, 5, 7]})
# Expand each age by its population count, then take the ordinary median.
m = df['age'].repeat(df['population']).median()
print(m) # output: 23.0
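If the counts are very large, repeating every row may use a lot of memory. One possible alternative, not part of the answer above, is to locate the middle position(s) in the cumulative counts. The sketch below assumes the counts are non-negative integers with a positive total; `weighted_median` is a hypothetical helper name, not a pandas function.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'age': [20, 21, 22, 23, 24], 'population': [2, 3, 2, 5, 7]})

def weighted_median(values, counts):
    """Median of `values` where each value occurs `counts` times.

    Hypothetical helper; assumes non-negative integer counts, total > 0.
    """
    values = np.asarray(values)
    counts = np.asarray(counts)
    order = np.argsort(values)            # sort rows by value
    vals = values[order]
    cum = np.cumsum(counts[order])        # running total of people
    total = cum[-1]
    # 0-based positions of the middle element(s) in the expanded data.
    lo_pos, hi_pos = (total - 1) // 2, total // 2
    # First row whose cumulative count exceeds each position.
    lo = vals[np.searchsorted(cum, lo_pos, side='right')]
    hi = vals[np.searchsorted(cum, hi_pos, side='right')]
    return (lo + hi) / 2

print(weighted_median(df['age'], df['population']))  # 23.0
```

For this data it agrees with the repeat-based result. For small dataframes the repeat approach is simpler and easier to read.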