How to average a column based on a group by using Python pandas?
I have input like this:
NAME            Geoid  Year  QTR  Index
'Abilene, TX    10180  1978  3    0
'Abilene, TX    10180  1978  4    0
'Abilene, TX    10180  1979  1    0
'Abilene, TX    10180  1979  2    0
'Decatur, IL    19500  1998  1    110.51
'Decatur, IL    19500  1998  2    110.48
'Decatur, IL    19500  1998  3    113.01
'Decatur, IL    19500  1998  4    114.16
'Fairbanks, AK  21820  1990  1    63.74
'Fairbanks, AK  21820  1990  2    70.68
'Fairbanks, AK  21820  1990  3    83.56
'Fairbanks, AK  21820  1990  4    83.95
The MySQL query that I want to convert to Python is this:
SELECT geoid, name, YEAR, AVG(index)
FROM table_1
WHERE geoid>0
GROUP BY geoid, metro_name, YEAR;
From what I read online (see "pandas get column average/mean"), the pandas equivalent of AVG is mean, but when I use mean it gives me a single value.
But I want the output grouped by name, geoid and year, with the index averaged over the quarters, like this:
Name            Geoid  YEAR  AVG(index)
'Abilene, TX    10180  1978  0
'Abilene, TX    10180  1979  0
'Decatur, IL    19500  1998  111.75
'Fairbanks, AK  21820  1990  74.9875
How can I achieve this?
Use query or boolean indexing first for filtering, and then groupby with the aggregate mean:
# Filter with query, then average Index within each (NAME, Geoid, Year) group
df1 = df.query('Geoid > 0').groupby(['NAME', 'Geoid', 'Year'], as_index=False)['Index'].mean()
print(df1)
             NAME  Geoid  Year     Index
0    'Abilene, TX  10180  1978    0.0000
1    'Abilene, TX  10180  1979    0.0000
2    'Decatur, IL  19500  1998  112.0400
3  'Fairbanks, AK  21820  1990   75.4825
# Same result, filtering with boolean indexing instead of query
df1 = df[df['Geoid'] > 0].groupby(['NAME', 'Geoid', 'Year'], as_index=False)['Index'].mean()
print(df1)
             NAME  Geoid  Year     Index
0    'Abilene, TX  10180  1978    0.0000
1    'Abilene, TX  10180  1979    0.0000
2    'Decatur, IL  19500  1998  112.0400
3  'Fairbanks, AK  21820  1990   75.4825
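For anyone who wants to try this without the original file, here is a minimal self-contained sketch. It assumes the sample rows from the question are typed in by hand (the `rows` list and the `overall` variable are names made up for this illustration, not part of the original post). It also shows why calling mean on the column alone returns a single value: without groupby there is only one group, the whole column.

```python
import pandas as pd

# Sample rows copied by hand from the question (assumption: the real data
# would normally be loaded from a file or a database instead).
rows = [
    ("'Abilene, TX", 10180, 1978, 3, 0.0),
    ("'Abilene, TX", 10180, 1978, 4, 0.0),
    ("'Abilene, TX", 10180, 1979, 1, 0.0),
    ("'Abilene, TX", 10180, 1979, 2, 0.0),
    ("'Decatur, IL", 19500, 1998, 1, 110.51),
    ("'Decatur, IL", 19500, 1998, 2, 110.48),
    ("'Decatur, IL", 19500, 1998, 3, 113.01),
    ("'Decatur, IL", 19500, 1998, 4, 114.16),
    ("'Fairbanks, AK", 21820, 1990, 1, 63.74),
    ("'Fairbanks, AK", 21820, 1990, 2, 70.68),
    ("'Fairbanks, AK", 21820, 1990, 3, 83.56),
    ("'Fairbanks, AK", 21820, 1990, 4, 83.95),
]
df = pd.DataFrame(rows, columns=["NAME", "Geoid", "Year", "QTR", "Index"])

# mean on the bare column averages every row together: one scalar.
overall = df["Index"].mean()

# WHERE geoid > 0  -> boolean indexing
# GROUP BY ...     -> groupby on the key columns
# AVG(index)       -> mean of the selected column within each group
df1 = (
    df[df["Geoid"] > 0]
    .groupby(["NAME", "Geoid", "Year"], as_index=False)["Index"]
    .mean()
)
print(df1)
```

Passing as_index=False keeps NAME, Geoid and Year as ordinary columns, which is closer to what the SQL query returns than the default MultiIndex result.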