How to average a column by group using Python pandas?

I have input like this:

NAME            Geoid    Year   QTR Index 
'Abilene, TX    10180   1978    3   0
'Abilene, TX    10180   1978    4   0
'Abilene, TX    10180   1979    1   0
'Abilene, TX    10180   1979    2   0
'Decatur, IL    19500   1998    1   110.51
'Decatur, IL    19500   1998    2   110.48
'Decatur, IL    19500   1998    3   113.01
'Decatur, IL    19500   1998    4   114.16
'Fairbanks, AK  21820   1990    1   63.74
'Fairbanks, AK  21820   1990    2   70.68
'Fairbanks, AK  21820   1990    3   83.56
'Fairbanks, AK  21820   1990    4   83.95

The MySQL query that I want to convert to Python is this:

   SELECT  geoid, name, YEAR, AVG(index)
   FROM table_1
   WHERE geoid>0
   GROUP BY geoid, metro_name, YEAR;

From what I read online, the pandas equivalent of AVG is mean, but when I use mean on the column it gives me a single value for the whole column.

pandas get column average/mean

But I want the output grouped by year, with the quarters averaged, like this:

Name            Geoid   YEAR    AVG(index)
'Abilene, TX    10180   1978    0
'Abilene, TX    10180   1979    0
'Decatur, IL    19500   1998    112.04
'Fairbanks, AK  21820   1990    75.4825

How can I achieve this?

Filter first with query or boolean indexing, then use groupby and aggregate with mean:

# Option 1: filter with query, then average Index per (NAME, Geoid, Year) group
df1 = df.query('Geoid > 0').groupby(['NAME','Geoid','Year'], as_index=False)['Index'].mean()
print(df1)
             NAME  Geoid  Year     Index
0    'Abilene, TX  10180  1978    0.0000
1    'Abilene, TX  10180  1979    0.0000
2    'Decatur, IL  19500  1998  112.0400
3  'Fairbanks, AK  21820  1990   75.4825

# Option 2: the same result, filtering with boolean indexing instead of query
df1 = df[df['Geoid'] > 0].groupby(['NAME','Geoid','Year'], as_index=False)['Index'].mean()
print(df1)
             NAME  Geoid  Year     Index
0    'Abilene, TX  10180  1978    0.0000
1    'Abilene, TX  10180  1979    0.0000
2    'Decatur, IL  19500  1998  112.0400
3  'Fairbanks, AK  21820  1990   75.4825
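
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It assumes the question's sample rows are typed in by hand as a list of tuples (the `rows` and `overall` names are only illustrative), and it contrasts the plain column mean, which collapses to one number, with the grouped mean that corresponds to the SQL query.

```python
import pandas as pd

# Sample rows copied from the question; the leading apostrophe in NAME is kept as-is.
rows = [
    ("'Abilene, TX", 10180, 1978, 3, 0.0),
    ("'Abilene, TX", 10180, 1978, 4, 0.0),
    ("'Abilene, TX", 10180, 1979, 1, 0.0),
    ("'Abilene, TX", 10180, 1979, 2, 0.0),
    ("'Decatur, IL", 19500, 1998, 1, 110.51),
    ("'Decatur, IL", 19500, 1998, 2, 110.48),
    ("'Decatur, IL", 19500, 1998, 3, 113.01),
    ("'Decatur, IL", 19500, 1998, 4, 114.16),
    ("'Fairbanks, AK", 21820, 1990, 1, 63.74),
    ("'Fairbanks, AK", 21820, 1990, 2, 70.68),
    ("'Fairbanks, AK", 21820, 1990, 3, 83.56),
    ("'Fairbanks, AK", 21820, 1990, 4, 83.95),
]
df = pd.DataFrame(rows, columns=["NAME", "Geoid", "Year", "QTR", "Index"])

# Calling mean on the whole column returns a single scalar,
# which is the behaviour described in the question.
overall = df["Index"].mean()

# WHERE geoid > 0, then GROUP BY name, geoid, year with AVG(index).
# as_index=False keeps the group keys as ordinary columns, like a SQL result set.
df1 = (
    df[df["Geoid"] > 0]
    .groupby(["NAME", "Geoid", "Year"], as_index=False)["Index"]
    .mean()
)
print(df1)
```

The filter is applied before the groupby so that it plays the role of WHERE; filtering after aggregation would correspond to HAVING instead.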
