
pandas groupby DataFrame and get mean and most common value per group

I have a DataFrame with 2 columns.

df=pd.DataFrame({'values':arrays,'ii':lin_index})

I want to group the values by lin_index and get the mean and the most common value per group. I tried this:

bii=df.groupby('ii').median()
bii2=df.groupby('ii').agg(lambda x:x.value_counts().index[0])
bii3=df.groupby('ii')['values'].agg(pd.Series.mode)

I wonder whether bii2 and bii3 return the same values. Then I want to write the mean and the most common value back into the original array:

bs=np.zeros((np.unique(array).shape[0],1))
bs[bii.index.values]=bii.values

Does this look good?

df looks like this:

          values        ii
0            1.0  10446786
1            1.0  11316289
2            1.0  16416704
3            1.0  12151686
4            1.0  30312736
     ...       ...
93071038     3.0  28539525
93071039     3.0  19667948
93071040     3.0  22240849
93071041     3.0  22212513
93071042     3.0  41641943

[93071043 rows x 2 columns]

Something like this, maybe:

# get the mean
df.groupby(['ii']).mean()
# get the most frequent
df.groupby(['ii']).agg(pd.Series.mode)
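
If both statistics are wanted in one table, the two calls above can be combined with named aggregation. This is only a minimal sketch on made-up toy data; the helper `most_common` is a hypothetical name, and taking the first mode is an assumption about how ties should be resolved (`Series.mode()` returns tied values in sorted order, so this picks the smallest).

```python
import pandas as pd

# Toy stand-in for the real data (hypothetical values).
df = pd.DataFrame({
    "values": [1.0, 1.0, 3.0, 2.0, 2.0, 5.0],
    "ii":     [10,  10,  10,  20,  20,  20],
})

def most_common(s):
    # Series.mode() returns every tied mode, sorted; keep only the first one.
    return s.mode().iloc[0]

# One row per group, one column per statistic.
summary = df.groupby("ii")["values"].agg(mean="mean", most_common=most_common)
print(summary)
```

Note that `most_common` would raise on a group that contains only NaN, because `mode()` drops NaN by default and would then return an empty Series.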

Your question seems similar to "GroupBy pandas DataFrame and select most common value".
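
As for whether bii2 and bii3 return the same values: they agree whenever a group has a single most frequent value, but they can differ when several values are tied. `value_counts().index[0]` always yields one value, while `pd.Series.mode` yields every tied value. The sketch below, on made-up toy data, lists all modes per group so that tied groups can be spotted; the variable names are illustrative only.

```python
import pandas as pd

# Hypothetical toy data: group 20 has two equally frequent values (2.0 and 5.0).
df = pd.DataFrame({
    "values": [1.0, 1.0, 3.0, 2.0, 5.0],
    "ii":     [10,  10,  10,  20,  20],
})
g = df.groupby("ii")["values"]

# The bii2 approach: exactly one value per group. Which tied value wins
# depends on how value_counts orders equal counts.
by_counts = g.agg(lambda x: x.value_counts().index[0])

# Every mode per group, as a list, to make ties visible.
all_modes = g.apply(lambda x: x.mode().tolist())

# Groups with more than one mode are where the two approaches can disagree.
tied = all_modes[all_modes.map(len) > 1]
print(by_counts)
print(tied)
```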

This link might also be useful: https://pandas.pydata.org/pandas-docs/stable/reference/frame.html#computations-descriptive-stats
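
For writing the per-group result back into an array, here is a rough sketch. It assumes that `ii` is a flat (linear) index into an output array of known length; `n_cells` is a hypothetical name for that length and the data is made up. The output is sized by the full index range rather than by the number of unique indices, since an array with only `np.unique(...).shape[0]` rows would be indexed out of bounds whenever the indices are not contiguous from 0.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; `ii` is assumed to be a flat index into n_cells cells.
n_cells = 6
df = pd.DataFrame({
    "values": [1.0, 1.0, 3.0, 2.0, 2.0, 5.0],
    "ii":     [4,   4,   4,   1,   1,   1],
})

group_mean = df.groupby("ii")["values"].mean()

# Cells that received no data stay NaN instead of 0, so they cannot be
# mistaken for a real mean of zero.
out = np.full(n_cells, np.nan)
out[group_mean.index.to_numpy()] = group_mean.to_numpy()
print(out)
```

The same pattern applies to the most common value: compute one value per group, then assign it at the group's index.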

