How do I use groupby().transform() with value_counts() in pandas?

Question

I am working with a pandas DataFrame, df1, that holds item prices.

  Item    Price  Minimum Most_Common_Price
0 Coffee  1      1       2
1 Coffee  2      1       2
2 Coffee  2      1       2
3 Tea     3      3       4
4 Tea     4      3       4
5 Tea     4      3       4

I create Minimum using:

df1["Minimum"] = df1.groupby(["Item"])['Price'].transform(min)

How do I create Most_Common_Price?

df1["Most_Common_Price"] = df1.groupby(["Item"])['Price'].transform(value_counts()) # Doesn't work

At the moment I use a multi-step approach:

for item in df1.Item.unique().tolist():      # loop over each distinct item
    subset = df1[df1.Item == item]           # keep only the rows for this item
    subset.Price.value_counts().idxmax()     # most frequent price for this item

This is overkill. There must be a simpler way, ideally in one line.

Answer 1

You can combine groupby + transform with value_counts and idxmax.

df['Most_Common_Price'] = (
    df.groupby('Item')['Price'].transform(lambda x: x.value_counts().idxmax()))

df

     Item  Price  Minimum  Most_Common_Price
0  Coffee      1        1                  2
1  Coffee      2        1                  2
2  Coffee      2        1                  2
3     Tea      3        3                  4
4     Tea      4        3                  4
5     Tea      4        3                  4
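
To see why this works, it can help to look at a single group on its own. The sketch below rebuilds the example frame (the constructor call is my assumption, since the question only shows the printed table) and inspects what value_counts() and idxmax() return for Coffee before applying the same lambda to every group.

```python
import pandas as pd

# Assumed reconstruction of the frame shown in the question.
df = pd.DataFrame({
    "Item": ["Coffee", "Coffee", "Coffee", "Tea", "Tea", "Tea"],
    "Price": [1, 2, 2, 3, 4, 4],
})

# One group in isolation: the Coffee prices.
coffee = df.loc[df["Item"] == "Coffee", "Price"]

# value_counts() puts the distinct prices in the index
# and the number of occurrences in the values.
counts = coffee.value_counts()

# idxmax() returns the index label (the price) with the highest count.
most_common = counts.idxmax()
print(most_common)  # 2

# transform() runs the lambda per group and broadcasts the
# scalar result back to every row of that group.
df["Most_Common_Price"] = df.groupby("Item")["Price"].transform(
    lambda x: x.value_counts().idxmax()
)
print(df)
```

Note that value_counts() returns the count and idxmax() returns the price, which is why the loop in the question needs idxmax() rather than max().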

An improvement is to use pd.Series.map:

# Thanks, Vaishali!
df['Most_Common_Price'] = (
    df['Item'].map(df.groupby('Item')['Price']
                     .agg(lambda x: x.value_counts().idxmax())))
df

     Item  Price  Minimum  Most_Common_Price
0  Coffee      1        1                  2
1  Coffee      2        1                  2
2  Coffee      2        1                  2
3     Tea      3        3                  4
4     Tea      4        3                  4
5     Tea      4        3                  4
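
The map variant can be read as two separate steps: aggregate once per group, then look the result up for each row. Here is a minimal sketch of that split. The name per_item is hypothetical and only there for illustration, and the frame is my reconstruction of the one in the question.

```python
import pandas as pd

# Assumed reconstruction of the frame shown in the question.
df = pd.DataFrame({
    "Item": ["Coffee", "Coffee", "Coffee", "Tea", "Tea", "Tea"],
    "Price": [1, 2, 2, 3, 4, 4],
})

# Step 1: one value per item, indexed by item name.
per_item = df.groupby("Item")["Price"].agg(
    lambda x: x.value_counts().idxmax()
)
print(per_item)

# Step 2: use that Series as a lookup table for every row.
df["Most_Common_Price"] = df["Item"].map(per_item)
print(df)
```

Keeping per_item around is also handy if you need the per-item summary on its own, not just the broadcast column.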

Answer 2

A nice way is to use pd.Series.mode, if you want the most common element (i.e. the mode).

In [32]: df
Out[32]:
     Item  Price  Minimum
0  Coffee      1        1
1  Coffee      2        1
2  Coffee      2        1
3     Tea      3        3
4     Tea      4        3
5     Tea      4        3

In [33]: df['Most_Common_Price'] = df.groupby(["Item"])['Price'].transform(pd.Series.mode)

In [34]: df
Out[34]:
     Item  Price  Minimum  Most_Common_Price
0  Coffee      1        1                  2
1  Coffee      2        1                  2
2  Coffee      2        1                  2
3     Tea      3        3                  4
4     Tea      4        3                  4
5     Tea      4        3                  4

As @Wen pointed out, pd.Series.mode can return a pd.Series with several values when there is a tie, so just grab the first one:

In [67]: df
Out[67]:
     Item  Price  Minimum
0  Coffee      1        1
1  Coffee      2        1
2  Coffee      2        1
3     Tea      3        3
4     Tea      4        3
5     Tea      4        3
6     Tea      3        3

In [68]: df[df.Item =='Tea'].Price.mode()
Out[68]:
0    3
1    4
dtype: int64

In [69]: df['Most_Common_Price'] = df.groupby(["Item"])['Price'].transform(lambda S: S.mode()[0])

In [70]: df
Out[70]:
     Item  Price  Minimum  Most_Common_Price
0  Coffee      1        1                  2
1  Coffee      2        1                  2
2  Coffee      2        1                  2
3     Tea      3        3                  3
4     Tea      4        3                  3
5     Tea      4        3                  3
6     Tea      3        3                  3
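
For anyone who wants to reproduce the tie case outside IPython, here is a self-contained sketch. The frame construction is my assumption based on the printed output above, and I use .iloc[0] instead of [0] only to make the positional lookup explicit. Because mode() returns its values sorted, taking the first one picks the smallest of the tied prices.

```python
import pandas as pd

# Assumed reconstruction of the seven-row frame, where Tea has a tie (3 and 4).
df = pd.DataFrame({
    "Item": ["Coffee", "Coffee", "Coffee", "Tea", "Tea", "Tea", "Tea"],
    "Price": [1, 2, 2, 3, 4, 4, 3],
})

# mode() returns every tied value, sorted ascending.
tea_modes = df.loc[df["Item"] == "Tea", "Price"].mode()
print(tea_modes.tolist())  # [3, 4]

# Take the first mode per group and broadcast it to the group's rows.
df["Most_Common_Price"] = df.groupby("Item")["Price"].transform(
    lambda s: s.mode().iloc[0]
)
print(df)
```

If you would rather keep the largest tied price, .iloc[-1] should do that under the same sorted-output assumption.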

Answer 3


import numpy as np
import pandas as pd

data_stack_try = [['Coffee',1],['Coffee',2],['Coffee',2],['Tea',3],['Tea',4],['Tea',4],['Milk', np.nan]]
df_stack_try = pd.DataFrame(data_stack_try, columns=["Item","Price"])
print("---Before Min---")
print(df_stack_try)
df_stack_try["Minimum"] = df_stack_try.groupby(["Item"])['Price'].transform(min)
print("---After Min----")
print(df_stack_try)
# Handles groups whose prices are all null (the Milk item is np.nan):
# mode() then returns an empty Series, so looking up [0] raises.
def mode_group(grp):
    try:
        return grp.mode()[0]
    except (KeyError, IndexError):
        print("Exception!!!")
        return np.nan
df_stack_try["Most_Common_Price"] = df_stack_try.groupby('Item')['Price'].transform(lambda x: mode_group(x))
print("---After Mode----")
print(df_stack_try)


---Before Min---
     Item  Price
0  Coffee    1.0
1  Coffee    2.0
2  Coffee    2.0
3     Tea    3.0
4     Tea    4.0
5     Tea    4.0
6    Milk    NaN
---After Min----
     Item  Price  Minimum
0  Coffee    1.0      1.0
1  Coffee    2.0      1.0
2  Coffee    2.0      1.0
3     Tea    3.0      3.0
4     Tea    4.0      3.0
5     Tea    4.0      3.0
6    Milk    NaN      NaN
Exception!!!
---After Mode----
     Item  Price  Minimum  Most_Common_Price
0  Coffee    1.0      1.0                2.0
1  Coffee    2.0      1.0                2.0
2  Coffee    2.0      1.0                2.0
3     Tea    3.0      3.0                4.0
4     Tea    4.0      3.0                4.0
5     Tea    4.0      3.0                4.0
6    Milk    NaN      NaN                NaN
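
As an alternative to catching the exception, you could check whether mode() came back empty. This is only a sketch of the same idea: first_mode is a hypothetical helper name, and the data is a reconstruction of the frame above.

```python
import numpy as np
import pandas as pd

def first_mode(prices):
    """Return the first mode of a group, or NaN if the group has no non-null values."""
    modes = prices.mode()  # NaN values are dropped by default (dropna=True)
    return modes.iloc[0] if not modes.empty else np.nan

# Same data as above, including the Milk row with a missing price.
df = pd.DataFrame({
    "Item": ["Coffee", "Coffee", "Coffee", "Tea", "Tea", "Tea", "Milk"],
    "Price": [1, 2, 2, 3, 4, 4, np.nan],
})

df["Most_Common_Price"] = df.groupby("Item")["Price"].transform(first_mode)
print(df)
```

This avoids the broad except clause and makes the empty-group case explicit, at the cost of one extra line.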

3 answers

Answer 1: 5, accepted, 2017-12-20 04:36:31

Answer 2: 4, 2017-12-20 04:39:10

Answer 3: 0, 2023-01-08 00:13:14