Pandas: adding summary information to new columns in a groupby frame

Question

Working on a class assignment.

Our current dataset has information that looks like:

    Item ID      Item Name                                  Price
0   108          Extraction, Quickblade Of Trembling Hands  3.53
1   143          Frenzied Scimitar                          1.56
2   92           Final Critic                               4.88
3   100          Blindscythe                                3.27
4   131          Fury                                       1.44

We were asked to group by two values, which I've done.

item_df = popcolumns_df.groupby(["Item ID","Item Name"])

I'm having issues, though, trying to append the results of the groupby functions to this dataframe. For instance, when I run count, the counts replace the price. Attempt one just replaced all the data in the Price column with the counts.

item_counts = item_df.count().reset_index()

Output:

    Item ID     Item Name           Price
0   0           Splinter             4
1   1           Crucifer             3
2   2           Verdict              6
3   3           Phantomlight         6
4   4           Bloodlord's Fetish   5

Attempt 2 did the same:

item_counts = item_df.size().reset_index(name="Counts")

My desired output is:

     Item ID    Item Name                Price    Count   Revenue
0    108        Extraction, Quickblade   3.53     12      42.36
1    143        Frenzied Scimitar        1.56     3        4.68
2    92         Final Critic             4.88     2        9.76
3    100        Blindscythe              3.27     1        3.27
4    131        Fury                     1.44     5        7.20

I would likely just use a sum on the groups to get the revenue. I've been stumped on this for a couple of hours, so any help would be greatly appreciated!

Answer 1

If the price is the same for every row of a given item, then you can include "Price" in your grouping and compute the group sizes:

summary = popcolumns_df \
    .groupby(["Item ID", "Item Name", "Price"]) \
    .size() \
    .rename("Count") \
    .reset_index()

summary['Revenue'] = summary['Count'] * summary['Price']

The call to pd.Series.rename names the size Series "Count", so that is the column name it gets in the final dataframe after reset_index.
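For anyone who wants to try this without the assignment data, here is a minimal runnable sketch. The purchases frame and its rows are made up for illustration, and the second half is not part of the answer above: it is an alternative using named aggregation (available in pandas 0.25 and later, as far as I know) that groups on the two original keys and sums Price directly, which should also cope with an item whose price varies between rows.

```python
import pandas as pd

# Made-up purchase rows for illustration; each row is one purchase.
purchases = pd.DataFrame({
    "Item ID": [108, 108, 143, 92, 92, 92],
    "Item Name": ["Extraction", "Extraction", "Frenzied Scimitar",
                  "Final Critic", "Final Critic", "Final Critic"],
    "Price": [3.53, 3.53, 1.56, 4.88, 4.88, 4.88],
})

# Approach from the answer: group on the price too, then count rows per group.
summary = (
    purchases
    .groupby(["Item ID", "Item Name", "Price"])
    .size()              # Series of row counts, indexed by the three keys
    .rename("Count")     # name the Series so reset_index yields a "Count" column
    .reset_index()
)
summary["Revenue"] = summary["Count"] * summary["Price"]

# Alternative (an assumption, not from the answer): named aggregation on the
# two original keys, taking the first price and summing prices for revenue.
by_item = (
    purchases
    .groupby(["Item ID", "Item Name"])
    .agg(
        Price=("Price", "first"),
        Count=("Price", "size"),
        Revenue=("Price", "sum"),
    )
    .reset_index()
)

print(summary)
print(by_item)
```

Both frames end up with the columns Item ID, Item Name, Price, Count and Revenue, one row per item, sorted by the grouping keys.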

Answer 2

I think you're looking for the transform method of the groupby. It computes a per-group aggregate and returns it aligned to the rows of your original data, so every row carries the value for its own group.

For example, to create a new column in your original data holding the count for some grouping:

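# Select a single column before transform so the result is a Series, not a DataFrame.
# 'size' counts every row in the group; use 'count' on a data column to skip its NaNs.
df['group_level_count'] = df.groupby(['foo', 'bar'])['foo'].transform('size')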
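A small runnable sketch of the difference between 'size' and 'count' under transform. The frame and the value column are made up for illustration; a single column is selected before calling transform so that each result is a Series that can be assigned to one new column.

```python
import numpy as np
import pandas as pd

# Made-up data: group ('a', 'x') has three rows, one of them with a missing value.
df = pd.DataFrame({
    "foo": ["a", "a", "a", "b"],
    "bar": ["x", "x", "x", "y"],
    "value": [1.0, np.nan, 3.0, 4.0],
})

grouped = df.groupby(["foo", "bar"])["value"]

# transform broadcasts the per-group result back onto the original rows.
df["group_size"] = grouped.transform("size")    # rows per group, NaN included
df["group_count"] = grouped.transform("count")  # non-null values per group

print(df)
```

Here group_size is 3, 3, 3, 1 and group_count is 2, 2, 2, 1, because the NaN in the first group is skipped by 'count' but not by 'size'.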

Related:
* How to count number of rows per group (and other statistics) in pandas group by?
* https://pandas.pydata.org/pandas-docs/stable/groupby.html#transformation

2 answers

Answer 1: accepted, 2018-12-17 03:29:28

Answer 2: 2018-12-30 19:52:29
