Is there a way to multiply the values in two columns while grouping by the values in a third column using pandas?

Question

I'm trying to avoid using a loop while calculating the weighted average of the marks in each of these courses.

I just can't wrap my head around what to do. I assume I can use groupby and perform the appropriate calculations?

This is the dataframe:

data = 

mark  weight  course_id
78      10          1
87      40          1
15      50          1
78      90          3
40      10          3

This is the desired result:

result=

course_id  course_average
1            50.1
3            74.2

Answer 1

This is one way to go about it:

(df.assign(course_average=df.mark * df.weight)
   .groupby("course_id")
   .pipe(lambda x: x.course_average.sum().div(x.weight.sum()))
   .reset_index(name="course_average"))


    course_id   course_average
0      1         50.1
1      3         74.2
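For anyone who wants to run this end to end, here is a minimal, self-contained sketch of the same idea: sum mark * weight and weight per course, then divide. The frame is rebuilt from the question's table, and the helper name weighted_course_average and the intermediate column weighted_mark are my own hypothetical names, not part of the original answer.

```python
import pandas as pd

# Rebuild the frame from the question's table.
data = pd.DataFrame({
    "mark": [78, 87, 15, 78, 40],
    "weight": [10, 40, 50, 90, 10],
    "course_id": [1, 1, 1, 3, 3],
})


def weighted_course_average(df):
    """Return one row per course with its weighted mean mark.

    Hypothetical helper: the name and the 'weighted_mark' column are
    illustrative only.
    """
    # Row-wise product, computed once for the whole frame (no loop).
    weighted = df.assign(weighted_mark=df["mark"] * df["weight"])
    # Per-course sums of the products and of the weights.
    totals = weighted.groupby("course_id")[["weighted_mark", "weight"]].sum()
    # Weighted mean = sum(mark * weight) / sum(weight).
    averages = totals["weighted_mark"] / totals["weight"]
    return averages.reset_index(name="course_average")


result = weighted_course_average(data)
print(result)
```

Dividing by the per-course weight total rather than by a fixed 100 means the result stays a proper weighted mean even if a course's weights do not sum to 100.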

Answer 2

If the weights don't always add up to 100 for each group, you can calculate each row's share of its group's total weight and multiply that by mark.

(data.assign(wa = data['mark'] * data['weight'] / 
             data.groupby('course_id')['weight'].transform('sum'))
     .groupby('course_id')['wa'].sum())
Out[1]: 
course_id
1    50.1
3    74.2
Name: wa, dtype: float64

If the weights do add up to 100 for each group, then the calculation is easier:

data.assign(wa = data['mark'] * data['weight'] / 100).groupby('course_id')['wa'].sum()

Out[2]: 
course_id
1    50.1
3    74.2
Name: wa, dtype: float64
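To see what the transform('sum') step in the first snippet contributes, it may help to look at the intermediate values. The sketch below rebuilds the question's frame and splits the calculation into named steps. The variable names group_weight and share are my own, chosen for illustration.

```python
import pandas as pd

# Rebuild the frame from the question's table.
data = pd.DataFrame({
    "mark": [78, 87, 15, 78, 40],
    "weight": [10, 40, 50, 90, 10],
    "course_id": [1, 1, 1, 3, 3],
})

# transform('sum') returns a Series aligned with the original rows:
# every row receives the total weight of its own course.
group_weight = data.groupby("course_id")["weight"].transform("sum")

# Each row's share of its course's total weight (sums to 1 per course).
share = data["weight"] / group_weight

# Weighted average per course = sum of mark * share within each course.
wa = (data["mark"] * share).groupby(data["course_id"]).sum()

print(group_weight.tolist())
print(wa)
```

Because transform keeps the original row index, the division lines up row by row without any merge. The second snippet's division by 100 is the special case where group_weight is 100 for every row, as it is for this data.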

Answer 3

You can do this with a simple one-liner, using groupby and a lambda to compute the weighted average:

df.groupby(['course_id']).apply(lambda x: sum(x['mark']*x['weight'])/sum(x['weight']))

course_id
1    50.1
3    74.2
dtype: float64
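A variation on the same idea, not from the original answer: numpy.average accepts a weights argument, so the lambda can delegate the arithmetic to it. The sketch below rebuilds the question's frame and selects only the mark and weight columns before apply, so the lambda never sees the grouping column.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question's table.
data = pd.DataFrame({
    "mark": [78, 87, 15, 78, 40],
    "weight": [10, 40, 50, 90, 10],
    "course_id": [1, 1, 1, 3, 3],
})

# For each course, np.average computes sum(mark * weight) / sum(weight).
course_average = (
    data.groupby("course_id")[["mark", "weight"]]
        .apply(lambda g: np.average(g["mark"], weights=g["weight"]))
)

print(course_average)
```

Like the one-liner above, this calls a Python function once per group, so on frames with very many groups the assign-then-sum approaches from the other answers are likely to be faster.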

3 answers

Answer 1
1 2020-12-02 01:03:11

Answer 2
1 Accepted 2020-12-02 01:04:02

Answer 3
1 2020-12-02 01:17:46
