
Group by and aggregate in pandas

    code_presentation   code_module score   id_student  id_assessment   date_submitted
0   2013J               AAA         78.0    11391        1752           18
1   2013J               AAA         70.0    11391        1800           22
2   2013J               AAA         72.0    31604        1752           17
3   2013J               AAA         69.0    31604        1800           26
.....

I need to count the submission dates, and I am not sure how to group correctly to get the following result:

id_student  id_assessment date_submitted
11391       1752          1
            1800          1
31604       1752          1
            1800          1

... ETC

I tried:

analasys_grouped = analasys.groupby ( 'id_student', as_index = False)\
.agg({'id_assessment':'count', 'date_submitted': 'count'})
analasys_grouped 

But it does not work correctly.

If I understand you correctly, you want to apply value_counts() to id_assessment, grouped by id_student. Try:

assessment_count_per_student = df.groupby('id_student')['id_assessment'].value_counts()

print(assessment_count_per_student)

id_student  id_assessment
11391       1752             1
            1800             1
31604       1752             1
            1800             1
Name: id_assessment, dtype: int64
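For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the four sample rows from the question and turns the result into a flat table. The frame name `df` and the column name `n_submissions` are assumptions made for illustration, not names from the original post.

```python
import pandas as pd

# Sample rows copied from the question; `df` is an assumed variable name.
df = pd.DataFrame({
    "code_presentation": ["2013J"] * 4,
    "code_module": ["AAA"] * 4,
    "score": [78.0, 70.0, 72.0, 69.0],
    "id_student": [11391, 11391, 31604, 31604],
    "id_assessment": [1752, 1800, 1752, 1800],
    "date_submitted": [18, 22, 17, 26],
})

# Count how often each assessment appears for each student.
assessment_count_per_student = (
    df.groupby("id_student")["id_assessment"].value_counts()
)

# Flatten the MultiIndex into ordinary columns. Passing `name=` gives the
# count column its own label (`n_submissions` is a hypothetical name), which
# avoids a clash with the `id_assessment` index level on older pandas.
flat = assessment_count_per_student.reset_index(name="n_submissions")
print(flat)
```

Passing `name=` to `reset_index` is a design choice here: depending on the pandas version, the counted Series is named either `id_assessment` or `count`, so naming the column explicitly keeps the result the same in both cases.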

You need to pass id_assessment to the groupby call as well.

df.groupby(['id_student', 'id_assessment'])['date_submitted'].count()


id_student  id_assessment
11391       1752             1
            1800             1
31604       1752             1
            1800             1

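In your attempt, you grouped only by id_student and then counted the assessments and the submission dates, so you got one row per student rather than one row per student and assessment.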
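To make the difference concrete, here is a small sketch that runs the original attempt next to a version grouped by both keys, using the sample rows from the question. The names `attempt` and `fixed` are assumptions for illustration; `analasys` is kept as spelled in the question.

```python
import pandas as pd

# Relevant columns from the question's sample rows.
analasys = pd.DataFrame({
    "id_student": [11391, 11391, 31604, 31604],
    "id_assessment": [1752, 1800, 1752, 1800],
    "date_submitted": [18, 22, 17, 26],
})

# The attempt from the question: grouping by id_student alone collapses
# both assessments of a student into a single row.
attempt = analasys.groupby("id_student", as_index=False).agg(
    {"id_assessment": "count", "date_submitted": "count"}
)

# Grouping by both keys keeps one row per (student, assessment) pair.
fixed = analasys.groupby(["id_student", "id_assessment"], as_index=False).agg(
    {"date_submitted": "count"}
)

print(attempt)
print(fixed)
```

On this sample, `attempt` has two rows with a count of 2 in each column, while `fixed` has four rows with a count of 1 in `date_submitted`, which matches the layout asked for in the question.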

