pandas - Count occurrences of values in a DataFrame column for each unique value in another column

Question

Supposing that I have a DataFrame along the lines of:

    term      score
0   this          0
1   that          1
2   the other     3
3   something     2
4   anything      1
5   the other     2
6   that          2
7   this          0
8   something     1

How would I go about counting the occurrences of each value in the score column for each unique value in the term column, producing a result like:

    term      score 0     score 1     score 2     score 3
0   this            2           0           0           0
1   that            0           1           1           0
2   the other       0           0           1           1
3   something       0           1           1           0
4   anything        0           1           0           0

Related questions I've read here include "Python Pandas counting and summing specific conditions" and "COUNTIF in pandas python over multiple columns with multiple conditions", but neither seems to be quite what I'm looking to do. pivot_table, as mentioned at this question, seems like it could be relevant, but I'm impeded by lack of experience and the brevity of the pandas documentation. Thanks for any suggestions.

Answer 1

Use groupby with size, reshape with unstack, and finally add_prefix:

df = df.groupby(['term','score']).size().unstack(fill_value=0).add_prefix('score ')

Or use crosstab:

df = pd.crosstab(df['term'],df['score']).add_prefix('score ')

Or pivot_table:

df = (df.pivot_table(index='term',columns='score', aggfunc='size', fill_value=0)
        .add_prefix('score '))

print(df)
score      score 0  score 1  score 2  score 3
term                                         
anything         0        1        0        0
something        0        1        1        0
that             0        1        1        0
the other        0        0        1        1
this             2        0        0        0
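
To try the three variants side by side, here is a minimal sketch that rebuilds the question's sample frame and keeps each result under its own name instead of overwriting df. The names by_groupby, by_crosstab, by_pivot and flat are illustrative only. The last step (reindex to first-appearance order, then reset_index) is an addition that is not part of the answer; it is one way to get the row order and flat term column shown in the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "term": ["this", "that", "the other", "something", "anything",
             "the other", "that", "this", "something"],
    "score": [0, 1, 3, 2, 1, 2, 2, 0, 1],
})

# Variant 1: count (term, score) pairs, then pivot score values into columns.
by_groupby = (df.groupby(["term", "score"]).size()
                .unstack(fill_value=0)
                .add_prefix("score "))

# Variant 2: crosstab builds the same frequency table directly.
by_crosstab = pd.crosstab(df["term"], df["score"]).add_prefix("score ")

# Variant 3: pivot_table with aggfunc="size".
by_pivot = (df.pivot_table(index="term", columns="score",
                           aggfunc="size", fill_value=0)
              .add_prefix("score "))

# Assumption, not in the answer: restore the order in which terms first
# appear and turn the index back into an ordinary column.
flat = (by_crosstab
        .reindex(df["term"].unique())
        .rename_axis(index="term", columns=None)
        .reset_index())
print(flat)
```

All three variants sort the terms alphabetically, which is why the answer's output differs in row order from the table in the question.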

Answer 2

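You can also use get_dummies, set_index, and a sum over the index level. Older pandas versions spelled this .sum(level=0); that argument was removed in pandas 2.0, so use groupby(level=0).sum() instead (sort=False keeps the terms in order of first appearance, as sum(level=0) did):

(pd.get_dummies(df.set_index('term'), columns=['score'], prefix_sep=' ')
   .groupby(level=0, sort=False).sum()
   .reset_index())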

Output:

        term  score 0  score 1  score 2  score 3
0       this        2        0        0        0
1       that        0        1        1        0
2  the other        0        0        1        1
3  something        0        1        1        0
4   anything        0        1        0        0
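
A self-contained sketch of this approach on the question's data, written with groupby(level=0, sort=False).sum() so that it also runs on pandas 2.x, where sum no longer accepts a level argument. The variable names dummies and out are illustrative, not from the answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "term": ["this", "that", "the other", "something", "anything",
             "the other", "that", "this", "something"],
    "score": [0, 1, 3, 2, 1, 2, 2, 0, 1],
})

# One indicator column per distinct score, named "score 0" ... "score 3".
# Moving term into the index keeps it out of the encoding step.
dummies = pd.get_dummies(df.set_index("term"), columns=["score"], prefix_sep=" ")

# Add up the indicators per term; sort=False keeps first-appearance order.
out = dummies.groupby(level=0, sort=False).sum().reset_index()
print(out)
```

Because each indicator is 1 (or True) exactly where a row has that score, summing per term gives the occurrence counts.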

Answer 1: 6, accepted, 2018-09-20 14:07:28

Answer 2: 6, 2018-09-20 14:14:43
