
Pandas: group and count the data, including counts of 0

When I group and count my data, the result only includes the combinations that actually occur. That reads fine on screen, but I can't use it as is. My data:

name    group
aa  a
aa  b
bb  a
cc  b
dd  a
dd  b

My code:

import pandas as pd
df = pd.read_csv(fp6)  # fp6 holds the path to the CSV file
a = df.groupby(['name', 'group'])['group'].size()

Result:

name  group
aa    a        1
      b        1
bb    a        1
cc    b        1
dd    a        1
      b        1

If I extract the values to draw a chart, I get an error about missing parameters. I want to show every 'group' value for each 'name', like this:

name  group
aa    a        1
      b        1
bb    a        1
      b        0
cc    a        0
      b        1
dd    a        1
      b        1

Can someone show me how?

Use pd.crosstab:

>>> pd.crosstab(df['name'], df['group']).stack('group')
name  group
aa    a        1
      b        1
bb    a        1
      b        0
cc    a        0
      b        1
dd    a        1
      b        1
dtype: int64
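
For readers who want to try this without the original CSV, here is a minimal self-contained sketch. It assumes the six rows shown in the question; the variable names `table` and `counts` are illustrative and not part of the answer.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "name": ["aa", "aa", "bb", "cc", "dd", "dd"],
    "group": ["a", "b", "a", "b", "a", "b"],
})

# crosstab builds a name x group table of counts;
# pairs that never occur are filled with 0
table = pd.crosstab(df["name"], df["group"])

# stack turns the wide table back into a Series indexed by (name, group)
counts = table.stack()
print(counts)
```

The intermediate wide `table` already holds one column per group, so it may be the more convenient shape to pass to a plotting call than the stacked Series.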

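You can use this:

# size() counts rows per (name, group) pair; count() would return nothing here
# because the sample data has no columns besides the two grouping keys
df = df.groupby(['name','group']).size().unstack(fill_value=0).stack()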
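
The unstack/stack round trip can be sketched step by step on the question's data. This version uses `size()`, since the sample frame has no columns other than the grouping keys and there is nothing left for `count()` to count. The names `sizes`, `wide`, and `counts` are illustrative.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "name": ["aa", "aa", "bb", "cc", "dd", "dd"],
    "group": ["a", "b", "a", "b", "a", "b"],
})

# Number of rows for each (name, group) pair that occurs in the data
sizes = df.groupby(["name", "group"]).size()

# Pivot 'group' into columns; pairs that did not occur are filled with 0
wide = sizes.unstack(fill_value=0)

# Pivot back to a Series indexed by (name, group), zeros included
counts = wide.stack()
print(counts)
```

Note that this only fills in groups that appear at least once somewhere in the data; a group that is entirely absent gets no column in `wide`.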

A really explicit way to guarantee all combinations is to use Categorical data (the conversion can be made with pandas.Categorical).

NB. To illustrate the point, I added a group 'c' to the categories that does not appear in the data.

df['name'] = pd.Categorical(df['name'], categories=['aa', 'bb', 'cc', 'dd'])
df['group'] = pd.Categorical(df['group'], categories=['a', 'b', 'c'])

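# observed=False keeps category combinations that never occur in the data;
# newer pandas versions no longer default to it, so pass it explicitly
df.groupby(['name','group'], observed=False)['group'].size()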

Output:

name  group
aa    a        1
      b        1
      c        0
bb    a        1
      b        0
      c        0
cc    a        0
      b        1
      c        0
dd    a        1
      b        1
      c        0
Name: group, dtype: int64
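
The effect of the categories depends on the `observed` argument of `groupby`. The sketch below, which assumes the question's six rows, contrasts the two settings; `counts` and `observed_only` are illustrative names.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "name": ["aa", "aa", "bb", "cc", "dd", "dd"],
    "group": ["a", "b", "a", "b", "a", "b"],
})

# Declare the full set of categories, including 'c', which never occurs
df["name"] = pd.Categorical(df["name"], categories=["aa", "bb", "cc", "dd"])
df["group"] = pd.Categorical(df["group"], categories=["a", "b", "c"])

# observed=False: one row per combination of declared categories (4 x 3)
counts = df.groupby(["name", "group"], observed=False).size()

# observed=True: only the combinations present in the data
observed_only = df.groupby(["name", "group"], observed=True).size()

print(counts)
print(observed_only)
```

Passing `observed=False` explicitly seems the safer choice here, because the default has been changing across pandas releases.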
