Pandas: group and count the data, including combinations with a count of 0
When I group my data and count, the result only contains the combinations that actually occur. That is easy to read, but I can't use it as it is.
My data:
name group
aa a
aa b
bb a
cc b
dd a
dd b
My code:
import pandas as pd

df = pd.read_csv(fp6)  # fp6 is the CSV path (defined elsewhere)
a = df.groupby(['name', 'group'])['group'].size()
Result:
name  group
aa    a        1
      b        1
bb    a        1
cc    b        1
dd    a        1
      b        1
If I pull out these values to draw a chart, I get an error saying that parameters are missing. I want to show every 'group' value for each 'name', like this:
name  group
aa    a        1
      b        1
bb    a        1
      b        0
cc    a        0
      b        1
dd    a        1
      b        1
Can someone show me how?
Use pd.crosstab:
>>> pd.crosstab(df['name'], df['group']).stack('group')
name  group
aa    a        1
      b        1
bb    a        1
      b        0
cc    a        0
      b        1
dd    a        1
      b        1
dtype: int64
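To try the crosstab approach end to end, here is a minimal sketch. It assumes the sample data from the question is built inline as a DataFrame rather than read from the CSV; the variable names `table` and `counts` are only illustrative.

```python
import pandas as pd

# Sample data from the question, built inline instead of read from the CSV.
df = pd.DataFrame({
    "name": ["aa", "aa", "bb", "cc", "dd", "dd"],
    "group": ["a", "b", "a", "b", "a", "b"],
})

# crosstab builds a name-by-group table of counts; pairs that never
# occur in the data get a count of 0.
table = pd.crosstab(df["name"], df["group"])

# stack moves the 'group' columns back into an inner index level,
# giving one row per (name, group) pair.
counts = table.stack()
print(counts)
```

The wide `table` is often the more convenient shape for charting, since each group is already a column; the stacked `counts` matches the layout asked for in the question.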
You can use this:
# size() is used here because count() returns no columns when every column is a grouping key
df = df.groupby(['name', 'group']).size().unstack(fill_value=0).stack()
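The groupby, unstack, stack chain can be checked step by step with the sketch below. It rebuilds the question's data inline (an assumption, since the original reads a CSV) and uses size() rather than count(), because the sample frame has no column left to count once both columns are grouping keys. The names `sizes` and `full` are illustrative.

```python
import pandas as pd

# Sample data from the question, built inline instead of read from the CSV.
df = pd.DataFrame({
    "name": ["aa", "aa", "bb", "cc", "dd", "dd"],
    "group": ["a", "b", "a", "b", "a", "b"],
})

# size() counts the rows of each (name, group) pair present in the data.
sizes = df.groupby(["name", "group"]).size()

# unstack pivots 'group' into columns and fills the missing pairs with 0;
# stack then pivots the columns back into the index.
full = sizes.unstack(fill_value=0).stack()
print(full)
```

Note that this only fills in groups that appear at least once somewhere in the data; a group that never occurs at all will not get a column.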
A really explicit way to make sure every combination is present is to use Categorical data (the conversion can be done with pandas.Categorical).
NB: I added a group 'c' to the categories that does not appear in the data, to illustrate this point.
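df['name'] = pd.Categorical(df['name'], categories=['aa', 'bb', 'cc', 'dd'])
df['group'] = pd.Categorical(df['group'], categories=['a', 'b', 'c'])
# observed=False keeps unobserved category combinations; newer pandas versions may not do so by default
df.groupby(['name', 'group'], observed=False)['group'].size()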
Output:
name  group
aa    a        1
      b        1
      c        0
bb    a        1
      b        0
      c        0
cc    a        0
      b        1
      c        0
dd    a        1
      b        1
      c        0
Name: group, dtype: int64
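A self-contained version of the Categorical approach might look like the sketch below. The inline DataFrame stands in for the CSV (an assumption), and `observed=False` is passed explicitly so the result does not depend on the pandas version's default. For brevity it calls size() on the groupby directly instead of selecting the 'group' column first.

```python
import pandas as pd

# Sample data from the question, built inline instead of read from the CSV.
df = pd.DataFrame({
    "name": ["aa", "aa", "bb", "cc", "dd", "dd"],
    "group": ["a", "b", "a", "b", "a", "b"],
})

# Declare the full set of categories, including 'c', which never occurs.
df["name"] = pd.Categorical(df["name"], categories=["aa", "bb", "cc", "dd"])
df["group"] = pd.Categorical(df["group"], categories=["a", "b", "c"])

# observed=False makes groupby emit every category combination,
# with a size of 0 for pairs that are absent from the data.
counts = df.groupby(["name", "group"], observed=False).size()
print(counts)
```

Unlike the unstack or crosstab routes, this one also reports categories that are missing from the data entirely, because the expected values are declared up front.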