pandas value_counts: include all values that existed before the groupby

Question

Suppose I have the following DataFrame:

import pandas as pd

df = pd.DataFrame([['a', 1, -1], ['a', 1, -1], ['b', 0, -1], ['c', -1, -1]], columns=['col1', 'col2', 'col3'])
df
    col1    col2    col3
0   a       1       -1
1   a       1       -1
2   b       0       -1
3   c       -1      -1

Now I want to group df by col1 and, for each group separately, count how often each value of col1 occurs.

groupby_df = df.groupby('col1') 
for a,b in groupby_df:
    print("{0} -> \n{1}".format(a, b['col1'].value_counts().sort_index()))

I get:

a -> 
a    2
Name: col1, dtype: int64
b -> 
b    1
Name: col1, dtype: int64
c -> 
c    1
Name: col1, dtype: int64

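But I want the counts for each group to still include every value of the column, with zeros for values that do not occur in that group, like this: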

a -> 
a    2
b    0
c    0
Name: col1, dtype: int64
b -> 
a    0
b    1
c    0
Name: col1, dtype: int64
c -> 
a    0
b    0
c    1
Name: col1, dtype: int64

Any help would be greatly appreciated!

Answer 1

Try using .reindex():

import pandas as pd

df = pd.DataFrame([['a',1, -1], ['a', 1, -1], ['b', 0, -1], ['c', -1, -1]] ,columns = ['col1', 'col2', 'col3'])

# Create index using unique values of col1.

uniques = pd.Index(df['col1'].unique())

# Group.

groupby_df = df.groupby('col1')

# Use reindex to align the value counts with the full index, filling missing values with 0.

for a, b in groupby_df:
    print(b['col1'].value_counts().sort_index().reindex(uniques, fill_value = 0))

This gives:

a    2
b    0
c    0
Name: col1, dtype: int64
a    0
b    1
c    0
Name: col1, dtype: int64
a    0
b    0
c    1
Name: col1, dtype: int64
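
As an alternative to reindexing, here is a minimal sketch that relies on a categorical dtype. It assumes that converting col1 to a category is acceptable for your data. On a categorical Series, value_counts() reports every category, including those with a count of 0, so no explicit reindex is needed. The names cat_col and counts_by_group are only illustrative and are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame(
    [['a', 1, -1], ['a', 1, -1], ['b', 0, -1], ['c', -1, -1]],
    columns=['col1', 'col2', 'col3'],
)

# Assumption: col1 may be treated as categorical. The categories are
# inferred from the whole column, so each group keeps all of them.
cat_col = df['col1'].astype('category')

counts_by_group = {}
# Group the categorical values by the original (object dtype) column.
for name, group in cat_col.groupby(df['col1']):
    # value_counts() on a categorical Series includes zero-count categories;
    # sort_index() puts them back in category order.
    counts_by_group[name] = group.value_counts().sort_index()
    print("{0} -> \n{1}".format(name, counts_by_group[name]))
```

Each entry of counts_by_group is a Series indexed by all three categories. If you only need the full table of counts rather than one Series per group, pd.crosstab may be worth a look as well.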

Solution 1
1 Accepted 2018-09-26 22:36:03
