Reshape a data frame and sum values

Question

I have a data frame with three columns, A, B and C. I grouped it by columns A, B and C and counted the number of rows in each group.

Resulting data:

Now I want the values 0 and 1 (the cell values in column C) to become columns themselves. I also want to add them up and display the sum in a separate column, alongside the 0 and 1 columns. Desired output:

A       B       Count0     Count1   Sum of Counts   Count1/Sum of Counts
1000    1000    38         538         567              538/567
1000    1001    9          90          99               90/99
1000    1002    8          16          24               16/24
1000    1003    2          10          12               10/12

(I am not an active Python user. I have searched a lot for this but can't seem to find the right words to search with.) If I learn how to compute the sum of counts 0 and 1 and display it alongside the other columns in the dataframe, I will do the division myself.

Thanks in advance.

Answer 1

Use SeriesGroupBy.value_counts, or size, together with unstack:

import pandas as pd

df = pd.DataFrame({
    'A': [1000] * 10,
    'B': [1000] * 2 + [1001] * 3 + [1002] * 5,
    'C': [0, 1] * 5,
})
print(df)
      A     B  C
0  1000  1000  0
1  1000  1000  1
2  1000  1001  0
3  1000  1001  1
4  1000  1001  0
5  1000  1002  1
6  1000  1002  0
7  1000  1002  1
8  1000  1002  0
9  1000  1002  1

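# Count each value of C within every (A, B) group, then pivot C into columns
df = df.groupby(['A','B'])['C'].value_counts().unstack(fill_value=0).reset_index()

# Alternative 1: cross-tabulate (A, B) against C
#df = pd.crosstab([df['A'], df['B']], df['C']).reset_index()
# Alternative 2: group sizes over all three columns, then pivot C into columns
#df = df.groupby(['A','B','C']).size().unstack(fill_value=0).reset_index()

print(df)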
C     A     B  0  1
0  1000  1000  1  1
1  1000  1001  2  1
2  1000  1002  2  3
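
As a quick check, the three variants above can be run side by side on the same sample data. This is only a sketch: the names by_value_counts, by_crosstab and by_size are illustrative, and only the raw counts are compared, since dtypes and index names may differ between pandas versions.

```python
import pandas as pd

# Same sample data as in the answer above.
df = pd.DataFrame({
    'A': [1000] * 10,
    'B': [1000] * 2 + [1001] * 3 + [1002] * 5,
    'C': [0, 1] * 5,
})

# Variant 1: count the values of C within each (A, B) group.
by_value_counts = df.groupby(['A', 'B'])['C'].value_counts().unstack(fill_value=0)

# Variant 2: cross-tabulate (A, B) against C.
by_crosstab = pd.crosstab([df['A'], df['B']], df['C'])

# Variant 3: group sizes over all three columns, then pivot C into columns.
by_size = df.groupby(['A', 'B', 'C']).size().unstack(fill_value=0)

# Compare the counts as plain nested lists (rows are (A, B) groups, columns are C = 0, 1).
counts_1 = by_value_counts.to_numpy().tolist()
counts_2 = by_crosstab.to_numpy().tolist()
counts_3 = by_size.to_numpy().tolist()
same_counts = counts_1 == counts_2 == counts_3
print(same_counts)
```

Which variant to use is mostly a matter of taste; crosstab reads closest to "count C per (A, B)", while the groupby forms extend more naturally to other aggregations.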

And then sum and divide:

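# Give the pivoted 0/1 columns descriptive names
df = df.rename(columns={0:'Count0',1:'Count1'})
# Row-wise total of both counts, then the share of 1s in that total
df['Sum of Counts'] = df['Count0'] + df['Count1']
df['Count1/Sum of Counts'] = df['Count1'] / df['Sum of Counts']
print(df)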
C     A     B  Count0  Count1  Sum of Counts  Count1/Sum of Counts
0  1000  1000       1       1              2              0.500000
1  1000  1001       2       1              3              0.333333
2  1000  1002       2       3              5              0.600000
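
If C only ever holds 0 and 1, as it does in the sample data, the same table can probably be reached without reshaping at all: the sum of C is Count1, the number of rows is the sum of counts, and the mean of C is the ratio. The sketch below rests on that assumption and is not part of the original answer; the name summary is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1000] * 10,
    'B': [1000] * 2 + [1001] * 3 + [1002] * 5,
    'C': [0, 1] * 5,
})

# Assumption: C contains only 0 and 1, so sum == number of 1s and mean == share of 1s.
summary = (
    df.groupby(['A', 'B'])['C']
      .agg(['sum', 'count', 'mean'])
      .reset_index()
      .rename(columns={
          'sum': 'Count1',
          'count': 'Sum of Counts',
          'mean': 'Count1/Sum of Counts',
      })
)

# Count0 is whatever remains after removing the 1s from the total.
summary['Count0'] = summary['Sum of Counts'] - summary['Count1']
print(summary)
```

This shortcut stops working as soon as C takes any other value, in which case the value_counts/unstack approach above is the safer choice.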

Answer 2

Try:

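# Assumes df is the grouped result from the question, with the per-group row count in a 'counts' column
df1 = df.pivot_table(values='counts', index=['A', 'B'], columns=['C'], aggfunc='sum',
                     margins=True, margins_name='Sum of Counts').reset_index()
# margins=True adds the 'Sum of Counts' column (and a grand-total row at the bottom)
df1 = df1.rename(columns={0:'Count0',1:'Count1'})
df1['Count1/Sum of Counts'] = df1['Count1'] / df1['Sum of Counts']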

You can call reset_index() to structure it better. Also, Count1/Sum of Counts is just df['Count1'] / df['Sum of Counts'].
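
Since the question's grouped data is not shown, here is a minimal runnable sketch of the pivot_table approach on made-up input. The frame grouped and its values are hypothetical; the column name counts is taken from the answer's code. Note that margins=True appends a grand-total row as well as the total column, so the sketch drops that last row before computing the ratio.

```python
import pandas as pd

# Hypothetical pre-aggregated input: one row per (A, B, C) with its row count.
grouped = pd.DataFrame({
    'A': [1000, 1000, 1000, 1000],
    'B': [1001, 1001, 1002, 1002],
    'C': [0, 1, 0, 1],
    'counts': [9, 90, 8, 16],
})

wide = grouped.pivot_table(
    values='counts', index=['A', 'B'], columns='C',
    aggfunc='sum', margins=True, margins_name='Sum of Counts',
)

# The grand-total row added by margins=True is always the last row; drop it.
wide = wide.iloc[:-1].reset_index()

wide = wide.rename(columns={0: 'Count0', 1: 'Count1'})
wide['Count1/Sum of Counts'] = wide['Count1'] / wide['Sum of Counts']
print(wide)
```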


Answer 1: score 2, accepted, 2018-11-05 11:55:31

Answer 2: score 0, 2018-11-05 11:53:32
