Total values in a pivot table in Python

Question

My original dataframe is similar to the one below:

import pandas as pd

df = pd.DataFrame({'Variation': ['A']*5 + ['B']*3 + ['A']*4,
                   'id': [11]*4 + [12] + [15]*2 + [17] + [20]*4,
                   'steps': ['start','step1','step2','end','end','step1','step2','step1','start','step1','step2','end']})

I wanted to create a pivot table from this dataframe, for which I used the following code:

df1 = df.pivot_table(index=['Variation'], columns=['steps'],
                     values='id', aggfunc='count', fill_value=0)

However, I also want to see the total distinct count of the ids per Variation. Can someone please let me know how to achieve this? My expected output is:

| Variation | Total id | Total start | Total step1 | Total step2 | Total end |
|-----------|----------|-------------|-------------|-------------|-----------|
| A         | 3        | 2           | 2           | 2           | 3         |
| B         | 2        | 0           | 2           | 1           | 0         |

Answer 1

Use SeriesGroupBy.nunique to count the distinct ids per Variation, then join the result to the pivot table:

df1 = df1.join(df.groupby('Variation')['id'].nunique().rename('Total id'))
print(df1)
           end  start  step1  step2  Total id
Variation                                    
A            3      2      2      2         3
B            0      0      2      1         2
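
To see why nunique is needed here rather than count, the following minimal sketch compares the two on the question's data. The column labels 'rows' and 'distinct ids' and the variable name summary are chosen only for illustration.

```python
import pandas as pd

# Same Variation/id data as in the question
df = pd.DataFrame({'Variation': ['A']*5 + ['B']*3 + ['A']*4,
                   'id': [11]*4 + [12] + [15]*2 + [17] + [20]*4})

grouped = df.groupby('Variation')['id']

# count() counts every non-null row; nunique() counts each id only once
summary = pd.DataFrame({'rows': grouped.count(),
                        'distinct ids': grouped.nunique()})
print(summary)
```

Variation A has 9 rows but only 3 distinct ids (11, 12, 20), and B has 3 rows but 2 distinct ids (15, 17), which is the 'Total id' column the question asks for.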

If you need the new column right after Variation, with the step columns in their original order:

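# Desired column order: 'id' first, then the steps in order of first appearance
c = ['id'] + df['steps'].unique().tolist()
df1 = (df1.join(df.groupby('Variation')['id'].nunique())
          .reindex(columns=c)
          .add_prefix('Total ')
          .reset_index()
          # drop the leftover 'steps' label on the column axis
          .rename_axis(None, axis=1))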

print(df1)
  Variation  Total id  Total start  Total step1  Total step2  Total end
0         A         3            2            2            2          3
1         B         2            0            2            1          0
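
As an alternative, the same table can be built in one pass without the intermediate df1. This is only a sketch, not the answer's method: it swaps pivot_table for pd.crosstab, which counts rows and so matches aggfunc='count' only under the assumption that id has no missing values. The names step_order, counts, distinct and result are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Variation': ['A']*5 + ['B']*3 + ['A']*4,
                   'id': [11]*4 + [12] + [15]*2 + [17] + [20]*4,
                   'steps': ['start','step1','step2','end','end','step1',
                             'step2','step1','start','step1','step2','end']})

# Assumed display order of the step columns
step_order = ['start', 'step1', 'step2', 'end']

# Row counts per Variation and step (equals the 'count' pivot when id has no NaN)
counts = (pd.crosstab(df['Variation'], df['steps'])
            .reindex(columns=step_order, fill_value=0))

# Distinct ids per Variation
distinct = df.groupby('Variation')['id'].nunique()

result = (pd.concat([distinct, counts], axis=1)
            .add_prefix('Total ')
            .rename_axis(None, axis=1)
            .reset_index())
print(result)
```

Both pieces share the Variation index, so pd.concat with axis=1 aligns them row by row before the prefix is added.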
