How to add a new column group after using pivot in pandas?

I'm trying to create a new column group consisting of 3 sub-columns after using pivot on a dataframe, but the result is only one column.

Let's say I have the following dataframe that I pivot:

import pandas as pd

df = pd.DataFrame({'foo': ['one', 'one', 'one', 'two', 'two',
                           'two'],
                   'bar': ['A', 'B', 'C', 'A', 'B', 'C'],
                   'baz': [1, 2, 3, 4, 5, 6],
                   'zoo': [1, 2, 3, 4, 5, 6]})
# keep the pivoted result, since the code below works on it
df = df.pivot(index='foo', columns='bar', values=['baz', 'zoo'])

Now I want an extra column group that is the sum of the two value columns baz and zoo.

My attempt:

df.loc[:, "baz+zoo"] = df.loc[:,'baz'] + df.loc[:,'zoo']

My output:

[image: my output]

The desired output:

[image]

I know that performing the sum and then concatenating will do the trick, but I was hoping for a neater solution.
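For reference, the sum-then-concatenate approach mentioned above could be sketched roughly as follows. The names pivoted, total and result are illustrative only and do not come from the question; nesting pd.concat with a dict is one way among several to attach the new top-level label.

```python
import pandas as pd

df = pd.DataFrame({'foo': ['one', 'one', 'one', 'two', 'two', 'two'],
                   'bar': ['A', 'B', 'C', 'A', 'B', 'C'],
                   'baz': [1, 2, 3, 4, 5, 6],
                   'zoo': [1, 2, 3, 4, 5, 6]})
pivoted = df.pivot(index='foo', columns='bar', values=['baz', 'zoo'])

# Sum the two column groups; the result has plain columns A, B, C.
total = pivoted['baz'] + pivoted['zoo']

# Wrap the sum under a new top-level key, then concatenate column-wise.
# (pivoted, total and result are hypothetical names for this sketch.)
result = pd.concat([pivoted, pd.concat({'baz+zoo': total}, axis=1)], axis=1)
print(result)
```

The inner pd.concat only exists to add the 'baz+zoo' level above A, B and C, so that the outer concatenation lines up with the two-level columns of the pivoted frame.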

I think that with many rows, or especially many columns, it is better/faster to create a new DataFrame, add the first level of the MultiIndex with MultiIndex.from_product, and then add it to the original with DataFrame.join:

df1 = df.loc[:,'baz'] + df.loc[:,'zoo']
df1.columns = pd.MultiIndex.from_product([['baz+zoo'], df1.columns])
print (df1)
   baz+zoo        
          A   B   C
foo                
one       2   4   6
two       8  10  12

df = df.join(df1)
print (df)
    baz       zoo       baz+zoo        
bar   A  B  C   A  B  C       A   B   C
foo                                    
one   1  2  3   1  2  3       2   4   6
two   4  5  6   4  5  6       8  10  12

Another solution is to loop over the second level and select the MultiIndex columns by tuples, but on a large DataFrame performance should be worse; it is best to test with real data:

for x in df.columns.levels[1]:
    df[('baz+zoo', x)] = df[('baz', x)] + df[('zoo', x)]
print (df)
    baz       zoo       baz+zoo        
bar   A  B  C   A  B  C       A   B   C
foo                                    
one   1  2  3   1  2  3       2   4   6
two   4  5  6   4  5  6       8  10  12

I was able to do it this way too, by computing the sum before pivoting. I'm not sure I understand the theory, but...

# df here is the original, unpivoted dataframe
df['baz+zoo'] = df['baz']+df['zoo']
df.pivot(index='foo', columns='bar', values=['baz','zoo','baz+zoo'])
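One way to see why this works: the sum is computed row by row on the long-format frame, so pivot simply treats 'baz+zoo' as one more value column and spreads it across A, B and C like the others. A minimal sketch that checks this against summing after the pivot (long_df, wide and expected are illustrative names, not from the answer):

```python
import pandas as pd

long_df = pd.DataFrame({'foo': ['one', 'one', 'one', 'two', 'two', 'two'],
                        'bar': ['A', 'B', 'C', 'A', 'B', 'C'],
                        'baz': [1, 2, 3, 4, 5, 6],
                        'zoo': [1, 2, 3, 4, 5, 6]})

# Row-wise sum on the long (unpivoted) frame: one new ordinary column.
long_df['baz+zoo'] = long_df['baz'] + long_df['zoo']

# pivot spreads every listed value column across the 'bar' categories.
wide = long_df.pivot(index='foo', columns='bar',
                     values=['baz', 'zoo', 'baz+zoo'])

# Compare with summing the two groups after pivoting.
expected = wide['baz'] + wide['zoo']
print((wide['baz+zoo'] == expected).all().all())
```

Because each (foo, bar) pair appears exactly once here, summing before or after the pivot touches the same cells, so the two results should agree.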
