How to add a new column group after using pivot in pandas?
I'm trying to create a new column group consisting of 3 sub-columns after using pivot on a dataframe, but the result is only one column.
Let's say I have the following dataframe that I pivot:
import pandas as pd

df = pd.DataFrame({'foo': ['one', 'one', 'one', 'two', 'two', 'two'],
                   'bar': ['A', 'B', 'C', 'A', 'B', 'C'],
                   'baz': [1, 2, 3, 4, 5, 6],
                   'zoo': [1, 2, 3, 4, 5, 6]})
# keep the pivoted result, since the code that follows operates on it
df = df.pivot(index='foo', columns='bar', values=['baz', 'zoo'])
Now I want an extra column group that is the sum of the two value columns baz and zoo.
My output:
df.loc[:, "baz+zoo"] = df.loc[:, 'baz'] + df.loc[:, 'zoo']
The desired output:
I know that performing the sum and then concatenating will do the trick, but I was hoping for a neater solution.
I think that with many rows, or especially many columns, it is better/faster to create a new DataFrame, add a first level to its columns with MultiIndex.from_product, and add it to the original with DataFrame.join:
df1 = df.loc[:,'baz'] + df.loc[:,'zoo']
df1.columns = pd.MultiIndex.from_product([['baz+zoo'], df1.columns])
print(df1)
    baz+zoo
          A   B   C
foo
one       2   4   6
two       8  10  12
df = df.join(df1)
print(df)
    baz       zoo       baz+zoo
bar   A  B  C   A  B  C       A   B   C
foo
one   1  2  3   1  2  3       2   4   6
two   4  5  6   4  5  6       8  10  12
Another solution is to loop over the second level and select the MultiIndex columns by tuples, but on a large DataFrame the performance should be worse; it is best to test with real data:
for x in df.columns.levels[1]:
    df[('baz+zoo', x)] = df[('baz', x)] + df[('zoo', x)]
print(df)
    baz       zoo       baz+zoo
bar   A  B  C   A  B  C       A   B   C
foo
one   1  2  3   1  2  3       2   4   6
two   4  5  6   4  5  6       8  10  12
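Before timing the two approaches on real data, it may be worth confirming that they produce the same frame. The following is a minimal sketch of such a check; the names `pivoted`, `summed`, `with_join`, and `with_loop` are assumptions introduced for illustration.

```python
import pandas as pd

df = pd.DataFrame({'foo': ['one', 'one', 'one', 'two', 'two', 'two'],
                   'bar': ['A', 'B', 'C', 'A', 'B', 'C'],
                   'baz': [1, 2, 3, 4, 5, 6],
                   'zoo': [1, 2, 3, 4, 5, 6]})
pivoted = df.pivot(index='foo', columns='bar', values=['baz', 'zoo'])

# Approach 1: build the whole group at once, label it, and join it.
summed = pivoted['baz'] + pivoted['zoo']
summed.columns = pd.MultiIndex.from_product([['baz+zoo'], summed.columns])
with_join = pivoted.join(summed)

# Approach 2: add one column per second-level label, selecting by tuple.
with_loop = pivoted.copy()
for x in pivoted.columns.levels[1]:
    with_loop[('baz+zoo', x)] = with_loop[('baz', x)] + with_loop[('zoo', x)]

# Compare column labels and cell values (level names are not compared).
same_columns = with_join.columns.tolist() == with_loop.columns.tolist()
same_values = bool((with_join.to_numpy() == with_loop.to_numpy()).all())
print(same_columns, same_values)
```

Once both variants agree, each one can be wrapped in a function and timed, for example with the standard `timeit` module, on a frame of realistic size.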
I was able to do it this way too. I'm not sure I understand the theory, but...
# df here is the original, un-pivoted DataFrame
df['baz+zoo'] = df['baz'] + df['zoo']
df.pivot(index='foo', columns='bar', values=['baz', 'zoo', 'baz+zoo'])
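One likely explanation for why this works: the sum is computed row by row, and pivot only rearranges values into a new layout without changing them, so summing before or after the pivot yields the same numbers. A small sketch that checks this on the sample data, with `pivoted` and `matches` as assumed variable names:

```python
import pandas as pd

df = pd.DataFrame({'foo': ['one', 'one', 'one', 'two', 'two', 'two'],
                   'bar': ['A', 'B', 'C', 'A', 'B', 'C'],
                   'baz': [1, 2, 3, 4, 5, 6],
                   'zoo': [1, 2, 3, 4, 5, 6]})

# Compute the sum on the long (un-pivoted) frame, one value per row.
df['baz+zoo'] = df['baz'] + df['zoo']

# Pivot all three value columns at once; each becomes a column group.
pivoted = df.pivot(index='foo', columns='bar',
                   values=['baz', 'zoo', 'baz+zoo'])

# The pre-computed group should equal the sum taken after pivoting.
matches = pivoted['baz+zoo'].equals(pivoted['baz'] + pivoted['zoo'])
print(matches)
```

This only holds for operations that act on each row independently; anything that depends on the pivoted layout would still need to be computed after the pivot.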