Adding an Average Column to a Pandas MultiIndex DataFrame
I have a dataframe df:
first bar baz
second one two one two
A 0.487880 -0.487661 -1.030176 0.100813
B 0.267913 1.918923 0.132791 0.178503
C 1.550526 -0.312235 -1.177689 -0.081596
I'd like to add an average column and then move the average to the front:
df['Average'] = df.mean(level='second', axis='columns') #ERROR HERE
cols = df.columns.tolist()
df = df[[cols[-1]] + cols[:-1]]
I get the error:
ValueError: Wrong number of items passed 2, placement implies 1
Maybe I could add each column of the mean one at a time, as in df['Average', 'One'] = ..., but that seems silly, especially as the real-life index is more complicated.
Edit (frame generation):
import numpy as np
import pandas as pd

arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
          ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])
df = pd.DataFrame(np.random.randn(3, 8), index=['A', 'B', 'C'], columns=index)
I'm not sure about your target output. Something like this?
df2 = df.mean(level='second', axis='columns')
df2.columns = pd.MultiIndex.from_tuples([('mean', col) for col in df2])
>>> df2
mean
one two
A -0.271148 -0.193424
B 0.200352 1.048713
C 0.186419 -0.196915
>>> pd.concat([df2, df], axis=1)
mean bar baz
one two one two one two
A -0.271148 -0.193424 0.487880 -0.487661 -1.030176 0.100813
B 0.200352 1.048713 0.267913 1.918923 0.132791 0.178503
C 0.186419 -0.196915 1.550526 -0.312235 -1.177689 -0.081596
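On recent pandas the `level=` argument of `DataFrame.mean` is no longer available (it was removed in pandas 2.0), so the snippet above may not run as written. Below is a minimal sketch of the same idea using a transpose and `groupby` instead. The small deterministic frame and the names `avg` and `result` are illustrative assumptions, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Small deterministic frame with the same column layout as in the question
# (assumed data, chosen so the means are easy to check by hand).
columns = pd.MultiIndex.from_arrays(
    [['bar', 'bar', 'baz', 'baz'], ['one', 'two', 'one', 'two']],
    names=['first', 'second'],
)
df = pd.DataFrame(np.arange(12, dtype=float).reshape(3, 4),
                  index=['A', 'B', 'C'], columns=columns)

# Mean across the 'first' level for each value of 'second'.
# Transposing lets us group on the column level without DataFrame.mean(level=...).
avg = df.T.groupby(level='second').mean().T

# Put the averages under a top-level 'mean' key so they fit the two-level columns.
avg.columns = pd.MultiIndex.from_product([['mean'], avg.columns],
                                         names=['first', 'second'])

# Listing the averages first places them at the front of the result.
result = pd.concat([avg, df], axis=1)
print(result)
```

Because `avg` is listed first in `pd.concat`, no separate column reordering step is needed.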
You are getting the error because your mean operation results in a dataframe (with two columns in this case). You are then trying to assign this result into one column in the original dataframe.
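To make the shape mismatch concrete, here is a small sketch that inspects the per-level mean and then assigns it one sub-column at a time in a loop, which scales to a more complicated index. It assumes a recent pandas (hence the transpose and `groupby` rather than `mean(level=...)`), and the example data and the name `avg` are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Assumed example data with the question's column layout.
columns = pd.MultiIndex.from_arrays(
    [['bar', 'bar', 'baz', 'baz'], ['one', 'two', 'one', 'two']],
    names=['first', 'second'],
)
df = pd.DataFrame(np.arange(12, dtype=float).reshape(3, 4),
                  index=['A', 'B', 'C'], columns=columns)

# The per-level mean is itself a two-column frame ('one' and 'two'),
# so it cannot be stored under the single key 'Average'.
avg = df.T.groupby(level='second').mean().T
print(avg.shape)  # (3, 2)

# Assigning with full (first, second) tuple keys stores each mean column
# under its own two-level label.
for name in avg.columns:
    df[('Average', name)] = avg[name]
print(df)
```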
pandas.concat
df.join(pd.concat([df.mean(level='second', axis='columns')], axis=1, keys=['Average']))
first bar baz Average
second one two one two one two
A 0.255301 0.286846 1.027024 -0.060594 0.641162 0.113126
B -0.608509 -2.291201 0.675753 -0.416156 0.033622 -1.353679
C 2.714254 -1.330621 -0.099545 0.616833 1.307354 -0.356894
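The `keys=['Average']` argument is what wraps the mean columns in an extra top level so that they line up with the two-level columns of `df`. A minimal sketch of that step for recent pandas follows. The transpose and `groupby` replacement for `mean(level=...)`, the example data, and the names `avg`, `avg_block`, and `result` are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Assumed example data with the question's column layout.
columns = pd.MultiIndex.from_arrays(
    [['bar', 'bar', 'baz', 'baz'], ['one', 'two', 'one', 'two']],
    names=['first', 'second'],
)
df = pd.DataFrame(np.arange(12, dtype=float).reshape(3, 4),
                  index=['A', 'B', 'C'], columns=columns)

# Per-'second' mean across the 'first' level.
avg = df.T.groupby(level='second').mean().T

# Concatenating a single frame with keys=[...] adds 'Average' as a new
# outer column level: ('Average', 'one'), ('Average', 'two').
avg_block = pd.concat([avg], axis=1, keys=['Average'])

# Both frames now have two column levels, so they can be joined on the row index.
result = df.join(avg_block)
print(result)
```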
stack / unstack
Not necessarily efficient, but neat:
df.stack().assign(Average=df.mean(level='second', axis='columns').stack()).unstack()
first bar baz Average
second one two one two one two
A 0.255301 0.286846 1.027024 -0.060594 0.641162 0.113126
B -0.608509 -2.291201 0.675753 -0.416156 0.033622 -1.353679
C 2.714254 -1.330621 -0.099545 0.616833 1.307354 -0.356894
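A variant of the stack / unstack idea that avoids `mean(level=...)` altogether: once the 'second' level is stacked into the rows, the wanted average is simply the row-wise mean of the remaining columns. This is a sketch under the assumption that averaging across all 'first' values is what is wanted. The example data and the names `stacked` and `out` are illustrative, and some pandas 2.x releases may emit a FutureWarning from `stack()`.

```python
import numpy as np
import pandas as pd

# Assumed example data with the question's column layout.
columns = pd.MultiIndex.from_arrays(
    [['bar', 'bar', 'baz', 'baz'], ['one', 'two', 'one', 'two']],
    names=['first', 'second'],
)
df = pd.DataFrame(np.arange(12, dtype=float).reshape(3, 4),
                  index=['A', 'B', 'C'], columns=columns)

# Move the 'second' level into the row index; the columns are now 'bar' and 'baz'.
stacked = df.stack()

# Row-wise mean across 'bar' and 'baz' for each (row, second) pair.
stacked['Average'] = stacked.mean(axis=1)

# Move 'second' back into the columns; 'Average' gets its own 'one'/'two' pair.
out = stacked.unstack()
print(out)
```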