
Insert column in multiindex data frame

I have a MultiIndex data frame and I want to add a column at level 1 and have it grouped under the appropriate level 0 column. When I assign the new column, it is appended to the end of the df.

In [28]: df
Out[28]: 
first        qux                 bar                 foo          
second       one       two       one       two       one       two
A      -0.563477 -0.032948 -0.131031  1.110537 -0.541374  0.760088
B      -1.767642 -1.305016 -0.786291 -0.396981  1.983372 -0.106018
C      -0.471136  0.616730  0.019877  0.910230  0.352304 -0.361370

In [29]: df['qux','three'] = [1,2,3]

In [30]: df
Out[30]: 
first        qux                 bar                 foo             qux
second       one       two       one       two       one       two three
A      -0.563477 -0.032948 -0.131031  1.110537 -0.541374  0.760088     1
B      -1.767642 -1.305016 -0.786291 -0.396981  1.983372 -0.106018     2
C      -0.471136  0.616730  0.019877  0.910230  0.352304 -0.361370     3

What I WANT it to look like is:

first        qux                 bar                 foo           
second       one       two three      one       two       one       two
A      -0.563477 -0.032948     1 -0.131031  1.110537 -0.541374  0.760088
B      -1.767642 -1.305016     2 -0.786291 -0.396981  1.983372 -0.106018 
C      -0.471136  0.616730     3  0.019877  0.910230  0.352304 -0.361370

I tried df.sort_index(axis=1, level=0), which at least grouped the qux columns together, but it alphabetized my level 0 headings. How can I get it to group the common column names without alphabetizing them?

Simply select the level 0 labels in the order you want:

df = df[['qux', 'bar', 'foo']]
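If you would rather not hard-code the label list, the original level 0 order can be recovered from the columns themselves. The following is a minimal sketch, assuming a frame shaped like the one in the question (the random data and the names level0_order and regrouped are only illustrative):

```python
import numpy as np
import pandas as pd

# Rebuild a frame shaped like the one in the question (values are random).
columns = pd.MultiIndex.from_product(
    [["qux", "bar", "foo"], ["one", "two"]], names=["first", "second"]
)
df = pd.DataFrame(np.random.randn(3, 6), index=list("ABC"), columns=columns)

# The new column is appended after the last existing column.
df["qux", "three"] = [1, 2, 3]

# Level 0 labels in order of first appearance (not alphabetical).
level0_order = df.columns.get_level_values(0).unique()

# Selecting by a list of level 0 labels gathers every column under each label.
regrouped = df[list(level0_order)]
print(regrouped.columns.tolist())
```

Because unique() keeps labels in order of first appearance, the qux columns stay first and the new column ends up next to them.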

Example (different DataFrame)

Using a modification of the example in the MultiIndex documentation, here is something similar to your problem:

import pandas as pd
import numpy as np

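# Two-level labels: level 0 is bar/baz/foo/qux, level 1 is one/two
arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
          ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]
df = pd.DataFrame(np.random.randn(8, 4), index=arrays)
df = df.T  # transpose so the MultiIndex is on the columns

# Here is your insertion (appended after the last column)
df['foo', 'three'] = range(4)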

>>> df[['bar', 'qux', 'foo']]
        bar                 qux                 foo
        one       two       one       two       one       two three
0  0.450777 -1.386835  0.423801 -0.386144  0.362138  2.566733     0
1  0.844537  2.466605 -0.093472  0.226886  0.633393  2.167570     1
2  1.655898  0.995926  0.097128 -0.351759  0.138233  1.099168     2
3  0.409964 -1.232129  1.112228  0.700660 -0.860548  0.219503     3
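As an alternative to reordering afterwards, the column can be placed next to its group at insertion time with DataFrame.insert. This is not the approach shown above, only a sketch under the assumption that the frame looks like the one in the question; the names is_qux and loc are illustrative:

```python
import numpy as np
import pandas as pd

# Frame shaped like the one in the question (values are random).
columns = pd.MultiIndex.from_product(
    [["qux", "bar", "foo"], ["one", "two"]], names=["first", "second"]
)
df = pd.DataFrame(np.random.randn(3, 6), index=list("ABC"), columns=columns)

# Position just past the last existing 'qux' column.
is_qux = df.columns.get_level_values(0) == "qux"
loc = int(np.flatnonzero(is_qux)[-1]) + 1

# Insert the new column there; a tuple names it across both levels.
df.insert(loc, ("qux", "three"), [1, 2, 3])
print(df.columns.tolist())
```

This leaves the other columns where they were, so no second selection step is needed.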


 