Pandas: add multiple columns to a multiindex column dataframe

This question is an attempt to generalise the solution provided for this question:

Pandas: add a column to a multiindex column dataframe

I need to produce a new column under each first-level column index.

The solution provided by spencerlyon2 works when we want to add a single column:

df['bar', 'three'] = [0, 1, 2]

However, I would like to generalise this operation to every first-level column index.

Source DF:

In [1]: df
Out[2]:
first        bar                 baz
second       one       two       one       two
A      -1.089798  2.053026  0.470218  1.440740
B       0.488875  0.428836  1.413451 -0.683677
C      -0.243064 -0.069446 -0.911166  0.478370
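
For anyone who wants to reproduce the frame above, here is a minimal sketch that rebuilds it by hand. The values are copied from the printed output; the construction itself (from_product plus a list of rows) is my assumption about how such a frame could be built, not necessarily how the asker created it.

```python
import pandas as pd

# Two-level column index: first level 'bar'/'baz', second level 'one'/'two'
columns = pd.MultiIndex.from_product(
    [["bar", "baz"], ["one", "two"]], names=["first", "second"]
)

# Values copied from the printed source DF
df = pd.DataFrame(
    [
        [-1.089798, 2.053026, 0.470218, 1.440740],
        [0.488875, 0.428836, 1.413451, -0.683677],
        [-0.243064, -0.069446, -0.911166, 0.478370],
    ],
    index=["A", "B", "C"],
    columns=columns,
)

print(df)
```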

The target DF below requires that the three column be the sum of the one and two columns under the same first-level index.

In [1]: df
Out[2]:
first        bar                           baz
second       one       two     three       one       two     three
A      -1.089798  2.053026  0.963228  0.470218  1.440740  1.910958
B       0.488875  0.428836  0.917711  1.413451 -0.683677  0.729774
C      -0.243064 -0.069446 -0.312510 -0.911166  0.478370 -0.432796

You can use join on two data frames with the same index to create a bunch of columns all at once.


First, calculate the sum using groupby along axis=1:

ndf = df.groupby(df.columns.get_level_values(0), axis=1).sum()

        bar       baz
A  0.963228  1.910958
B  0.917711  0.729774
C -0.312510 -0.432796
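
A note for newer pandas versions: groupby with axis=1 is deprecated from pandas 2.1 onward, so the line above may warn or fail depending on your version. A sketch of one equivalent, assuming you are happy to transpose, group the rows by the first level, and transpose back (the small frame here is made up for illustration):

```python
import pandas as pd

columns = pd.MultiIndex.from_product(
    [["bar", "baz"], ["one", "two"]], names=["first", "second"]
)
# Illustrative values only, chosen so the sums are easy to check by eye
df = pd.DataFrame(
    [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]],
    index=["A", "B"],
    columns=columns,
)

# Transpose so the column MultiIndex becomes the row index,
# sum within each first-level label, then transpose back
ndf = df.T.groupby(level="first").sum().T

print(ndf)
```

The result has one column per first-level label, the same shape as the ndf shown above.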

(PS: If you have more than two columns, you may do

df.loc[:, (slice(None), ['one', 'two'])].groupby(df.columns.get_level_values(0), axis=1).sum()

to slice only the columns 'one' and 'two' first, and only then groupby.)
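
The same selection can also be written with pd.IndexSlice, which some find easier to read than a tuple containing slice(None). A small sketch with a made-up third column ('weight' is a hypothetical label, not from the question) that should be left out of the sum; the summing step uses a transpose-based groupby, which is my substitution for groupby(..., axis=1):

```python
import pandas as pd

columns = pd.MultiIndex.from_product(
    [["bar", "baz"], ["one", "two", "weight"]], names=["first", "second"]
)
# Illustrative values; 'weight' is a hypothetical extra column
df = pd.DataFrame(
    [
        [1.0, 2.0, 100.0, 3.0, 4.0, 200.0],
        [5.0, 6.0, 300.0, 7.0, 8.0, 400.0],
    ],
    index=["A", "B"],
    columns=columns,
)

idx = pd.IndexSlice
# All first-level labels, but only 'one' and 'two' on the second level
picked = df.loc[:, idx[:, ["one", "two"]]]

# Sum within each first-level label (transpose, group rows, transpose back)
sums = picked.T.groupby(level="first").sum().T

print(sums)
```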

Then, make it match your column index, i.e. make it a MultiIndexed data frame just like your original data frame:

ndf.columns = pd.MultiIndex.from_product([ndf.columns, ['three']])

        bar       baz
      three     three
A  0.963228  1.910958
B  0.917711  0.729774
C -0.312510 -0.432796

Finally, df.join:

finaldf = df.join(ndf).sort_index(axis=1)

If you really care about the ordering, use reindex:

finaldf.reindex(['one', 'two', 'three'], axis=1, level=1)

first        bar                           baz                    
second       one       two     three       one       two     three
A      -1.089798  2.053026  0.963228  0.470218  1.440740  1.910958
B       0.488875  0.428836  0.917711  1.413451 -0.683677  0.729774
C      -0.243064 -0.069446 -0.312510 -0.911166  0.478370 -0.432796
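
Putting the steps together, here is a sketch of the whole approach as one function. The name add_group_sum and its parameters are hypothetical, and two details differ from the snippets above: the sum uses a transpose-based groupby rather than groupby(..., axis=1), and the final ordering is done by reindexing against an explicit MultiIndex rather than by level.

```python
import pandas as pd


def add_group_sum(df, parts=("one", "two"), new_label="three"):
    """Hypothetical helper: add one summed column per first-level group."""
    groups = df.columns.get_level_values(0).unique()
    # Keep only the columns to be summed, then sum within each group
    picked = df.loc[:, pd.IndexSlice[:, list(parts)]]
    ndf = picked.T.groupby(level=0).sum().T
    # Give the sums a second level so they line up with df's columns
    ndf.columns = pd.MultiIndex.from_product(
        [ndf.columns, [new_label]], names=df.columns.names
    )
    # Explicit target order; second-level labels not listed here are dropped
    target = pd.MultiIndex.from_product(
        [groups, [*parts, new_label]], names=df.columns.names
    )
    return df.join(ndf).reindex(columns=target)


columns = pd.MultiIndex.from_product(
    [["bar", "baz"], ["one", "two"]], names=["first", "second"]
)
df = pd.DataFrame(
    [
        [-1.089798, 2.053026, 0.470218, 1.440740],
        [0.488875, 0.428836, 1.413451, -0.683677],
        [-0.243064, -0.069446, -0.911166, 0.478370],
    ],
    index=["A", "B", "C"],
    columns=columns,
)

out = add_group_sum(df)
print(out)
```

The function returns a new frame and leaves df untouched, which makes it easy to compare the two.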

I started from your sample input:

first        bar                 baz          
second       one       two       one       two
A      -1.089798  2.053026  0.470218  1.440740
B       0.488875  0.428836  1.413451 -0.683677
C      -0.243064 -0.069446 -0.911166  0.478370

To add a new column under each level-0 label of the column MultiIndex, you can run something like:

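for c1 in df.columns.get_level_values('first').unique():
    # Integer position just past the last existing column of this group
    # (assumes get_loc returns a slice, as it does for the sample frame)
    cInd = int(df.columns.get_loc(c1).stop)
    col = (c1, 'three')      # New column name
    newVal = df[(c1, 'one')] + df[(c1, 'two')]  # Sum of 'one' and 'two' for this group
    df.insert(loc=cInd, column=col, value=newVal)  # Insert the new column in place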

In the above example, each new column is the sum of the one and two columns of its group, but in your case you can set the values as you wish.

The result of my code (after the column sort) is:

first        bar                           baz                    
second       one       two     three       one       two     three
A      -1.089798  2.053026  0.963228  0.470218  1.440740  1.910958
B       0.488875  0.428836  0.917711  1.413451 -0.683677  0.729774
C      -0.243064 -0.069446 -0.312510 -0.911166  0.478370 -0.432796
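
To try the loop end to end, here is a self-contained sketch on a small integer frame, so the sums are easy to check by eye. The frame name demo and its values are made up for illustration; the loop body is the same idea as above, and it modifies demo in place.

```python
import pandas as pd

columns = pd.MultiIndex.from_product(
    [["bar", "baz"], ["one", "two"]], names=["first", "second"]
)
# Illustrative integer values, not the data from the question
demo = pd.DataFrame(
    [[1, 10, 3, 30], [2, 20, 4, 40]],
    index=["A", "B"],
    columns=columns,
)

for c1 in demo.columns.get_level_values("first").unique():
    # Position just past the last existing column of this group
    end = int(demo.columns.get_loc(c1).stop)
    # Insert the group's sum right after its existing columns
    demo.insert(
        loc=end,
        column=(c1, "three"),
        value=demo[(c1, "one")] + demo[(c1, "two")],
    )

print(demo)
```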
