How to subtract second-level columns in a MultiIndex DataFrame

Question

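Here is the sample data I am working with. What I am trying to accomplish is 1) subtract column b from column a and 2) create a column c in front of columns a and b. I would like to loop through and create the c column for x, y and z.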

import pandas as pd
df = pd.DataFrame(data=[[100,200,400,500,111,222], [77,28,110,211,27,81], [11,22,33,11,22,33],[213,124,136,147,54,56]])
df.columns = pd.MultiIndex.from_product([['x', 'y', 'z'], list('ab')])
print (df)

Below is what I would like to get.

Answer 1

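Use DataFrame.xs to select each second-level label, passing drop_level=False so that the selected level is kept and the columns remain a MultiIndex. Then use rename so both selections carry the same MultiIndex labels, subtract, append the result to the original with concat, and finally use DataFrame.sort_index: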

dfa = df.xs('a', axis=1, level=1, drop_level=False).rename(columns={'a':'c'})
dfb = df.xs('b', axis=1, level=1, drop_level=False).rename(columns={'b':'c'})

df = pd.concat([df, dfa.sub(dfb)], axis=1).sort_index(axis=1)
print (df)
     x              y              z          
     a    b    c    a    b    c    a    b    c
0  100  200 -100  400  500 -100  111  222 -111
1   77   28   49  110  211 -101   27   81  -54
2   11   22  -11   33   11   22   22   33  -11
3  213  124   89  136  147  -11   54   56   -2
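To see why drop_level=False and rename are both needed, here is a minimal sketch that rebuilds the sample data from the question and prints the column labels at each step. The variable names flat, kept and diff are only for illustration and are not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame(
    data=[[100, 200, 400, 500, 111, 222],
          [77, 28, 110, 211, 27, 81],
          [11, 22, 33, 11, 22, 33],
          [213, 124, 136, 147, 54, 56]]
)
df.columns = pd.MultiIndex.from_product([['x', 'y', 'z'], list('ab')])

# Default behaviour: the selected level is dropped, leaving flat columns.
flat = df.xs('a', axis=1, level=1)
print(flat.columns.tolist())   # ['x', 'y', 'z']

# drop_level=False keeps both levels, so the result is still a MultiIndex.
kept = df.xs('a', axis=1, level=1, drop_level=False)
print(kept.columns.tolist())   # [('x', 'a'), ('y', 'a'), ('z', 'a')]

# Renaming 'a' -> 'c' and 'b' -> 'c' gives both frames identical labels,
# so sub() aligns them column by column instead of producing NaN.
dfa = kept.rename(columns={'a': 'c'})
dfb = df.xs('b', axis=1, level=1, drop_level=False).rename(columns={'b': 'c'})
diff = dfa.sub(dfb)
print(diff.columns.tolist())   # [('x', 'c'), ('y', 'c'), ('z', 'c')]
```

Without the rename, dfa and dfb would have no column labels in common, and the subtraction would align on the union of labels and return only NaN columns.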

Alternatively, loop over the first level, select the columns by tuple, subtract the Series, and finally use DataFrame.sort_index:

for c in df.columns.levels[0]:
    df[(c, 'c')] = df[(c, 'a')].sub(df[(c, 'b')])

df = df.sort_index(axis=1)
print (df)
     x              y              z          
     a    b    c    a    b    c    a    b    c
0  100  200 -100  400  500 -100  111  222 -111
1   77   28   49  110  211 -101   27   81  -54
2   11   22  -11   33   11   22   22   33  -11
3  213  124   89  136  147  -11   54   56   -2
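sort_index places c after a and b. If "in front of a and b" in the question is meant literally, one possible variation is to build the target column order explicitly and reindex to it. This is a sketch under that assumption; the names groups and order are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    data=[[100, 200, 400, 500, 111, 222],
          [77, 28, 110, 211, 27, 81],
          [11, 22, 33, 11, 22, 33],
          [213, 124, 136, 147, 54, 56]]
)
df.columns = pd.MultiIndex.from_product([['x', 'y', 'z'], list('ab')])

# First-level labels in their original order, taken once before the loop.
groups = df.columns.get_level_values(0).unique()

for g in groups:
    df[(g, 'c')] = df[(g, 'a')].sub(df[(g, 'b')])

# Explicit target order: c first, then a and b, within every group.
order = pd.MultiIndex.from_product([groups, ['c', 'a', 'b']])
df = df.reindex(columns=order)
print(df)
```

Because the order is spelled out rather than sorted, the same idea works for any arrangement of the second level.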

Answer 2

a = df.xs('a', level=1, axis=1)
b = df.xs('b', level=1, axis=1)
df1 = pd.concat([a.sub(b)], keys=['c'], axis=1).swaplevel(0, 1, axis=1)

df1

    x       y       z
    c       c       c
0   -100    -100    -111
1   49      -101    -54
2   -11       22    -11
3   89       -11    -2

Then concatenate df and df1, and sort the columns:

pd.concat([df, df1], axis=1).sort_index(axis=1)
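For reference, the two steps above can be put together into one self-contained sketch using the sample data from the question. The intermediate variables keyed and out are illustrative names, added here only to show what keys= and swaplevel each contribute.

```python
import pandas as pd

df = pd.DataFrame(
    data=[[100, 200, 400, 500, 111, 222],
          [77, 28, 110, 211, 27, 81],
          [11, 22, 33, 11, 22, 33],
          [213, 124, 136, 147, 54, 56]]
)
df.columns = pd.MultiIndex.from_product([['x', 'y', 'z'], list('ab')])

# xs drops the second level here, so a and b both have flat columns x, y, z.
a = df.xs('a', level=1, axis=1)
b = df.xs('b', level=1, axis=1)

# keys=['c'] adds a new outer level: ('c', 'x'), ('c', 'y'), ('c', 'z').
keyed = pd.concat([a.sub(b)], keys=['c'], axis=1)
print(keyed.columns.tolist())

# swaplevel moves 'c' to the inner level: ('x', 'c'), ('y', 'c'), ('z', 'c').
df1 = keyed.swaplevel(0, 1, axis=1)
print(df1.columns.tolist())

# Append to the original and sort so each group reads a, b, c.
out = pd.concat([df, df1], axis=1).sort_index(axis=1)
print(out)
```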

Another way

Using stack and unstack:

df.stack(level=0).assign(c=lambda x: x['a'] - x['b']).stack().unstack([1, 2])

Result:

     x              y              z
     a    b    c    a    b    c    a    b    c
0  100  200 -100  400  500 -100  111  222 -111
1   77   28   49  110  211 -101   27   81  -54
2   11   22  -11   33   11   22   22   33  -11
3  213  124   89  136  147  -11   54   56   -2

Answer 3

Dump to numpy, build a new DataFrame, and concatenate it to the original DataFrame:

result = df.loc(axis=1)[:,'a'].to_numpy() - df.loc(axis=1)[:, 'b'].to_numpy()
header = pd.MultiIndex.from_product([['x','y','z'], ['c']])
result = pd.DataFrame(result, columns=header)
pd.concat([df, result], axis=1).sort_index(axis=1)

     x              y              z
     a    b    c    a    b    c    a    b    c
0  100  200 -100  400  500 -100  111  222 -111
1   77   28   49  110  211 -101   27   81  -54
2   11   22  -11   33   11   22   22   33  -11
3  213  124   89  136  147  -11   54   56   -2
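The header above hardcodes ['x', 'y', 'z']. As a possible generalization, the first-level labels can be read from df itself, and the original row index can be passed along so that concat still aligns when df does not use the default RangeIndex. This sketch assumes every first-level group has both an a and a b column, in the same group order; groups, values and out are illustrative names.

```python
import pandas as pd

df = pd.DataFrame(
    data=[[100, 200, 400, 500, 111, 222],
          [77, 28, 110, 211, 27, 81],
          [11, 22, 33, 11, 22, 33],
          [213, 124, 136, 147, 54, 56]]
)
df.columns = pd.MultiIndex.from_product([['x', 'y', 'z'], list('ab')])

# First-level labels in order of appearance, instead of a hardcoded list.
groups = df.columns.get_level_values(0).unique()

# Plain arrays carry no labels, so the subtraction is purely positional.
values = df.loc(axis=1)[:, 'a'].to_numpy() - df.loc(axis=1)[:, 'b'].to_numpy()

header = pd.MultiIndex.from_product([groups, ['c']])
# Reuse df.index so the rows line up in concat for any row index.
result = pd.DataFrame(values, columns=header, index=df.index)

out = pd.concat([df, result], axis=1).sort_index(axis=1)
print(out)
```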

Another option, using pipe, without dumping to numpy:

result = df.swaplevel(axis=1).pipe(lambda df: df['a'] - df['b'])
result.columns = pd.MultiIndex.from_product([result.columns, ['c']])
pd.concat([df, result], axis=1).sort_index(axis=1)

     x              y              z
     a    b    c    a    b    c    a    b    c
0  100  200 -100  400  500 -100  111  222 -111
1   77   28   49  110  211 -101   27   81  -54
2   11   22  -11   33   11   22   22   33  -11
3  213  124   89  136  147  -11   54   56   -2

3 solutions

Solution 1
2 accepted 2022-11-29 06:12:31

Solution 2
2 2022-11-29 06:17:54

Solution 3
1 2022-11-29 09:28:20
