
Adding a new set of columns to a pandas DataFrame with MultiIndex columns

The following seems like it should work, but it does not:

import pandas as pd
import numpy as np

df = pd.DataFrame()
for l1 in ('a', 'b'):
    for l2 in ('one', 'two'):
        df[l1, l2] = np.random.random(size=5)
df.columns = pd.MultiIndex.from_tuples(df.columns, names=['L1', 'L2'])

df['difference'] = df['b']-df['a']

I get the following error:

ValueError: Wrong number of items passed 2, placement implies 1

I can work around this by assigning each sub-column separately:

difference = df['b']-df['a']
df['difference', 'one'] = difference['one']
df['difference', 'two'] = difference['two']

But this seems inefficient. Is there a better way to do it?

You can do this in a single assignment by passing a list of column tuples:

In [11]: df[[("difference", "one"), ("difference", "two")]] = df['b'] - df['a']

In [12]: df
Out[12]:
L1         a                   b           difference
L2       one       two       one       two        one       two
0   0.585409  0.563870  0.535770  0.868020  -0.049639  0.304150
1   0.404546  0.102884  0.254945  0.362751  -0.149601  0.259868
2   0.475362  0.601632  0.476761  0.665126   0.001400  0.063495
3   0.926288  0.615655  0.257977  0.668778  -0.668311  0.053123
4   0.509069  0.706685  0.355842  0.891862  -0.153227  0.185177
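As a self-contained sketch of the same idea, the list of tuples can be built from the columns of the right-hand side, so the second-level names are not typed out by hand. The seeded generator and the names `difference` and `new_keys` are illustrative assumptions, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Build a frame with two-level columns (L1: a/b, L2: one/two).
rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable
columns = pd.MultiIndex.from_product(
    [["a", "b"], ["one", "two"]], names=["L1", "L2"]
)
df = pd.DataFrame(rng.random((5, 4)), columns=columns)

# df["b"] and df["a"] are plain DataFrames with columns "one" and "two".
difference = df["b"] - df["a"]

# One (top-level, second-level) tuple per column of the right-hand side.
new_keys = [("difference", l2) for l2 in difference.columns]

# Assign all new columns at once; keys and columns are paired by position.
df[new_keys] = difference
```

Because the keys are paired with the right-hand side's columns by position, the list of tuples must have exactly as many entries as the DataFrame being assigned has columns.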

More generally, you can assign through a MultiIndex, for example one generated with from_product:

In [21]: m = pd.MultiIndex.from_product([["difference"], ["one", "two"]])

In [22]: df[m] = df['b'] - df['a']

The right-hand side can be whatever DataFrame holds the resulting columns.
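The from_product approach could be wrapped in a small helper along these lines. This is only a sketch: `add_column_group` is a hypothetical name, not a pandas function, and it assumes the target frame has exactly two column levels. Each level is passed to from_product as a list (`[name]` rather than a bare string), since from_product expects a list-like for every level.

```python
import numpy as np
import pandas as pd


def add_column_group(df, name, frame):
    """Hypothetical helper: attach `frame` to `df` under top-level label `name`.

    Assumes `df` has two column levels and `frame` has single-level columns.
    Modifies `df` in place and returns it for convenience.
    """
    # Cross the single top-level label with the second-level labels of `frame`.
    keys = pd.MultiIndex.from_product([[name], list(frame.columns)])
    df[keys] = frame
    return df


rng = np.random.default_rng(1)  # seeded only to make the sketch repeatable
columns = pd.MultiIndex.from_product(
    [["a", "b"], ["one", "two"]], names=["L1", "L2"]
)
df = pd.DataFrame(rng.random((5, 4)), columns=columns)

add_column_group(df, "difference", df["b"] - df["a"])
add_column_group(df, "total", df["b"] + df["a"])
```

After both calls, `df["difference"]` and `df["total"]` each select a two-column block, just like `df["a"]` and `df["b"]`.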
