Adding new set of columns to Pandas DataFrame with MultiIndex columns
The following looks like it should work, but it doesn't:
import pandas as pd
import numpy as np
df = pd.DataFrame()
for l1 in ('a', 'b'):
    for l2 in ('one', 'two'):
        df[l1, l2] = np.random.random(size=5)
df.columns = pd.MultiIndex.from_tuples(df.columns, names=['L1', 'L2'])
df['difference'] = df['b']-df['a']
I get the following error:
ValueError: Wrong number of items passed 2, placement implies 1
I can work around this by assigning each column separately:
difference = df['b']-df['a']
df['difference', 'one'] = difference['one']
df['difference', 'two'] = difference['two']
But this seems inefficient. Is there a better way?
You can do this in a single assignment:
In [11]: df[[("difference", "one"), ("difference", "two")]] = df['b'] - df['a']
In [12]: df
Out[12]:
L1         a                   b           difference
L2       one       two       one       two        one       two
0   0.585409  0.563870  0.535770  0.868020  -0.049639  0.304150
1   0.404546  0.102884  0.254945  0.362751  -0.149601  0.259868
2   0.475362  0.601632  0.476761  0.665126   0.001400  0.063495
3   0.926288  0.615655  0.257977  0.668778  -0.668311  0.053123
4   0.509069  0.706685  0.355842  0.891862  -0.153227  0.185177
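For anyone who wants to check this outside an IPython session, here is a minimal self-contained sketch of the same assignment. The seeded generator and building the frame directly from a MultiIndex are my own choices for repeatability, not part of the question, and I am assuming a reasonably recent pandas.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seeded only so the run is repeatable

# Build a frame with two-level columns: (a|b) x (one|two)
columns = pd.MultiIndex.from_product([["a", "b"], ["one", "two"]], names=["L1", "L2"])
df = pd.DataFrame(rng.random((5, 4)), columns=columns)

# Selecting a top-level label drops that level, so this is a
# plain DataFrame with columns 'one' and 'two'
diff = df["b"] - df["a"]

# Assign both new columns at once; the keys are given as full (L1, L2) tuples
df[[("difference", "one"), ("difference", "two")]] = diff

print(df.columns.tolist())
```

The list of tuples is matched against the columns of the right-hand side in order, so the order of the keys should follow the order of `diff.columns`.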
More generally, you can index with a MultiIndex, for example one generated with from_product:
In [21]: m = pd.MultiIndex.from_product([["difference"], ["one", "two"]])
In [22]: df[m] = df['b'] - df['a']
Here the RHS can be the resulting columns, e.g. the DataFrame df['b'] - df['a'].
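If you need this for several derived groups, the MultiIndex approach can be wrapped in a small function. The sketch below is only an illustration: `add_group` is a hypothetical helper name of mine, not a pandas function, and it assumes the new values arrive as a DataFrame whose columns are the second-level labels.

```python
import numpy as np
import pandas as pd


def add_group(df, name, values):
    """Hypothetical helper: attach `values` under the top-level label `name`.

    `values` is assumed to be a DataFrame whose columns become the
    second-level labels of the new group.
    """
    # Pair the single top-level label with every column of `values`
    new_cols = pd.MultiIndex.from_product([[name], values.columns])
    df[new_cols] = values  # modifies df in place
    return df


rng = np.random.default_rng(1)  # seeded only so the run is repeatable
columns = pd.MultiIndex.from_product([["a", "b"], ["one", "two"]], names=["L1", "L2"])
df = pd.DataFrame(rng.random((5, 4)), columns=columns)

df = add_group(df, "difference", df["b"] - df["a"])
df = add_group(df, "total", df["a"] + df["b"])

print(df.columns.tolist())
```

Note that every level passed to from_product has to be list-like, which is why the single label is wrapped as `[name]` rather than passed as a bare string.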