Assigning values to a Pandas MultiIndex DataFrame by index level

Question

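I have a Pandas MultiIndex DataFrame, and I need to assign values to one of its columns from a Series. The Series shares its index with the first level of the DataFrame's index.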

import pandas as pd
import numpy as np
idx0 = np.array(['bar', 'bar', 'bar', 'baz', 'foo', 'foo'])
idx1 = np.array(['one', 'two', 'three', 'one', 'one', 'two'])
df = pd.DataFrame(index=[idx0, idx1], columns=['A', 'B'])
s = pd.Series([True, False, True], index=np.unique(idx0))
print(df)
print(s)

Out:

             A    B
bar one    NaN  NaN
    two    NaN  NaN
    three  NaN  NaN
baz one    NaN  NaN
foo one    NaN  NaN
    two    NaN  NaN

bar     True
baz    False
foo     True
dtype: bool

These do not work:

df.A = s # does not raise an error, but does nothing
df.loc[s.index,'A'] = s # raises an error

Expected output:

             A     B
bar one    True   NaN
    two    True   NaN
    three  True   NaN
baz one    False  NaN
foo one    True   NaN
    two    True   NaN

Answer 1

Series (and dictionaries) can be used just like functions with map and apply (thanks to @normanius for improving the syntax):

df['A'] = pd.Series(df.index.get_level_values(0)).map(s).values

Or similarly:

df['A'] = df.reset_index(level=0)['level_0'].map(s).values

Result:

               A    B
bar one     True  NaN
    two     True  NaN
    three   True  NaN
baz one    False  NaN
foo one     True  NaN
    two     True  NaN
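The role of the trailing .values may not be obvious, so here is a minimal sketch of the first variant broken into steps, assuming the df and s from the question. The intermediate names level0 and mapped are only for illustration. Wrapping the level values in a new Series gives it a default 0..5 index, which would not align with df's MultiIndex, so the index is dropped and the column is assigned by position.

```python
import numpy as np
import pandas as pd

idx0 = np.array(['bar', 'bar', 'bar', 'baz', 'foo', 'foo'])
idx1 = np.array(['one', 'two', 'three', 'one', 'one', 'two'])
df = pd.DataFrame(index=[idx0, idx1], columns=['A', 'B'])
s = pd.Series([True, False, True], index=np.unique(idx0))

# One first-level label per row of df, wrapped in a Series (default 0..5 index).
level0 = pd.Series(df.index.get_level_values(0))

# Look every label up in s; the result keeps the 0..5 index, not df's index.
mapped = level0.map(s)

# .values drops that index, so the assignment is purely positional.
df['A'] = mapped.values
print(df)
```

Since the assignment is positional, it relies on level0 being built from df.index in row order, which is the case here.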

Answer 2

df.A = s does not raise an error, but does nothing

Indeed, one would expect this to work. Your point is actually related to mine.

Workaround

>>> s.index = pd.Index((c,) for c in s.index)  # wrap each label in a 1-tuple
>>> df.A = s
>>> df
               A    B
bar one     True  NaN
    two     True  NaN
    three   True  NaN
baz one    False  NaN
foo one     True  NaN
    two     True  NaN

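Why does the above work?

Because when you run df.A = s directly, without the workaround, you are actually trying to assign values keyed by a plain pandas.Index into a frame whose index is an instance of a subclass, namely pandas.MultiIndex, which in a way looks like a "counter-opposition" to the LS (Liskov substitution) principle. I mean, see for yourself: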

>>> type(s.index).__name__
'Index'

whereas

>>> type(df.index).__name__
'MultiIndex'

So the workaround consists of converting the index of s into a one-dimensional (single-level) pandas.MultiIndex instance.

>>> s.index = pd.Index((c,) for c in s.index)
>>> type(s.index).__name__
'MultiIndex'

and nothing has visibly changed:

>>> s
bar     True
baz    False
foo     True
dtype: bool
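To make the type relationship above easier to check, here is a small sketch that only inspects the indexes and does not repeat the assignment itself. It assumes a Series like the s from the question. The names tupled and explicit are illustrative, and pd.MultiIndex.from_arrays is shown merely as a more explicit way to build what appears to be the same single-level index.

```python
import pandas as pd

s = pd.Series([True, False, True], index=['bar', 'baz', 'foo'])

# A plain Index, and MultiIndex is a subclass of Index.
print(type(s.index).__name__)
print(issubclass(pd.MultiIndex, pd.Index))

# Passing 1-tuples to pd.Index produces a MultiIndex with a single level.
tupled = pd.Index([(label,) for label in s.index])
print(type(tupled).__name__, tupled.nlevels)

# A more explicit construction of a single-level MultiIndex.
explicit = pd.MultiIndex.from_arrays([s.index])
print(list(tupled) == list(explicit))
```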

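A thought: from many points of view (mathematical, ontological), all of this somehow suggests that pandas.Index should have been designed as a subclass of pandas.MultiIndex, rather than the other way around as it is now.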

Answer 3

You can use the join method on the DataFrame df, but you need to name the index levels and the Series accordingly:

>>> df.index.names = ('lvl0', 'lvl1')
>>> s.index.name = 'lvl0'
>>> s.name = 'new_col'

The join method then creates a new column in the DataFrame:

>>> df.join(s)
              A    B  new_col
lvl0 lvl1
bar  one    NaN  NaN     True
     two    NaN  NaN     True
     three  NaN  NaN     True
baz  one    NaN  NaN    False
foo  one    NaN  NaN     True
     two    NaN  NaN     True

To assign it to the existing column:

>>> df['A'] = df.join(s)['new_col']
>>> df
                A    B
lvl0 lvl1
bar  one     True  NaN
     two     True  NaN
     three   True  NaN
baz  one    False  NaN
foo  one     True  NaN
     two     True  NaN
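For anyone who wants to run this end to end, here is a self-contained sketch of the same join approach as a script, assuming the df and s from the question. The variable names s_named and joined are illustrative. rename and rename_axis are used so that the original s is left untouched, which is a small departure from setting s.name and s.index.name in place.

```python
import numpy as np
import pandas as pd

idx0 = np.array(['bar', 'bar', 'bar', 'baz', 'foo', 'foo'])
idx1 = np.array(['one', 'two', 'three', 'one', 'one', 'two'])
df = pd.DataFrame(index=[idx0, idx1], columns=['A', 'B'])
s = pd.Series([True, False, True], index=np.unique(idx0))

# Name the index levels so join can match s against the first level.
df.index.names = ['lvl0', 'lvl1']

# Named copy of s: the Series name becomes the joined column name,
# and the index name must match the level to join on.
s_named = s.rename('new_col').rename_axis('lvl0')

# join returns a new frame; df itself does not gain the 'new_col' column.
joined = df.join(s_named)

# Copy the joined values into the existing column.
df['A'] = joined['new_col']
print(df)
```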


Answer 1
6 2015-05-08 12:51:49

Answer 2
2 accepted 2021-07-31 15:15:19

Answer 3
0 2023-01-04 09:43:27