Assigning values to a Pandas MultiIndex DataFrame by index level

Question

I have a Pandas MultiIndex DataFrame and I need to assign values to one of its columns from a Series. The Series shares its index with the first level of the DataFrame's index.

import pandas as pd
import numpy as np

# Two arrays form the two levels of the DataFrame's MultiIndex
idx0 = np.array(['bar', 'bar', 'bar', 'baz', 'foo', 'foo'])
idx1 = np.array(['one', 'two', 'three', 'one', 'one', 'two'])
df = pd.DataFrame(index=[idx0, idx1], columns=['A', 'B'])

# The Series is indexed by the unique labels of the first level only
s = pd.Series([True, False, True], index=np.unique(idx0))
print(df)
print(s)

Output:

             A    B
bar one    NaN  NaN
    two    NaN  NaN
    three  NaN  NaN
baz one    NaN  NaN
foo one    NaN  NaN
    two    NaN  NaN

bar     True
baz    False
foo     True
dtype: bool

These don't work:

df.A = s # does not raise an error, but does nothing
df.loc[s.index,'A'] = s # raises an error

Expected output:

             A     B
bar one    True   NaN
    two    True   NaN
    three  True   NaN
baz one    False  NaN
foo one    True   NaN
    two    True   NaN

Answer 1

A Series (or a dictionary) can be passed to map in place of a function (thanks to @normanius for improving the syntax):

df['A'] = pd.Series(df.index.get_level_values(0)).map(s).values

Or similarly:

df['A'] = df.reset_index(level=0)['level_0'].map(s).values

Results:

               A    B
bar one     True  NaN
    two     True  NaN
    three   True  NaN
baz one    False  NaN
foo one     True  NaN
    two     True  NaN
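
For reference, the map approach can be put together as one self-contained script. This is a minimal sketch that rebuilds the question's df and s; the variable name level0 is introduced here for illustration, and to_numpy() is used in place of .values, which is a substitution of mine rather than part of the answer.

```python
import numpy as np
import pandas as pd

# Rebuild the data from the question.
idx0 = np.array(['bar', 'bar', 'bar', 'baz', 'foo', 'foo'])
idx1 = np.array(['one', 'two', 'three', 'one', 'one', 'two'])
df = pd.DataFrame(index=[idx0, idx1], columns=['A', 'B'])
s = pd.Series([True, False, True], index=np.unique(idx0))

# Labels of the first index level, one per row of df.
level0 = df.index.get_level_values(0)

# Look each label up in s, then assign positionally so that no
# index alignment takes place during the assignment.
df['A'] = pd.Series(level0).map(s).to_numpy()

print(df)
```

Converting to a plain array before assigning matters: assigning the mapped Series directly would trigger alignment against the MultiIndex, which is the step that fails in the question.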

Answer 2

"df.A = s does not raise an error, but does nothing"

Indeed, this should have worked. (Your point is actually related to mine.)

The workaround

>>> s.index = pd.Index((c,) for c in s.index)  # wrap each label in a 1-tuple
>>> df.A = s
>>> df
               A    B
bar one     True  NaN
    two     True  NaN
    three   True  NaN
baz one    False  NaN
foo one     True  NaN
    two     True  NaN

Why does the above work?

Because when you do df.A = s directly, without the workaround, you are trying to assign values keyed by a plain pandas.Index into an object whose index is an instance of a subclass, i.e. a pandas.MultiIndex (which looks somewhat at odds with the Liskov substitution principle). See for yourself:

>>> type(s.index).__name__
'Index'

whereas

>>> type(df.index).__name__
'MultiIndex'

Hence the workaround, which consists of turning s's index into a single-level pandas.MultiIndex instance.

>>> s.index = pd.Index((c,) for c in s.index)
>>> type(s.index).__name__
'MultiIndex'

and nothing has perceptibly changed in how s is displayed:

>>> s
bar     True
baz    False
foo     True
dtype: bool
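
The index conversion can also be spelled out more explicitly. The following is only a sketch of that one step, assuming pd.MultiIndex.from_arrays as a substitute for the pd.Index call used in this answer; the names one_level and s_multi are hypothetical, and the sketch does not cover the subsequent assignment to df.

```python
import pandas as pd

s = pd.Series([True, False, True], index=['bar', 'baz', 'foo'])

# Build a one-level MultiIndex explicitly from the existing flat index.
one_level = pd.MultiIndex.from_arrays([s.index])

# Same values, now keyed by the one-level MultiIndex.
s_multi = pd.Series(s.to_numpy(), index=one_level)

print(type(s_multi.index).__name__)  # MultiIndex
print(s_multi.index.nlevels)         # 1
```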

A thought: from several points of view (mathematical, ontological), all this suggests that pandas.Index should have been designed as a subclass of pandas.MultiIndex, rather than the other way around, as it is currently.

Answer 3

You can use the join method on the df DataFrame, but you need to name the index levels and the Series accordingly:

>>> df.index.names = ('lvl0', 'lvl1')
>>> s.index.name = 'lvl0'
>>> s.name = 'new_col'

Then the join method creates a new column in the DataFrame:

>>> df.join(s)
              A    B  new_col
lvl0 lvl1
bar  one    NaN  NaN     True
     two    NaN  NaN     True
     three  NaN  NaN     True
baz  one    NaN  NaN    False
foo  one    NaN  NaN     True
     two    NaN  NaN     True

To assign it to an existing column:

>>> df['A'] = df.join(s)['new_col']
>>> df
                A    B
lvl0 lvl1
bar  one     True  NaN
     two     True  NaN
     three   True  NaN
baz  one    False  NaN
foo  one     True  NaN
     two     True  NaN
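
If you would rather not rename df and s in place, the same join can be sketched on renamed copies. This is a variation under my own assumptions, not part of the answer: rename_axis and rename return new objects, and named_df and named_s are hypothetical names.

```python
import numpy as np
import pandas as pd

# Rebuild the data from the question.
idx0 = np.array(['bar', 'bar', 'bar', 'baz', 'foo', 'foo'])
idx1 = np.array(['one', 'two', 'three', 'one', 'one', 'two'])
df = pd.DataFrame(index=[idx0, idx1], columns=['A', 'B'])
s = pd.Series([True, False, True], index=np.unique(idx0))

# Name the index levels and the Series on copies, leaving df and s untouched.
named_df = df.rename_axis(['lvl0', 'lvl1'])
named_s = s.rename('new_col').rename_axis('lvl0')

# join matches the Series' index name against the level of the same name;
# the joined column shares named_df's index, so the assignment aligns row by row.
named_df['A'] = named_df.join(named_s)['new_col']

print(named_df)
```

The join relies on the Series' index name matching one of the DataFrame's level names, which is why both have to be named before joining.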
