
Setting Multiple Layers of a Multiindex Series

TL;DR: How do you set values in a MultiIndex Series through an arbitrary slice? I got it to work when slicing on the outermost level, but not when slicing along a "middle" level.

Suppose you have a 2 or 3 layer multi index Series that looks as follows:

_s01_|_s02_|_s03_|____
 'a' | 'c' | 'n' | 0.0
           | 'm' | 0.1
           | 'o' | 0.2
     | 'd' | 'n' | 0.3
           | 'o' | 0.4
 'b' | 'c' | 'n' | 0.5
        .........

Here is what I'm currently trying to do:

r = pd.Series(0, index=data.index)  # create a Series with the same index structure
for i in data.index.levels[1]:
    d = data.loc[(slice(None), i, slice(None))]
    # manipulate values in d
    r.loc[(slice(None), i, slice(None))] = d

This just sets all of the r values that are sliced into to NaN.

Is there a universal way to get a view into a MultiIndex Series and set values through it? I tried something very similar with a DataFrame, and there the problem was that .loc dropped levels, so the indices no longer matched. I fixed it there by switching to the syntax that I am now attempting to use with the Series.

Any help would be greatly appreciated.

Pandas recommends using pd.IndexSlice or similar syntax rather than slice() (see the pandas documentation on using slicers with a MultiIndex), e.g.

explicitly:

idx = pd.IndexSlice
series.loc[idx[:, 'c', :]]

You can skip the idx shortcut when you only want the complete rows of a Series selection: series.loc[:, 'c', :] (this is essentially plain indexing).

However, it's better to use pd.IndexSlice, and it is necessary when you index into a DataFrame.

Say we have your Series

series

>  s01  s02  s03
a    c    n      1
          m      0
          o      4
     d    n      6
          o      9
b    c    n      4
dtype: int64
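To make the examples reproducible, here is a minimal sketch that rebuilds this Series. The level names and values are copied from the display above; building it with pd.MultiIndex.from_tuples is my own choice, not something stated in the question.

```python
import pandas as pd

# Index tuples in the same order as the printed Series above.
index = pd.MultiIndex.from_tuples(
    [
        ("a", "c", "n"),
        ("a", "c", "m"),
        ("a", "c", "o"),
        ("a", "d", "n"),
        ("a", "d", "o"),
        ("b", "c", "n"),
    ],
    names=["s01", "s02", "s03"],
)

# Values taken from the display above.
series = pd.Series([1, 0, 4, 6, 9, 4], index=index)
print(series)
```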

Indexing on multilevel indexes in pd.Series and pd.DataFrame

Key part

To do indexing, we need to first lexsort the series index:

series.sort_index(inplace = True)

Then, to do any indexing, we need a pd.IndexSlice object which defines the selection for .loc by:

idx = pd.IndexSlice
# do your indexing
series.loc[idx[:,'c',:]]
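A quick way to see whether the sort step is needed is to check the index before and after. The sketch below assumes the example Series from above and uses a non-inplace sort_index() so the original stays available for comparison.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("a", "c", "n"), ("a", "c", "m"), ("a", "c", "o"),
     ("a", "d", "n"), ("a", "d", "o"), ("b", "c", "n")],
    names=["s01", "s02", "s03"],
)
series = pd.Series([1, 0, 4, 6, 9, 4], index=index)

# The rows were entered in the order n, m, o, so the index is not sorted yet.
print(series.index.is_monotonic_increasing)

# sort_index() returns a new, lexicographically sorted Series.
sorted_series = series.sort_index()
print(sorted_series.index.is_monotonic_increasing)

# Slice on the middle level of the sorted Series.
idx = pd.IndexSlice
selected = sorted_series.loc[idx[:, "c", :]]
print(selected)
```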

Details

Writing the slices inside a list or tuple doesn't work without pd.IndexSlice, because a bare : is only valid directly inside square brackets:

On a Series:

series.loc[[:,'c',:]] will give you:

File "<ipython-input-101-21968807c1d1>", line 1
    series.loc[[:,'c',:]]
                ^
SyntaxError: invalid syntax


# with IndexSlice
idx = pd.IndexSlice
series.loc[idx[:,'c',:]]

>  s01  s03
a    n      1
     m      0
     o      4
b    n      4
dtype: int64
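Note that the output above shows only s01 and s03 in the index. If you need the selection to keep all three levels (for example, so it can be aligned with the original later), one alternative is Series.xs with drop_level=False. This is a sketch on the example data, not part of the original approach.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("a", "c", "n"), ("a", "c", "m"), ("a", "c", "o"),
     ("a", "d", "n"), ("a", "d", "o"), ("b", "c", "n")],
    names=["s01", "s02", "s03"],
)
series = pd.Series([1, 0, 4, 6, 9, 4], index=index).sort_index()

# Cross-section on the middle level; drop_level=False keeps s02 in the result.
kept = series.xs("c", level="s02", drop_level=False)
print(kept)
print(kept.index.nlevels)
```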

If we have a pd.DataFrame, we do a similar thing.

Say we have the following pd.DataFrame:

df
>              hello animal   i_like
s01 s02 s03                       
a   c   m        0  Goose  dislike
        n        1  Panda     like
        o        4  Tiger     like
    d   n        6  Goose     like
        o        9   Bear  dislike
b   c   n        4    Dog  dislike

To index:

df.sort_index(inplace = True) # need to lexsort for indexing

# without pd.IndexSlice
df.loc[(:,'c',:)]   # the whole entry
File "<ipython-input-118-9544c9b9f9da>", line 1
    df.loc[(:,'c',:)]
            ^
SyntaxError: invalid syntax

# with pd.IndexSlice
idx = pd.IndexSlice
df.loc[idx[:,'c',:],:]

>             hello animal   i_like
s01 s02 s03                       
a   c   m        0  Goose  dislike
        n        1  Panda     like
        o        4  Tiger     like
b   c   n        4    Dog  dislike

and for specific columns

df.loc[idx[:,'d',:],['hello','animal']]

>              hello animal
s01 s02 s03              
a   d   n        6  Goose
        o        9   Bear

Setting values

If you'd like to set value(s) on your selection, you can do it as per usual:

For a Series:

my_select = series.loc[idx[:,'c',:]]
series.loc[idx[:,'c',:]] = my_select.apply(lambda x: x*3)

series
> s01  s02  s03
a    c    m       0
          n       3
          o      12
     d    n       6
          o       9
b    c    n      12
dtype: int64
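If assigning a selection back gives NaN, as described in the question, a likely cause is that the selection's index no longer matches the target (for instance because a level was dropped), so alignment finds nothing to copy. One way to sidestep this is a boolean mask built with get_level_values, since a masked selection keeps the full index. Below is a sketch of the question's loop rewritten that way; the data values come from the question, and the "multiply by 10" step is a hypothetical stand-in for the real manipulation.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("a", "c", "n"), ("a", "c", "m"), ("a", "c", "o"),
     ("a", "d", "n"), ("a", "d", "o"), ("b", "c", "n")],
    names=["s01", "s02", "s03"],
)
data = pd.Series([0.0, 0.1, 0.2, 0.3, 0.4, 0.5], index=index)

# Result Series with the same index structure, as in the question.
result = pd.Series(0.0, index=data.index)

# One label per row for the middle level.
middle = data.index.get_level_values("s02")

for key in data.index.levels[1]:
    mask = middle == key        # boolean array, one entry per row
    chunk = data[mask]          # keeps all three index levels
    chunk = chunk * 10          # hypothetical manipulation
    result.loc[mask] = chunk    # indexes match, so nothing becomes NaN

print(result)
```

Assigning a plain array (chunk.to_numpy()) instead of a Series is another option, because an array is written by position and skips alignment entirely.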

For a DataFrame:

my_select = df.loc[idx[:,'d',:],:]
df.loc[idx[:,'d',:],['i_like']] = my_select.apply(
      lambda x: "dislike" if x.hello<5 else "like", axis=1)

df
>             hello animal   i_like
s01 s02 s03                       
a   c   m        0  Goose  dislike
        n        1  Panda     like
        o        4  Tiger     like
    d   n        6  Goose     like
        o        9   Bear     like
b   c   n        4    Dog  dislike

# Only the 'd' rows are touched: Bear is changed to "like", and Goose stays "like".
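The same update can also be written without a row-wise apply. The sketch below rebuilds the example frame and uses np.where (my substitution, not the original method) to compute the new labels for the 'd' rows, then assigns them as a plain array to a single column.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("a", "c", "m"), ("a", "c", "n"), ("a", "c", "o"),
     ("a", "d", "n"), ("a", "d", "o"), ("b", "c", "n")],
    names=["s01", "s02", "s03"],
)
df = pd.DataFrame(
    {
        "hello": [0, 1, 4, 6, 9, 4],
        "animal": ["Goose", "Panda", "Tiger", "Goose", "Bear", "Dog"],
        "i_like": ["dislike", "like", "like", "like", "dislike", "dislike"],
    },
    index=index,
).sort_index()

idx = pd.IndexSlice

# Rows whose middle level is 'd'.
my_select = df.loc[idx[:, "d", :], :]

# Same rule as the lambda above: "dislike" if hello < 5, otherwise "like".
new_labels = np.where(my_select["hello"] < 5, "dislike", "like")

# Assign by position into the single column 'i_like'.
df.loc[idx[:, "d", :], "i_like"] = new_labels
print(df)
```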

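PS. Watch the commas and colons (or lack thereof): a Series takes only the row slicer, while a DataFrame also needs the column indexer after it.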

Hope this helps!
