TLDR: How do you set values in a multilevel-indexed Series by any slice? I got it to work on the outermost level, but not when slicing along a "middle" level.
Suppose you have a 2 or 3 layer multi index Series that looks as follows:
 s01 | s02 | s03 |
 'a' | 'c' | 'n' | 0.0
     |     | 'm' | 0.1
     |     | 'o' | 0.2
     | 'd' | 'n' | 0.3
     |     | 'o' | 0.4
 'b' | 'c' | 'n' | 0.5
 .........
Here is what I'm currently trying to do:
r = pd.Series(0, index=data.index)  # create a Series with the same structure
for i in data.index.levels[1]:
    d = data.loc[(slice(None), i, slice(None))]
    # manipulate values in d
    r.loc[(slice(None), i, slice(None))] = d
This just sets all of the values in r that are sliced into to NaN.
Is there a universal way to VIEW into a multilevel-indexed Series and set values? I was trying something very similar with a DataFrame, and the cause of the same problem there was that .loc was dropping levels, so the indices no longer matched. I fixed the issue there by switching to the syntax that I am now attempting to use with the Series.
Any help would be greatly appreciated.
Pandas recommends using pd.IndexSlice or similar syntax rather than slice() (see the pandas documentation on slicers for more), e.g. explicitly:
idx = pd.IndexSlice
series.loc[idx[:, 'c', :]]
You can skip the idx shortcut if you only want whole rows from a Series: series.loc[:, 'c', :]
(It's essentially what happens with simple indexing.)
However, it's better to use pd.IndexSlice, and it is necessary if you also want to select columns when indexing a DataFrame.
Say we have your Series
series
> s01  s02  s03
  a    c    n      1
            m      0
            o      4
       d    n      6
            o      9
  b    c    n      4
  dtype: int64
To slice on an inner level, we first need to lexsort the Series index:
series.sort_index(inplace = True)
Then we use a pd.IndexSlice object, which builds the selection passed to .loc:
idx = pd.IndexSlice
# do your indexing
series.loc[idx[:,'c',:]]
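For anyone who wants to run the steps above end to end, here is a minimal sketch. The way the Series is built (pd.MultiIndex.from_tuples and the variable names) is my own assumption; only the labels and values come from the listing above.

```python
import pandas as pd

# Assumed reconstruction of the example Series; labels and values are
# copied from the listing above, the construction itself is illustrative.
index = pd.MultiIndex.from_tuples(
    [("a", "c", "n"), ("a", "c", "m"), ("a", "c", "o"),
     ("a", "d", "n"), ("a", "d", "o"), ("b", "c", "n")],
    names=["s01", "s02", "s03"],
)
series = pd.Series([1, 0, 4, 6, 9, 4], index=index)

# Lexsort the index so that slicing on an inner level is allowed.
series = series.sort_index()

# Select every row whose middle level (s02) is 'c'.
idx = pd.IndexSlice
selected = series.loc[idx[:, "c", :]]
print(selected)
```

Reassigning the result of sort_index() does the same job as inplace=True here; it is just a style choice.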
Writing the colons inside an extra list or tuple doesn't work without pd.IndexSlice, because slice syntax is only valid directly inside the indexing brackets.
On a Series:
series.loc[[:,'c',:]]
will give you:
File "<ipython-input-101-21968807c1d1>", line 1
    series.loc[[:,'c',:]]
                ^
SyntaxError: invalid syntax
# with IndexSlice
idx = pd.IndexSlice
series.loc[idx[:,'c',:]]
> s01  s03
  a    m      0
       n      1
       o      4
  b    n      4
  dtype: int64
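It may help to see what idx[...] actually produces. As far as I can tell, pd.IndexSlice is only a helper that hands back whatever key it is indexed with, so the colons become ordinary slice objects. A small sketch (the variable names are mine):

```python
import pandas as pd

idx = pd.IndexSlice

# Indexing the helper returns the key itself: a tuple that mixes
# slice objects (from the colons) with plain labels.
key = idx[:, "c", :]
print(key)

# The same key written out by hand, in the style used in the question.
manual_key = (slice(None), "c", slice(None))
print(key == manual_key)
```

So the slice(None) form from the question and the idx form are two spellings of the same key; idx is just easier to read and harder to mistype.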
If we have a pd.DataFrame, we do a similar thing.
Say we have the following pd.DataFrame:
df
>              hello animal   i_like
  s01 s02 s03
  a   c   m        0  Goose  dislike
          n        1  Panda     like
          o        4  Tiger     like
      d   n        6  Goose     like
          o        9   Bear  dislike
  b   c   n        4    Dog  dislike
To index:
df.sort_index(inplace = True) # need to lexsort for indexing
# without pd.IndexSlice
df.loc[(:,'c',:)] # the whole entry
File "<ipython-input-118-9544c9b9f9da>", line 1
    df.loc[(:,'c',:)]
            ^
SyntaxError: invalid syntax
# with pd.IndexSlice
idx = pd.IndexSlice
df.loc[idx[:,'c',:],:]
>              hello animal   i_like
  s01 s02 s03
  a   c   m        0  Goose  dislike
          n        1  Panda     like
          o        4  Tiger     like
  b   c   n        4    Dog  dislike
and for specific columns
df.loc[idx[:,'d',:],['hello','animal']]
>              hello animal
  s01 s02 s03
  a   d   n        6  Goose
          o        9   Bear
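The DataFrame selections above can be reproduced with the sketch below. The construction of df is my own assumption (the data is copied from the table above); the two .loc calls are the ones shown.

```python
import pandas as pd

# Assumed reconstruction of the example DataFrame from the table above.
index = pd.MultiIndex.from_tuples(
    [("a", "c", "m"), ("a", "c", "n"), ("a", "c", "o"),
     ("a", "d", "n"), ("a", "d", "o"), ("b", "c", "n")],
    names=["s01", "s02", "s03"],
)
df = pd.DataFrame(
    {
        "hello": [0, 1, 4, 6, 9, 4],
        "animal": ["Goose", "Panda", "Tiger", "Goose", "Bear", "Dog"],
        "i_like": ["dislike", "like", "like", "like", "dislike", "dislike"],
    },
    index=index,
).sort_index()  # lexsort before slicing on inner levels

idx = pd.IndexSlice

# Row key first, column key second: all columns for s02 == 'c'.
rows_c = df.loc[idx[:, "c", :], :]

# Same idea, restricted to two columns for s02 == 'd'.
subset_d = df.loc[idx[:, "d", :], ["hello", "animal"]]

print(rows_c)
print(subset_d)
```

Note that with a DataFrame the row selection and the column selection are two separate arguments to .loc, which is why the trailing ", :" is needed in the first call.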
If you'd like to set value(s) on your selection, you can do it as per usual:
For a Series:
my_select = series.loc[idx[:,'c',:]]  # a Series has only one axis, so no trailing ",:"
series.loc[idx[:,'c',:]] = my_select.apply(lambda x: x*3)
series
> s01  s02  s03
  a    c    m      0
            n      3
            o     12
       d    n      6
            o      9
  b    c    n     12
  dtype: int64
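If an assignment like this fills the selection with NaN instead (the symptom described in the question), a likely cause is index alignment: when the selected piece comes back with a level dropped, its index no longer lines up with the rows being written. One way around it, sketched below under the assumption that the Series is built as in the example, is to select with a boolean mask on the middle level. A mask selection keeps the full three-level index, so the values align on the way back in.

```python
import pandas as pd

# Assumed reconstruction of the (already sorted) example Series.
index = pd.MultiIndex.from_tuples(
    [("a", "c", "m"), ("a", "c", "n"), ("a", "c", "o"),
     ("a", "d", "n"), ("a", "d", "o"), ("b", "c", "n")],
    names=["s01", "s02", "s03"],
)
series = pd.Series([0, 1, 4, 6, 9, 4], index=index)

# Boolean mask: True for every row whose s02 label is 'c'.
mask = series.index.get_level_values("s02") == "c"

# Both sides carry the same full MultiIndex, so alignment cannot
# turn the assigned values into NaN.
series.loc[mask] = series.loc[mask] * 3
print(series)
```

Assigning a plain array (for example the result of .to_numpy()) instead of a Series is another way to sidestep alignment, since an array is written by position rather than by label.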
For a DataFrame:
my_select = df.loc[idx[:,'d',:],:]
df.loc[idx[:,'d',:],['i_like']] = my_select.apply(
    lambda x: "dislike" if x.hello<5 else "like", axis=1)
df
>              hello animal   i_like
  s01 s02 s03
  a   c   m        0  Goose  dislike
          n        1  Panda     like
          o        4  Tiger     like
      d   n        6  Goose     like
          o        9   Bear     like
  b   c   n        4    Dog  dislike
# Only the 'd' rows are touched: Bear is changed to "like"; Goose was already "like".
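Tying this back to the loop in the question: here is a sketch of filling r one middle-level label at a time without producing NaN. The data values are taken from the question; the manipulation (subtracting the group mean) is purely a made-up placeholder for whatever you actually do to d.

```python
import pandas as pd

# Assumed reconstruction of the Series from the question.
index = pd.MultiIndex.from_tuples(
    [("a", "c", "n"), ("a", "c", "m"), ("a", "c", "o"),
     ("a", "d", "n"), ("a", "d", "o"), ("b", "c", "n")],
    names=["s01", "s02", "s03"],
)
data = pd.Series([0.0, 0.1, 0.2, 0.3, 0.4, 0.5], index=index).sort_index()

# Result Series with the same structure as data.
r = pd.Series(0.0, index=data.index)

s02 = data.index.get_level_values("s02")
for label in s02.unique():
    mask = s02 == label      # rows belonging to this middle-level label
    d = data.loc[mask]       # keeps all three index levels
    d = d - d.mean()         # hypothetical manipulation (placeholder)
    r.loc[mask] = d          # indices match, so no NaN appears

# For simple per-group transformations, groupby on the level does the
# same thing without an explicit loop.
alt = data.groupby(level="s02").transform(lambda g: g - g.mean())
print(r)
```

The loop version is the closer match to the original code; the groupby line is only worth switching to if the manipulation fits in a single function of the group.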
PS. Note commas/colons (or lack thereof)!
Hope this helps!