If I have the following:
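import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.random((4,8)))
# Pair each first-level label with its second-level label: ('a','i'), ('b','i'), ('c','j'), ...
tupleList = list(zip('abcdefgh', 'iijjkkll'))
ind = pd.MultiIndex.from_tuples(tupleList)
df.columns = ind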
In [71]: df
Out[71]:
a b c d e f g \
i i j j k k l
0 0.968112 0.809183 0.144320 0.518120 0.820079 0.648237 0.971552
1 0.959022 0.721705 0.139588 0.408940 0.230956 0.907192 0.467016
2 0.335085 0.537437 0.725119 0.486447 0.114048 0.150150 0.894322
3 0.051249 0.186547 0.779814 0.905914 0.024298 0.002489 0.339714
h
l
0 0.438330
1 0.225447
2 0.331413
3 0.530789
[4 rows x 8 columns]
What is the easiest way to select the columns that have a second-level label of "j" or "k"? The result should look like this:
c d e f
j j k k
0 0.948030 0.243993 0.627497 0.729024
1 0.087703 0.874968 0.581875 0.996466
2 0.802155 0.213450 0.375096 0.184569
3 0.164278 0.646088 0.201323 0.022498
I can do this:
df.loc[:, df.columns.get_level_values(1).isin(['j', 'k'])]
But that seems pretty verbose for something that feels like it should be simple. Any better approaches?
See the pandas documentation on MultiIndex slicers, which were introduced in 0.14.0:
In [36]: idx = pd.IndexSlice
In [37]: df.loc[:, idx[:, ['j', 'k']]]
Out[37]:
c d e f
j j k k
0 0.750582 0.877763 0.262696 0.226005
1 0.025902 0.967179 0.125647 0.297304
2 0.463544 0.104973 0.154113 0.284820
3 0.631695 0.841023 0.820907 0.938378
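To check that the slicer and the boolean mask from the question select the same columns, here is a minimal self-contained sketch. The variable names `by_slicer` and `by_mask` are just illustrative, and the values are random, so the numbers will not match the output above.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question (random values, so numbers will differ).
df = pd.DataFrame(np.random.random((4, 8)))
df.columns = pd.MultiIndex.from_tuples(list(zip("abcdefgh", "iijjkkll")))

idx = pd.IndexSlice

# Slicer form: any first-level label, second-level label 'j' or 'k'.
by_slicer = df.loc[:, idx[:, ["j", "k"]]]

# Boolean-mask form from the question.
by_mask = df.loc[:, df.columns.get_level_values(1).isin(["j", "k"])]

print(by_slicer.columns.tolist())
print(by_slicer.equals(by_mask))
```

Note that the column MultiIndex in this example is already sorted. Slicing with actual ranges (for example idx[:, 'j':'k']) generally requires a sorted index, so you may need df.sort_index(axis=1) first on other data.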