
Pandas won't allow selecting a column if there's a multi-index?

I'm debugging some pandas code that accidentally created a MultiIndex instead of a regular index. Because of the MultiIndex, pandas won't let me select a column. In this case I can just get rid of the MultiIndex, but if I did need it, how would I select a column? Additional info: I'm getting this error with pandas 0.25.1, but this code was in a notebook somebody wrote years ago, so apparently it used to work with older versions?

import numpy as np
import pandas as pd

names = ['FirstColumn', 'SecondColumn']
data = np.array([[5,6],[7,8]])
df = pd.DataFrame(data, columns = [names]) #Bug: this "works" but isn't what you want.
#The brackets around "names" create a MultiIndex, but that was unintentional.
#But "df.head()" and "df.describe()" both look normal, so you can't see that anything is wrong.

df['FirstColumn'] #ERROR! works fine with a single index, but fails with multiindex
df.FirstColumn #ERROR! works fine with a single index, but fails with multiindex
df.loc[:,'FirstColumn'] #ERROR! works fine with a single index, but fails with multiindex

Those statements give misleading errors saying that only integer scalar arrays can be converted to a scalar index. So how can you select the column when there's a MultiIndex? I know some tricks like unstack or changing the index, etc., but it seems like there ought to be a simple way?

UPDATE: It turns out this worked fine in pandas 0.22.0 but fails in 0.25.1. It looks like a regression was introduced. I've reported it on the pandas GitHub.

Use the DataFrame.xs function:

print (df.xs('FirstColumn', axis=1, level=0))
  FirstColumn
0           5
1           7
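
As a complement to xs, here is a minimal sketch of two related situations, not a definitive recipe. The first part assumes the extra level was accidental (as in the question) and simply flattens it with get_level_values. The second part assumes you really do want a two-level column index; the labels 'A', 'x' and 'y' are made up for illustration and are not from the question.

```python
import numpy as np
import pandas as pd

names = ['FirstColumn', 'SecondColumn']
data = np.array([[5, 6], [7, 8]])

# Same construction as in the question: the extra brackets wrap the names
# in an outer list, which pandas reads as one array per index level.
df = pd.DataFrame(data, columns=[names])

# head() hides the problem, but inspecting the columns object does not.
print(type(df.columns).__name__)
print(df.columns.nlevels)

# Case 1: the extra level was unintentional.
# Rebuild a flat column index from level 0, then select as usual.
flat = df.copy()
flat.columns = df.columns.get_level_values(0)
first = flat['FirstColumn']

# Case 2: a genuine two-level column index (hypothetical labels).
cols = pd.MultiIndex.from_tuples([('A', 'x'), ('A', 'y')])
df2 = pd.DataFrame(data, columns=cols)
one = df2[('A', 'x')]  # full tuple selects a single column as a Series
group = df2['A']       # outer label selects the sub-frame under 'A'
```

If the MultiIndex was never intended, the simplest fix is at construction time: pass columns=names without the extra brackets.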
