
Query a pandas DataFrame with integers in the first level of a MultiIndex

I'm having trouble with a pandas MultiIndex when the first level holds integers. I could not find an existing question about this, so maybe I'm doing something wrong?

I use pandas version '0.16.2'

Example:

in:

import numpy as np
import pandas as pd

data2 = pd.DataFrame(np.random.rand(10), 
                index = [['a','a','a','a','b','b','b','b','c','c'],
                         [ 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , 0]])
data2.ix[['b','c']]

out:

            0
b 5  0.295579
  6  0.691801
  7  0.386504
  8  0.602777
c 9  0.269147
  0  0.029509

But with integers in the first index level it doesn't seem to work:

data = pd.DataFrame(np.random.rand(10), 
                index = [[ 1 , 1 , 1 , 1 , 2 , 2 , 2 , 2 , 3 , 3], 
                         ['a','b','c','d','e','f','g','h','i','j']])
data.ix[[2,3]] 

out:

        0
1 c  0.437728
  d  0.785359

Use loc instead of ix:

data = pd.DataFrame(np.random.rand(10), 
                index = [[ 1 , 1 , 1 , 1 , 2 , 2 , 2 , 2 , 3 , 3], 
                         ['a','b','c','d','e','f','g','h','i','j']])
data.loc[[2,3]] 

In [264]: data.loc[[2,3]]
Out[264]: 
            0
2 e  0.846643
  f  0.200234
  g  0.298223
  h  0.766459
3 i  0.860181
  j  0.980182

It is strange that ix does not work here, because the docs say:

However, when an axis is integer based, ONLY label based access and not positional access is supported. Thus, in such cases, it's usually better to be explicit and use .iloc or .loc.

Your index values are integers, so they should be treated as labels:

In [271]: data.index.levels[0]
Out[271]: Int64Index([1, 2, 3], dtype='int64') 
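
To run the same check on your own frame, here is a minimal sketch. It rebuilds the question's index with fixed values instead of np.random.rand (an assumption made only so the example is reproducible) and inspects the first level. The printed repr of the index depends on the pandas version, so the sketch checks the dtype instead:

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_integer_dtype

# Same index layout as in the question; the values 0..9 are illustrative.
data = pd.DataFrame(
    np.arange(10, dtype=float),
    index=[[1, 1, 1, 1, 2, 2, 2, 2, 3, 3],
           ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']],
)

# Unique labels of the first level, and the first-level label of every row.
first_level = data.index.levels[0]
row_labels = data.index.get_level_values(0)

print(first_level.dtype)               # dtype of the first level
print(is_integer_dtype(first_level))   # True when the level holds integers
```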

But the docs recommend using loc in such cases to be more explicit.
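
As a rough sketch of what "being explicit" looks like, the snippet below selects from the same index layout with .loc (labels) and with .iloc (positions). The values are fixed rather than random, which is an assumption made here for reproducibility. ix is left out because it was removed in later pandas releases. The positional result contains the same rows, (1, 'c') and (1, 'd'), that the ix call in the question returned.

```python
import numpy as np
import pandas as pd

# Same index layout as in the question; the values 0..9 are illustrative.
data = pd.DataFrame(
    np.arange(10, dtype=float),
    index=[[1, 1, 1, 1, 2, 2, 2, 2, 3, 3],
           ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']],
)

# Label-based: every row whose first-level label is 2 or 3.
by_label = data.loc[[2, 3]]

# Position-based: the rows at positions 2 and 3, whatever their labels are.
by_position = data.iloc[[2, 3]]

print(by_label)
print(by_position)
```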

