Selecting rows from a MultiIndex in pandas

I am having trouble understanding how to select rows from a MultiIndex in pandas. Given this DataFrame:

                    0  1  2  3
first second third            
C     one    mean   3  4  2  7
             std    4  1  7  7
      two    mean   3  1  4  7
             std    5  6  7  0
      three  mean   7  0  2  5
             std    7  3  7  1
H     one    mean   2  4  3  3
             std    5  5  3  5
      two    mean   5  7  0  6
             std    0  1  0  2
      three  mean   5  2  5  1
             std    9  0  4  6
V     one    mean   3  7  3  9
             std    8  7  9  3
      two    mean   1  9  9  0
             std    1  1  5  1
      three  mean   3  1  0  6
             std    6  2  7  4

I need to create new rows:

- 'CH' : ['CH',:,'mean'] => ['C',:,'mean'] - ['H',:,'mean']
- 'CH' : ['CH',:,'std'] => (['C',:,'std']**2 + ['H',:,'std']**2)**.5

When I try to select rows, I get various errors, for example:

UnsortedIndexError: 'MultiIndex Slicing requires the index to be fully lexsorted tuple len (3), lexsort depth (1)'

How should this operation be performed?

import pandas as pd
import numpy as np
iterables = [['C', 'H', 'V'],
          ['one','two','three'],
          ['mean','std']]
midx = pd.MultiIndex.from_product(iterables, names=['first', 'second','third'])
chv = pd.DataFrame(np.random.randint(0,high=10,size=(18,4)), index=midx)
print (chv)
idx = pd.IndexSlice
chv.loc[:,idx['C',:,'mean']]
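Two things appear to go wrong in the attempt above: the slicer sits in the column position of `.loc`, and the index is not lexsorted, because 'one', 'two', 'three' is not alphabetical order. The sketch below checks both points. The fixed seed and the names `chv_sorted` and `c_mean` are assumptions added for illustration; they are not part of the original post.

```python
import numpy as np
import pandas as pd

iterables = [['C', 'H', 'V'], ['one', 'two', 'three'], ['mean', 'std']]
midx = pd.MultiIndex.from_product(iterables, names=['first', 'second', 'third'])

# Fixed seed (an assumption) so the example is reproducible
rng = np.random.default_rng(0)
chv = pd.DataFrame(rng.integers(0, 10, size=(18, 4)), index=midx)

# 'two' comes before 'three' in the data but after it alphabetically,
# so the index is not sorted lexicographically
sorted_before = chv.index.is_monotonic_increasing

chv_sorted = chv.sort_index()
sorted_after = chv_sorted.index.is_monotonic_increasing

idx = pd.IndexSlice
# The row slicer goes in the first position of .loc; ':' keeps all columns
c_mean = chv_sorted.loc[idx['C', :, 'mean'], :]

print(sorted_before, sorted_after)
print(c_mean)
```

After sorting, the second level comes back in alphabetical order (one, three, two), which is why the result further down lists the rows in that order.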

Sort the index first, then select with slicers, rename the first level to 'CH' so that the indexes align, apply the arithmetic, and finally concatenate the results:

# sort the index to avoid UnsortedIndexError
chv = chv.sort_index()

idx = pd.IndexSlice
c1 = chv.loc[idx['C',:,'mean'], :].rename({'C':'CH'}, level=0)
h1 = chv.loc[idx['H',:,'mean'], :].rename({'H':'CH'}, level=0)
ch1 = c1 - h1

c2 = chv.loc[idx['C',:,'std'], :].rename({'C':'CH'}, level=0)**2
h2 = chv.loc[idx['H',:,'std'], :].rename({'H':'CH'}, level=0)**2
ch2 = (c2 + h2)**.5

df = pd.concat([chv, ch1, ch2]).sort_index()

print (df)
                           0         1         2         3
first second third                                        
C     one    mean   7.000000  5.000000  8.000000  3.000000
             std    0.000000  4.000000  4.000000  4.000000
      three  mean   4.000000  2.000000  1.000000  6.000000
             std    8.000000  7.000000  3.000000  3.000000
      two    mean   1.000000  8.000000  2.000000  5.000000
             std    2.000000  2.000000  4.000000  2.000000
CH    one    mean   1.000000  2.000000  1.000000  2.000000
             std    4.000000  7.211103  4.000000  7.211103
      three  mean   1.000000  0.000000 -4.000000  2.000000
             std    8.062258  7.071068  4.242641  3.000000
      two    mean  -1.000000  6.000000 -2.000000  3.000000
             std    9.219544  7.280110  4.123106  2.000000
H     one    mean   6.000000  3.000000  7.000000  1.000000
             std    4.000000  6.000000  0.000000  6.000000
      three  mean   3.000000  2.000000  5.000000  4.000000
             std    1.000000  1.000000  3.000000  0.000000
      two    mean   2.000000  2.000000  4.000000  2.000000
             std    9.000000  7.000000  1.000000  0.000000
V     one    mean   9.000000  5.000000  0.000000  5.000000
             std    7.000000  9.000000  1.000000  1.000000
      three  mean   3.000000  0.000000  3.000000  4.000000
             std    1.000000  4.000000  9.000000  2.000000
      two    mean   3.000000  6.000000  3.000000  2.000000
             std    1.000000  3.000000  1.000000  4.000000
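The same steps can be wrapped in a small helper so that other pairs (for example 'C' and 'V') can be combined the same way. This is only a sketch under stated assumptions: the function name `add_combined`, its parameters, and the fixed random seed are hypothetical and do not come from the original answer.

```python
import numpy as np
import pandas as pd


def add_combined(df, a, b, new):
    """Append rows labelled `new`: mean = a - b, std = sqrt(a_std**2 + b_std**2).

    Hypothetical helper; assumes a three-level row index whose third level
    holds the labels 'mean' and 'std'.
    """
    df = df.sort_index()  # slicers need a lexsorted index
    idx = pd.IndexSlice

    def pick(key, stat):
        # Select one group and relabel its first level so that both
        # operands share the same index and align element-wise
        return df.loc[idx[key, :, stat], :].rename(index={key: new}, level=0)

    mean = pick(a, 'mean') - pick(b, 'mean')
    std = (pick(a, 'std') ** 2 + pick(b, 'std') ** 2) ** 0.5
    return pd.concat([df, mean, std]).sort_index()


iterables = [['C', 'H', 'V'], ['one', 'two', 'three'], ['mean', 'std']]
midx = pd.MultiIndex.from_product(iterables, names=['first', 'second', 'third'])
rng = np.random.default_rng(0)  # fixed seed is an assumption, for reproducibility
chv = pd.DataFrame(rng.integers(0, 10, size=(18, 4)), index=midx)

out = add_combined(chv, 'C', 'H', 'CH')
print(out.loc['CH'])
```

Renaming before the arithmetic matters: without it, the 'C' and 'H' rows carry different index labels and would not line up when subtracted or added.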
