
Pandas MultiIndex with None values

I am using a MultiIndex, with data coming from a database. Some of the values that I want to use as keys are null, and I have found that this results in the data for those keys being omitted. For example:

import numpy as np
import pandas as pd
import sys

print(sys.version)  # 3.7.3
print(pd.__version__) # 1.0.3
idx = pd.MultiIndex.from_tuples([('A', 'a'), ('A', 'b'), ('B', 'a'), ('B', ' '), ('C', 'a'), ('C', None), ('D', '')], names=['Level 1', 'Level 2'])
print(idx)
d = {'X':{('A','a'):1, ('A','b'):2, ('B','a'):3, ('B',' '):4, ('C','a'): 5, ('C',None): 6, ('D',''):7},
'Y':{('A','a'):1, ('C',None): 6, ('D',''):7}
}
df = pd.DataFrame(data=d, index=idx)
print(df)

The result is:

MultiIndex([('A', 'a'),
            ('A', 'b'),
            ('B', 'a'),
            ('B', ' '),
            ('C', 'a'),
            ('C', nan),
            ('D',  '')],
           names=['Level 1', 'Level 2'])
                   X    Y
Level 1 Level 2
A       a        1.0  1.0
        b        2.0  NaN
B       a        3.0  NaN
                 4.0  NaN
C       a        5.0  NaN
        NaN      NaN  NaN
D                7.0  7.0

My problem is the C/None row, which gives me NaN instead of 6. Other blankish values (empty string, space) don't have this behavior.

Is this to be expected or do I need to configure the MultiIndex in a certain way?

Indexing is not safe when the index contains NaN, and None is stored as NaN in the MultiIndex (as the printed index above shows): github1 github2

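As a simple fix, you can load your data into a DataFrame with the key fields as ordinary columns, call fillna on those columns to replace the nulls with a placeholder, and then call set_index to turn them back into the index.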
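
A minimal sketch of that fix, assuming the rows arrive with the two key fields as regular columns. The placeholder string "<missing>" and the column layout are illustrative choices, not something the original post specifies:

```python
import pandas as pd

# Rows as they might come back from the database: two key fields, then the values.
rows = [
    ("A", "a", 1, 1),
    ("A", "b", 2, None),
    ("B", "a", 3, None),
    ("B", " ", 4, None),
    ("C", "a", 5, None),
    ("C", None, 6, 6),   # null key that was lost in the original example
    ("D", "", 7, 7),
]
df = pd.DataFrame(rows, columns=["Level 1", "Level 2", "X", "Y"])

# Replace nulls in the key columns only, so missing data values in X and Y stay NaN.
key_cols = ["Level 1", "Level 2"]
df[key_cols] = df[key_cols].fillna("<missing>")  # hypothetical placeholder

# Rebuild the MultiIndex from the cleaned key columns.
df = df.set_index(key_cols)

print(df)
print(df.loc[("C", "<missing>"), "X"])
```

Because the placeholder is an ordinary string, the former C/None row keeps its values and can be looked up like any other key. Pick a placeholder that cannot collide with a real key value in your data.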
