
Missing values in pandas column MultiIndex

I am reading Excel sheets like this one with pandas:

[image: the Excel sheet]

using

df = pd.read_excel('./question.xlsx', sheet_name = None, header = [0,1])

which results in a DataFrame whose columns are a MultiIndex.

[image: the resulting DataFrame]

The problem is that the empty header cells are filled by default with 'Title', whereas I would prefer to use a distinct label. I cannot skip the first row, since I am dealing with larger data frames in which the first and second rows contain repeating labels (hence the use of the MultiIndex).

Your help will be much appreciated.

Assuming that you want empty strings instead of the repeated first label, you can read the two header rows on their own and build the MultiIndex directly:

import pandas as pd
# Read only the two header rows; empty cells come back as NaN and are replaced with ''
df1 = pd.read_excel('./question.xlsx', header = None, nrows=2).fillna('')
index = pd.MultiIndex.from_arrays(df1.values)

This gives:

MultiIndex([('Title',        '#'),
            (     '',    'Price'),
            (     '', 'Quantity')],
           )

By the way, if you want a different label for the empty fields, just pass it as the argument to fillna.
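As a rough illustration of that variation, here is a minimal sketch that runs without the Excel file. The `header_rows` frame is a hand-built stand-in for what `pd.read_excel(..., header=None, nrows=2)` is assumed to return for this sheet, and the label `'Untitled'` is a hypothetical placeholder, not something from the question.

```python
import numpy as np
import pandas as pd

# Stand-in for the two raw header rows of the sheet (assumption):
# NaN marks the cells that are empty in Excel.
header_rows = pd.DataFrame(
    [["Title", np.nan, np.nan],
     ["#", "Price", "Quantity"]]
)

# 'Untitled' is a hypothetical label; any string can be passed to fillna.
columns = pd.MultiIndex.from_arrays(header_rows.fillna("Untitled").values)

print(columns.tolist())
```

Each row of the filled frame becomes one level of the MultiIndex, so the placeholder only ever appears in the first level here.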

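Then read the remaining data and assign the column labels by hand:

# Read the data rows, skipping the two header rows
df1 = pd.read_excel('./question.xlsx', header = None, skiprows=2)
# Attach the MultiIndex built above as the column labels
df1.columns = index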
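The whole approach can be sketched end to end on in-memory data, which may be handy for trying it out without the workbook. The `raw` frame and its values are invented for illustration (an assumption about what the sheet looks like when read with `header=None`); slicing it with `iloc` stands in for the two separate `read_excel` calls.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the sheet read with header=None:
# two header rows followed by made-up data rows.
raw = pd.DataFrame(
    [["Title", np.nan, np.nan],
     ["#", "Price", "Quantity"],
     [1, 9.5, 3],
     [2, 4.0, 10]]
)

# The first two rows become the column MultiIndex, with empty cells as ''.
columns = pd.MultiIndex.from_arrays(raw.iloc[:2].fillna("").values)

# The remaining rows are the data; renumber them from 0 and attach the columns.
data = raw.iloc[2:].reset_index(drop=True)
data.columns = columns

# A column under an empty top-level label is selected with the full tuple.
print(data[("", "Price")].tolist())
```

With a real file, the second `read_excel` call infers column dtypes itself; in this stand-in the columns keep the object dtype because they were sliced from a mixed frame.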

