
Why is pandas read_excel not reading my xls file correctly?

I am trying to open an xls file with pandas using the following code:

import pandas as pd

frame = pd.read_excel('15_6z_12N_11.xlsx', skiprows=3)
df = pd.DataFrame(frame)
#pd.read_excel('your_excel.xlsx', , skip_blank_lines=False)

print(df)

and the output is:

     Unnamed: 0  185  ...  Unnamed: 254  Unnamed: 255
0           NaN  NaN  ...           NaN           NaN
1           NaN  NaN  ...           NaN           NaN
2           NaN  NaN  ...           NaN           NaN
3           NaN  NaN  ...           NaN           NaN
4           NaN  NaN  ...           NaN           NaN
..          ...  ...  ...           ...           ...
993         NaN  NaN  ...           NaN           NaN
994         NaN  NaN  ...           NaN           NaN
995         NaN  NaN  ...           NaN           NaN
996         NaN  NaN  ...           NaN           NaN
997         NaN  NaN  ...           NaN           NaN

even though my file contains the following data (screenshot: Data from xls).

Any idea why the output is incorrect? Thanks.

Here is the xls file. Sorry, it is in Russian.

Try this:

df = pd.read_excel('15_6z_12N_11.xlsx', header=[0, 1, 2])  # read the file, using 3 rows as the header

First create the DataFrame by specifying the sheet name, skipping the first 3 rows, and converting the next 3 rows to a MultiIndex of column labels:

df = pd.read_excel('15_6z_12N_11.xls', sheet_name='PRINT', skiprows=3, header=[0,1,2])
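To see how skiprows and header=[0, 1, 2] interact without needing the original workbook, here is a minimal sketch. It uses pd.read_csv on an in-memory string as a stand-in for pd.read_excel, since both accept these two arguments; the sample text and its labels are made up for illustration and are not taken from the asker's file.

```python
import io

import pandas as pd

# Hypothetical sheet contents: 3 junk rows, then 3 header rows, then the data.
raw = (
    "report title\n"
    "generated by some tool\n"
    "page 1\n"
    "Group,Group,Other\n"
    "Sub,Sub,Sub2\n"
    "a,b,c\n"
    "1,2,3\n"
    "4,5,6\n"
)

# skiprows=3 drops the junk rows first; header=[0, 1, 2] is then counted
# from the first remaining row, so the next 3 rows become the column levels.
df = pd.read_csv(io.StringIO(raw), skiprows=3, header=[0, 1, 2])

print(df.columns.nlevels)
print(df)
```

The point is that the header positions are relative to what is left after skiprows, which is why the call above combines skiprows=3 with header=[0, 1, 2] rather than header=[3, 4, 5].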

It is then possible to flatten the MultiIndex, dropping the Unnamed placeholder strings (str() guards against header cells that were read as numbers):

df.columns = ['_'.join(str(y) for y in x if 'Unnamed' not in str(y)) for x in df.columns.tolist()]
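The flattening step can be tried on its own with a small hand-built frame. This is only a sketch: the column labels below are hypothetical and merely mimic the kind of "Unnamed: ..." placeholders pandas generates for empty header cells, and flatten_columns is a helper name introduced here for illustration.

```python
import pandas as pd


def flatten_columns(columns):
    """Join the levels of each column tuple with '_', skipping 'Unnamed' placeholders."""
    flat = []
    for levels in columns.tolist():
        # str() keeps this working if a header cell was parsed as a number
        parts = [str(level) for level in levels if "Unnamed" not in str(level)]
        flat.append("_".join(parts))
    return flat


# Hypothetical three-level header, similar in shape to what header=[0, 1, 2] yields
columns = pd.MultiIndex.from_tuples([
    ("Group", "Unnamed: 0_level_1", "Unnamed: 0_level_2"),
    ("Group", "Sub", "a"),
    ("Other", "Unnamed: 2_level_1", "b"),
])
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=columns)

df.columns = flatten_columns(df.columns)
print(df.columns.tolist())  # ['Group', 'Group_Sub_a', 'Other_b']
```

Note that two columns whose only named levels are identical would end up with the same flat name, so it is worth checking df.columns.is_unique afterwards.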
