Why is pandas read_excel not reading my xls file correctly?

Question

I am trying to open an xls file with pandas using the following code:

import pandas as pd

frame = pd.read_excel('15_6z_12N_11.xlsx', skiprows=3)
df = pd.DataFrame(frame)  # redundant: read_excel already returns a DataFrame here
# pd.read_excel('your_excel.xlsx', skip_blank_lines=False)

print(df)

and the output is:

     Unnamed: 0  185  ...  Unnamed: 254  Unnamed: 255
0           NaN  NaN  ...           NaN           NaN
1           NaN  NaN  ...           NaN           NaN
2           NaN  NaN  ...           NaN           NaN
3           NaN  NaN  ...           NaN           NaN
4           NaN  NaN  ...           NaN           NaN
..          ...  ...  ...           ...           ...
993         NaN  NaN  ...           NaN           NaN
994         NaN  NaN  ...           NaN           NaN
995         NaN  NaN  ...           NaN           NaN
996         NaN  NaN  ...           NaN           NaN
997         NaN  NaN  ...           NaN           NaN

whereas my file contains the following data: [image: Data from xls]

Any idea why the output is incorrect? Thanks.

Here is the xls file. Sorry, it is in Russian.

Answer 1

Try this:

df = pd.read_excel('15_6z_12N_11.xlsx', header=[0, 1, 2])  # read the file, using the first 3 rows as the header

Answer 2

First, create the DataFrame by specifying the sheet name, skipping the first 3 rows, and converting the next 3 rows to a MultiIndex:

df = pd.read_excel('15_6z_12N_11.xls', sheet_name='PRINT', skiprows=3, header=[0,1,2])

Then it is possible to flatten the MultiIndex while removing the Unnamed strings:

df.columns = ['_'.join(str(y) for y in x if 'Unnamed' not in str(y)) for x in df.columns.tolist()]
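
To see what the flattening step does without needing the Excel file, here is a minimal sketch in plain Python. The helper name flatten_columns and the sample header tuples are hypothetical and are not taken from the original workbook. They only imitate what df.columns.tolist() might look like after reading a 3-row header, where pandas fills empty header cells with labels containing "Unnamed".

```python
def flatten_columns(columns, sep="_", placeholder="Unnamed"):
    """Join each header tuple into one string, skipping placeholder levels."""
    flat = []
    for levels in columns:
        # str() guards against non-string labels such as numeric header cells.
        parts = [str(level) for level in levels if placeholder not in str(level)]
        flat.append(sep.join(parts))
    return flat


# Hypothetical header tuples, shaped like df.columns.tolist() for a 3-row header.
columns = [
    ("Group A", "Unnamed: 1_level_1", "Unnamed: 1_level_2"),
    ("Group A", "Sub 1", "x"),
    ("Group B", "Unnamed: 3_level_1", 185),
]

flat = flatten_columns(columns)
print(flat)
```

On a real DataFrame the same idea would presumably be applied as df.columns = flatten_columns(df.columns.tolist()). Note that a column whose levels are all "Unnamed" would end up with an empty name, so it may be worth checking for duplicates afterwards.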

2 answers

Answer 1: 0, 2020-05-20 06:37:56
Answer 2: 0, ACCEPTED, 2020-05-20 06:53:19
