
pandas read_excel: usecols with list input generates an empty DataFrame

I want to read only a specific column from an Excel file into a pandas DataFrame. My code is:

import pandas as pd
file = pd.read_excel("3_Plants sorted on PLF age cost.xlsx",sheet_name="Age>25",index_col="Developer",usecols="Name of Project")

but I get an empty DataFrame as output. However, when I use

import pandas as pd
file = pd.read_excel("3_Plants sorted on PLF age cost.xlsx",sheet_name="Age>25",index_col="Developer",usecols=2)

I get the desired result.

I have to do this for many files in a loop, and the position of the column changes from file to file, so I have to select it by name rather than by position.

Also, I can't load the full file into a DataFrame and use df["column_name"], because my Excel file is very large (150 MB); that makes the process very slow and sometimes raises a memory error.

Thanks in advance.

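As mentioned by Tomas Farias, usecols doesn't take a header cell value as a plain string: a string passed to usecols is interpreted as Excel column letters or ranges (for example "A:C"), so usecols="Name of Project" does not select the column with that header. A possible approach is to read a few rows, find the position of the column, and then read the file a second time.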

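import pandas as pd
path = "3_Plants sorted on PLF age cost.xlsx"
# First pass: read only a couple of rows to learn where the columns are
col = pd.read_excel(path, sheet_name="Age>25", nrows=2).columns
# 0-based positions of the index column and of the wanted column
k = [col.get_loc("Developer"), col.get_loc("Name of Project")]
# Second pass: read only those columns (recent pandas versions expect a list here, not a single int)
file = pd.read_excel(path, sheet_name="Age>25", index_col="Developer", usecols=k)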
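If your pandas version is recent enough (0.24 or later, as far as I know), usecols in read_excel also accepts a list of header names, which avoids reading the file twice. Below is a minimal sketch under that assumption. The helper names columns_to_keep and read_named_columns are made up for illustration and are not part of pandas. The index column is added to the list so that index_col can still find it.

```python
def columns_to_keep(index_col, wanted):
    """Build the list of header names to pass as `usecols`.

    The index column comes first and duplicates are dropped, so the
    caller may or may not repeat the index column in `wanted`.
    """
    names = [index_col]
    for name in wanted:
        if name not in names:
            names.append(name)
    return names


def read_named_columns(path, sheet_name, index_col, wanted):
    """Read only the named columns of one sheet (hypothetical helper)."""
    # pandas is imported here so that columns_to_keep stays dependency-free.
    import pandas as pd

    return pd.read_excel(
        path,
        sheet_name=sheet_name,
        index_col=index_col,
        # A list of strings is treated as header names; a single
        # string would be treated as Excel column letters instead.
        usecols=columns_to_keep(index_col, wanted),
    )


# Possible usage inside a loop over files (paths are whatever you have):
# file = read_named_columns("3_Plants sorted on PLF age cost.xlsx",
#                           "Age>25", "Developer", ["Name of Project"])
```

Since the selection is by name, the same call should work for every file in the loop even when the column positions differ.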

You can save/convert the .xlsx file to .csv and then use: pd.read_csv('filename.csv', usecols=[]), listing the wanted column names in usecols.
