pandas read_excel: usecols with list input produces an empty DataFrame

Question

I want to read only a specific column from an Excel file into a pandas DataFrame. My code is:

import pandas as pd
file = pd.read_excel("3_Plants sorted on PLF age cost.xlsx",sheet_name="Age>25",index_col="Developer",usecols="Name of Project")

but I get an empty DataFrame as output. However, when I use

import pandas as pd
file = pd.read_excel("3_Plants sorted on PLF age cost.xlsx",sheet_name="Age>25",index_col="Developer",usecols=2)

I get the desired result.

I have to do this for many files in a loop, and the position of the column keeps changing, so I have to select it by name rather than by position.

Furthermore, I can't load the full file into a DataFrame and then use df["column_name"], because my Excel file is too large (150 MB); that makes the process very slow and sometimes raises a memory error.

Thanks in advance.

Answer 1

As mentioned by Tomas Farias, usecols doesn't take cell values. A plain string passed to usecols is interpreted as Excel column letters or ranges such as "A:E", not as a header name. A possible approach is to read a few rows, find the position of the column, and then read the file a second time.

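import pandas as pd
path = "3_Plants sorted on PLF age cost.xlsx"
# First pass: read two rows just to learn the header layout.
cols = pd.read_excel(path, sheet_name="Age>25", nrows=2).columns
# usecols takes a list of 0-based positions; the index column must be included too.
keep = [cols.get_loc("Developer"), cols.get_loc("Name of Project")]
# Second pass: parse only those columns.
file = pd.read_excel(path, sheet_name="Age>25", index_col="Developer", usecols=keep)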
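An alternative to the two-pass approach above, sketched here on the assumption that a reasonably recent pandas is installed (0.24 or later, where usecols is documented to accept a list of header names or a callable): select by name directly. The helper names make_selector and read_columns are made up for illustration. Note that this limits what pandas parses into the DataFrame; the Excel engine may still have to scan the whole sheet, so it is not guaranteed to cure the memory problem on a 150 MB workbook.

```python
import pandas as pd


def make_selector(*names):
    """Return a predicate for `usecols`: True only for the given header names."""
    keep = set(names)
    return lambda header: header in keep


def read_columns(path, sheet_name, index_col, *wanted):
    """Read `index_col` plus the `wanted` columns, wherever they sit in the sheet.

    Hypothetical helper: pandas evaluates a callable `usecols` against each
    header name, so the position of a column in the file does not matter.
    """
    return pd.read_excel(
        path,
        sheet_name=sheet_name,
        index_col=index_col,
        usecols=make_selector(index_col, *wanted),
    )


# Usage with the file from the question (not executed here):
# file = read_columns("3_Plants sorted on PLF age cost.xlsx", "Age>25",
#                     "Developer", "Name of Project")
```

Passing usecols=["Developer", "Name of Project"] as a list should behave the same way; the callable form is shown because it is easy to extend, for example to match headers case-insensitively across many files.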

Answer 2

You can save/convert the .xlsx file to .csv and then use: pd.read_csv('filename.csv', usecols=[]), listing the required column names in usecols.
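A minimal sketch of the CSV route, assuming the sheet has already been exported to CSV with the same headers as in the question. The function name read_csv_columns and the sample data are invented for illustration.

```python
import io

import pandas as pd


def read_csv_columns(source, index_col, wanted):
    """Read only `index_col` and the `wanted` columns from a CSV source."""
    # The index column has to be listed in usecols as well, otherwise
    # pandas cannot find it when building the index.
    return pd.read_csv(source, usecols=[index_col, *wanted], index_col=index_col)


# Made-up sample standing in for the exported sheet.
sample = io.StringIO(
    "Developer,Capacity,Name of Project\n"
    "A,10,P1\n"
    "B,20,P2\n"
)
df = read_csv_columns(sample, "Developer", ["Name of Project"])
print(df)
```

Because the columns are matched by header name, the same call works in a loop over files in which the column sits at different positions.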

Answer 1
0 accepted 2018-07-07 03:55:47

Answer 2
0 2020-06-25 03:50:47