Read specific cells from multiple Excel spreadsheets into a single pandas DataFrame
I would like to read specific cells from multiple Excel spreadsheets into a single pandas DataFrame.
So far, I have tried this, without success:
import pandas as pd
import glob
import xlrd

file_list = glob.glob("*.xls")
df = pd.DataFrame()
for f in file_list:
    wb = xlrd.open_workbook(f)
    sheet = wb.sheet_by_index(0)
    name = sheet.cell_value(rowx=9, colx=2)
    city = sheet.cell_value(rowx=15, colx=2)
    df = df.append([name,city])
The desired output is a pandas DataFrame like this:
name city
Tom NY
Alex Toronto
Anne Atlanta
... ...
Thanks
I think you need two sets of brackets, [[ ]], around what is being appended. With one set of brackets, it tries to add name as a row and city as a row, rather than as columns in the same row.
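The row-versus-column difference can be checked without any Excel files. The sketch below uses hypothetical sample values and the plain DataFrame constructor, so that it does not depend on DataFrame.append (which was removed in pandas 2.0); the same shape rule applies to a list passed either way.

```python
import pandas as pd

# Hypothetical sample values standing in for the two cells read from a sheet.
name, city = "Tom", "NY"

# One set of brackets: a flat list becomes two rows in a single column.
flat = pd.DataFrame([name, city])

# Two sets of brackets: a list holding one list becomes one row with two columns.
nested = pd.DataFrame([[name, city]])

print(flat.shape)    # (2, 1)
print(nested.shape)  # (1, 2)
```

Applied to the loop from the question, the fix looks like this: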
import pandas as pd
import glob
import xlrd

file_list = glob.glob("*.xls")
df = pd.DataFrame()
for f in file_list:
    wb = xlrd.open_workbook(f)
    sheet = wb.sheet_by_index(0)
    name = sheet.cell_value(rowx=9, colx=2)
    city = sheet.cell_value(rowx=15, colx=2)
    df = df.append([[name,city]])
This will have columns named 0 and 1, though (since you didn't define names when creating the dataframe), so a last step would be to rename those:
df = df.rename(columns={0:'name',1:'city'})
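On pandas 2.0 and later, DataFrame.append no longer exists, so the loop above raises an AttributeError there. One possible alternative, sketched here rather than taken from the original answer, is to collect one (name, city) tuple per file in an ordinary list and build the frame once, with the column names set at creation so no rename is needed. The helper names read_cells and rows_to_frame are hypothetical; the cell positions are the ones from the question.

```python
import glob

import pandas as pd


def read_cells(path):
    """Return (name, city) from the first sheet of one .xls workbook."""
    # Imported here so the rest of the sketch loads even where xlrd is absent.
    import xlrd

    wb = xlrd.open_workbook(path)
    sheet = wb.sheet_by_index(0)
    # Same zero-based cell positions as in the question.
    name = sheet.cell_value(rowx=9, colx=2)
    city = sheet.cell_value(rowx=15, colx=2)
    return name, city


def rows_to_frame(rows):
    """Build the result in one step, naming the columns up front."""
    return pd.DataFrame(rows, columns=["name", "city"])


# One tuple per workbook found in the current directory.
rows = [read_cells(f) for f in glob.glob("*.xls")]
df = rows_to_frame(rows)
```

Building the frame once also avoids copying the whole DataFrame on every iteration, which is what repeated appends did.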