简体   繁体   中英

read a specific cells from multiple excel spreadsheets into single pandas dataframe

I would like to read specific cells from multiple excel spreadsheets into single pandas dataframe.

so far, I have tried this. (without a success)

import pandas as pd
import glob
import xlrd

file_list = glob.glob("*.xls")

df = pd.DataFrame()

for f in file_list:
    wb = xlrd.open_workbook(f)
    sheet = wb.sheet_by_index(0)
    name = sheet.cell_value(rowx=9, colx=2)
    city = sheet.cell_value(rowx=15, colx=2)
    df = df.append([name,city])

Desired output is pandas dataframe as this

name   city
Tom    NY
Alex   Toronto
Anne   Atlanta
...    ...

Thanks

I think you need two sets of [[]] around what is being appended. With one set of brackets, it tries to add name as a row and city as a row, rather than as columns in the same row.

import pandas as pd
import glob
import xlrd

file_list = glob.glob("*.xls")

df = pd.DataFrame()

for f in file_list:
    wb = xlrd.open_workbook(f)
    sheet = wb.sheet_by_index(0)
    name = sheet.cell_value(rowx=9, colx=2)
    city = sheet.cell_value(rowx=15, colx=2)
    df = df.append([[name,city]])

This will have columns named 0 and 1 , though (since you didn't define names in the creation of the dataframe), so a last step would be to rename those:

df = df.rename(columns={0:'name',1:'city'})

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM