
How can I read any Excel file in a folder with pandas?

I have a folder d:/data/input that contains an Excel file. I want to read that file into a DataFrame with pandas WITHOUT specifying the file name. Is that possible?

Thanks for your help.

If it's the only Excel file in the folder, you could do something like:

from pathlib import Path
import pandas as pd

# Take the first .xlsx file that glob finds in the folder
fn = list(Path("D:/data/input").glob("*.xlsx"))[0]
df = pd.read_excel(fn)

If there's more than one file there, it'll end up just picking one of them arbitrarily, so probably not ideal.
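One way to avoid silently picking an arbitrary file is to check how many matches there are before reading. The following is only a minimal sketch: `find_single_excel` and its `pattern` parameter are hypothetical names made up for illustration, not part of pandas or pathlib.

```python
from pathlib import Path


def find_single_excel(folder, pattern="*.xlsx"):
    """Return the only file in `folder` matching `pattern`, or raise.

    Hypothetical helper for illustration; not a pandas or pathlib API.
    """
    matches = sorted(Path(folder).glob(pattern))
    if not matches:
        # Nothing to read: fail with a clear message instead of an IndexError
        raise FileNotFoundError(f"no file matching {pattern!r} in {folder}")
    if len(matches) > 1:
        # More than one candidate: refuse to guess which one was meant
        names = ", ".join(p.name for p in matches)
        raise ValueError(f"expected one Excel file, found {len(matches)}: {names}")
    return matches[0]
```

The returned path can then be passed straight to pandas, for example `df = pd.read_excel(find_single_excel("D:/data/input"))`.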

If there are multiple Excel files, we could pick the one with the latest creation time:

from pathlib import Path
import pandas as pd

def get_latest_excel(src):
    """Return the .xlsx file in src with the latest st_ctime."""
    # st_ctime is the creation time on Windows; on Unix it is the last metadata change
    return max(Path(src).glob('*.xlsx'), key=lambda f: f.stat().st_ctime)


file = get_latest_excel(r"d:/data/input")
print(repr(file))
# e.g. WindowsPath('d:/data/input/new_excel_file.xlsx')
df = pd.read_excel(file)

Whatever is in the folder, you can load a single Excel file or as many as you want with the approach below. It is not the most elegant, but it works and it is flexible:

import pandas as pd
from os import listdir
from os.path import isfile, join

sourcePath = r"d:/data/input"
extensionFile = r".xlsx"

# Get all files in the path
fileNameList = [f for f in listdir(sourcePath) if isfile(join(sourcePath, f))]
# Keep only Excel (.xlsx) files
fileNameList = [x for x in fileNameList if x.endswith(extensionFile)]


# To load only the first file
print(fileNameList[0], "is loaded!")  # print the name for reference if you want to
df = pd.read_excel(join(sourcePath, fileNameList[0]))

# To load all files into a list
dataframeList = []
for fn in fileNameList:
    dataframeList.append(pd.read_excel(join(sourcePath, fn)))

# To access each DataFrame by position (if the folder has that many files)
fileNameList[0]   # name of the first file, for reference
dataframeList[0]  # the first DataFrame
# the second
fileNameList[1]
dataframeList[1]
# the third
fileNameList[2]
dataframeList[2]
# and so on for the remaining indexes
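Keeping names and DataFrames in two parallel lists means looking things up by position. A possible alternative is a dict keyed by file name, sketched below. `load_excel_folder` and its `reader` parameter are hypothetical names introduced here for illustration; `reader` defaults to `pd.read_excel` and exists only so the helper can be tried out with a stand-in function.

```python
from pathlib import Path


def load_excel_folder(folder, reader=None):
    """Read every .xlsx file in `folder` into a dict keyed by file name.

    Hypothetical helper for illustration; `reader` is an assumed hook that
    defaults to pandas.read_excel.
    """
    if reader is None:
        import pandas as pd  # imported lazily, only when the default reader is needed
        reader = pd.read_excel
    # sorted() makes the order of the dict predictable across runs
    return {path.name: reader(path) for path in sorted(Path(folder).glob("*.xlsx"))}
```

With this, `frames = load_excel_folder(r"d:/data/input")` gives access by name, such as `frames["new_excel_file.xlsx"]`. If the files share the same columns, `pd.concat(frames)` should combine them into one DataFrame, with the file names as the outer index level.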

