
How can I improve the runtime of reading multiple Excel files using Python?

I created a function that iterates over a folder containing Excel files and builds a list of all the headers across all sheets. It works fine but is VERY slow. Do you have any ideas on how to improve it? Thanks!

import glob
import pandas as pd

# file directory
path = r'C:\Users\John\Excel_folder'
all_files = glob.glob(path + "/*.xlsx")

# collected header names from every sheet of every file
col = []

def get_columns(file):
    # look up the sheet names, then re-read the file once per sheet
    sheets = pd.ExcelFile(file).sheet_names
    for sheet in sheets:
        for i in pd.read_excel(file, sheet_name=sheet, nrows=0).columns:
            col.append(i)

for i in all_files:
    get_columns(i)

col

You can pass None to sheet_name in read_excel to read all sheets at once. This returns a dictionary of DataFrames keyed by sheet name, so you can collect the columns with a list comprehension.

def get_columns(file):
    return [c 
            for df in pd.read_excel(file, 
                                    sheet_name=None, 
                                    nrows=0).values() 
            for c in df.columns]

col = [c for file in all_files for c in get_columns(file)]

It should be faster because each file is opened once, instead of once for the sheet names and again for every sheet.
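If you would rather keep an explicit loop over the sheets, the same "open each file once" idea can probably be expressed with pd.ExcelFile as a context manager and its parse method. This is only a minimal sketch: the names get_columns_single_open and collect_columns are hypothetical helpers, not part of pandas, and I have not timed it against the version above.

```python
import glob
import os

import pandas as pd


def get_columns_single_open(file):
    """Return the header names of every sheet in `file`, opening it once.

    Hypothetical helper name, used here for illustration only.
    """
    columns = []
    # The context manager closes the workbook when the block ends.
    with pd.ExcelFile(file) as xls:
        for sheet in xls.sheet_names:
            # nrows=0 asks pandas for the header row only, no data rows.
            columns.extend(xls.parse(sheet_name=sheet, nrows=0).columns)
    return columns


def collect_columns(folder):
    """Gather the headers of all .xlsx files in `folder` (hypothetical helper)."""
    files = sorted(glob.glob(os.path.join(folder, "*.xlsx")))
    return [c for f in files for c in get_columns_single_open(f)]
```

With the folder from the question, you would call it as col = collect_columns(path). The workbook is still parsed sheet by sheet, so any gain should come from not reopening the file for every sheet.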

