Python: loop through Excel sheets and write to csv
I have a very large dataset (> 100 GB). It consists of many Excel files (.xlsx), and each xlsx file has many sheets. The data in each sheet is shown in the picture below.
I would like to combine these sheets into a csv file, and change this wide format to a long format so that:
What would be the most effective way to do this? I have code to loop through the files and sheets, but I cannot transpose the (wide) data to the long format that I am after. Below is my attempt at the loop:
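import csv
from os import listdir
from os.path import isfile, join

import xlrd  # note: xlrd 2.0+ reads only .xls; .xlsx needs xlrd < 2.0 or another reader such as openpyxl

mypath = "E:/data_download/Python_test_files/"
# keep regular files only (skip sub-directories)
file_lists = [f for f in listdir(mypath) if isfile(join(mypath, f))]

for file in file_lists:
    book = xlrd.open_workbook(join(mypath, file))
    sheet_names = book.sheet_names()
    print(sheet_names)
    for sheet in book.sheets():
        for row in sheet.get_rows():
            pass  # the wide-to-long conversion is the missing step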
Taking things step by step (and keep in mind that, for the process to be as fast as possible, you should use native Python as much as you can and only reach for other libraries when you absolutely have to): you want one csv file out of all those sheets. First build a 2D list of all the rows from all the sheets, constructed in the long format you described, and then write that list to the csv file with the DataFrame class from the pandas library:
import pandas as pd

my_list = [...]  # your 2D list containing the rows
# the names of your columns
dataset = pd.DataFrame(my_list, columns=['column1', 'column2', '...'])
dataset.to_csv('/PATH/file.csv')
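The step of building the long-format rows is not spelled out above, so here is a minimal sketch of one way it could look in native Python. The sheet layout is an assumption, since the picture from the question is not available: the first row is taken to be a header, the first column an identifier, and every other column a value that should become its own row. The names `wide_to_long`, `id_cols`, `variable` and `value` are hypothetical. Because the dataset is larger than 100 GB, the sketch uses a generator and `csv.writer` so rows can be written as they are produced, rather than collected into one big list first.

```python
import csv
import io


def wide_to_long(rows, id_cols=1):
    """Yield long-format rows from an iterable of wide rows.

    Assumption: the first row is a header and the first `id_cols`
    columns identify the record. Every remaining column becomes one
    output row of the form [*ids, column_name, value].
    """
    rows = iter(rows)
    header = next(rows, None)
    if header is None:  # empty sheet: nothing to yield
        return
    var_names = header[id_cols:]
    for row in rows:
        ids = list(row[:id_cols])
        for name, value in zip(var_names, row[id_cols:]):
            yield ids + [name, value]


# Hypothetical sheet contents standing in for the rows of one sheet
sheet_rows = [
    ["id", "2001", "2002"],
    ["a", 1, 2],
    ["b", 3, 4],
]

long_rows = list(wide_to_long(sheet_rows))

# Write to an in-memory buffer here; a real run would open a file instead
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["id", "variable", "value"])
writer.writerows(long_rows)
```

In the real loop, the rows of each sheet would be passed through `wide_to_long` and handed straight to `writer.writerows`, so that only one sheet needs to be in memory at a time.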