How to use Pandas to update multiple Excel files and save the updated files without combining them
I have multiple Excel files, each containing several sheets. I need to remove some specific columns from a particular sheet, and I need to do the same for all the files. After that I need to save all the edited files without combining them. I have done this for one file; I need a macro so that I can apply it to all the files. I have prepared the code for one file:
import os
import pandas as pd
from openpyxl import load_workbook
book = load_workbook('file1.xlsx')
sheet = book['sheet1']
# the update needed is to delete some columns
# delete_cols(idx, amount): removes 5 columns starting at column 3
sheet.delete_cols(3, 5)
book.save('file1_copy.xlsx')
As you've already written the code, all that is left is to iterate over your files and apply it to each one in turn.
I would do this as follows:
from openpyxl import load_workbook
from pathlib import Path
excel_files = Path('excel_file_location').glob('*.xlsx')
for file in excel_files:
    book = load_workbook(file)
    sheet = book['sheet1']
    # the update needed is to delete some columns
    # delete_cols(idx, amount): removes 5 columns starting at column 3
    sheet.delete_cols(3, 5)
    # save in the same location with the _copy suffix attached.
    # /home/file/excel_1.xlsx -> /home/file/excel_1_copy.xlsx
    book.save(file.with_name(f"{file.stem}_copy{file.suffix}"))
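The naming rule in the comment above (/home/file/excel_1.xlsx -> /home/file/excel_1_copy.xlsx) can be checked on its own with pathlib, without opening any workbook. The following is only a minimal sketch: the helper names copy_path, is_copy and files_to_process are hypothetical, and it assumes that any file whose stem already ends in _copy is the output of an earlier run and should be skipped, so that running the script twice does not produce excel_1_copy_copy.xlsx.

```python
from pathlib import Path

COPY_SUFFIX = "_copy"  # assumption: this marker identifies files written by an earlier run


def copy_path(file: Path) -> Path:
    """Return the sibling path with COPY_SUFFIX inserted before the extension."""
    # with_name() keeps the parent folder and replaces only the file name,
    # so the path separator between folder and name is never lost.
    return file.with_name(f"{file.stem}{COPY_SUFFIX}{file.suffix}")


def is_copy(file: Path) -> bool:
    """True if the file name (without extension) already ends in COPY_SUFFIX."""
    return file.stem.endswith(COPY_SUFFIX)


def files_to_process(folder: Path) -> list:
    """List the .xlsx files in folder, leaving out earlier copies."""
    # sorted() materialises the glob before any new file is written to the folder.
    return sorted(f for f in folder.glob("*.xlsx") if not is_copy(f))
```

In the loop, excel_files could then be replaced by files_to_process(Path('excel_file_location')) and the save target by copy_path(file).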
You can use a list with all the files and then loop over that list. That will run the code in the for loop for each item in the list. You'll also have to have the code adjust the new save name.
So you'd have something like:
from openpyxl import load_workbook
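filenames = ['file1.xlsx', 'file2.xlsx', 'file3.xlsx']  # ... and the rest of your files
for filename in filenames:
    # the list only holds names, so each workbook has to be opened first
    book = load_workbook(filename)
    sheet = book['sheet1']
    # delete_cols(idx, amount): removes 5 columns starting at column 3
    sheet.delete_cols(3, 5)
    # file1.xlsx -> file1_copy.xlsx (split on the last dot only)
    name, extension = filename.rsplit('.', 1)
    savename = name + '_copy.' + extension
    book.save(savename)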
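One detail worth checking before running this over many files is what the two arguments of delete_cols mean. In openpyxl the signature is delete_cols(idx, amount=1), so the call used here removes five columns starting at the third one; it does not remove "columns 3 to 5". A small sketch on an in-memory workbook illustrates this. The helper name headers_after_delete and the single-letter headers are made up for the illustration.

```python
from openpyxl import Workbook


def headers_after_delete(headers, idx, amount):
    """Write one header row, delete columns, and return the headers that remain."""
    book = Workbook()          # in-memory workbook, nothing is written to disk
    sheet = book.active
    sheet.append(headers)      # row 1 holds the headers
    sheet.delete_cols(idx, amount)
    # read row 1 back as plain values
    return list(next(sheet.iter_rows(min_row=1, max_row=1, values_only=True)))


# Eight columns A..H; deleting 5 columns starting at column 3 removes C..G.
print(headers_after_delete(list("ABCDEFGH"), 3, 5))  # ['A', 'B', 'H']
```

If the intent is to drop only columns 3, 4 and 5, the call would be delete_cols(3, 3) instead.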