
How can I make this Python program (using openpyxl) run faster?

Here is my code:

import openpyxl
import os

os.chdir('c:\\users\\Desktop')
wb= openpyxl.load_workbook(filename= 'excel.xlsx',data_only = True)
wb.create_sheet(index=0,title='Summary')
sumsheet= wb.get_sheet_by_name('Summary')
print('Creating Summary Sheet')
#loop through worksheets
print('Looping Worksheets')
for sheet in wb.worksheets:
    for row in sheet.iter_rows():
        for cell in row:
            #find headers of columns needed
            if cell.value=='LowLimit':
                lowCol=cell.column
            if cell.value=='HighLimit':
                highCol=cell.column
            if cell.value=='MeasValue':
                measCol=cell.column

            #name new columns
            sheet['O1']='meas-low'
            sheet['P1']='high-meas'
            sheet['Q1']='Minimum'
            sheet['R1']='Margin'

            #find how many rows of each sheet
            maxrow=sheet.max_row
            i=0

            #subtraction using max row
            for i in range(2,maxrow+1):
                if sheet[str(highCol)+str(i)].value=='---':
                    sheet['O'+str(i)]='='+str(measCol)+str(i)+'-'+str(lowCol)+str(i)
                    sheet['P'+str(i)]='=9999'
                    sheet['Q'+str(i)]='=MIN(O'+str(i)+':P'+str(i)+')'
                    sheet['R'+str(i)]='=IF(AND(Q'+str(i)+'<3,Q'+str(i)+'>-3),"Marginal","")'
                elif sheet[str(lowCol)+str(i)].value=='---':
                    sheet['O'+str(i)]='=9999'
                    sheet['P'+str(i)]='='+str(highCol)+str(i)+'-'+str(measCol)+str(i)
                    sheet['Q'+str(i)]='=MIN(O'+str(i)+':P'+str(i)+')'
                    sheet['R'+str(i)]='=IF(AND(Q'+str(i)+'<3,Q'+str(i)+'>-3),"Marginal","")'
                else:
                    sheet['O'+str(i)]='='+str(measCol)+str(i)+'-'+str(lowCol)+str(i)
                    sheet['P'+str(i)]='='+str(highCol)+str(i)+'-'+str(measCol)+str(i)
                    sheet['Q'+str(i)]='=MIN(O'+str(i)+':P'+str(i)+')'
                    sheet['R'+str(i)]='=IF(AND(Q'+str(i)+'<3,Q'+str(i)+'>-3),"Marginal","")'

            ++i


print('Saving new wb')
import os 
os.chdir('C:\\Users\\hpj683\\Desktop')
wb.save('example.xlsx')

This runs perfectly fine, except that it takes 4 minutes to process a single Excel workbook. Is there any way I can optimize my code to make it run faster? My research online suggested switching to read_only or write_only mode, but my code needs to both read from and write to the workbook, so neither of those worked.

The code could benefit from being broken down into separate functions. This will help you identify the slow parts and replace them one at a time.

The following should not be inside the loop for every row:

  • finding the headers
  • calling ws.max_row, which is very expensive
  • ws["C" + str(i)]: use ws.cell(row=i, column=3) instead
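The three points above could be applied roughly as follows. This is only a sketch under stated assumptions, not the asker's final code: it assumes the headers sit in row 1 of each sheet, it keeps the output columns O to R from the question, and the function names (find_header_columns, add_margin_formulas, process_workbook) are made up for illustration. The Summary sheet from the question is left out.

```python
def find_header_columns(ws, names, header_row=1):
    """Scan the header row once and return {header name: 1-based column index}.

    Assumption: the headers are in a single row (row 1 by default).
    """
    wanted = set(names)
    found = {}
    for col in range(1, ws.max_column + 1):
        value = ws.cell(row=header_row, column=col).value
        if value in wanted:
            found[value] = col
    return found


def add_margin_formulas(ws):
    """Write the meas-low / high-meas / Minimum / Margin columns for one sheet."""
    cols = find_header_columns(ws, ("LowLimit", "HighLimit", "MeasValue"))
    if len(cols) < 3:
        return  # assumption: sheets missing any of the three headers are skipped
    low_col, high_col, meas_col = cols["LowLimit"], cols["HighLimit"], cols["MeasValue"]

    max_row = ws.max_row  # read once per sheet, not once per cell

    # Columns O..R are columns 15..18, as in the question.
    for col, title in zip(range(15, 19), ("meas-low", "high-meas", "Minimum", "Margin")):
        ws.cell(row=1, column=col, value=title)

    for i in range(2, max_row + 1):
        low = ws.cell(row=i, column=low_col)
        high = ws.cell(row=i, column=high_col)
        meas = ws.cell(row=i, column=meas_col)

        # Same branch order as the question: '---' marks a missing limit.
        if high.value == "---":
            o, p = f"={meas.coordinate}-{low.coordinate}", "=9999"
        elif low.value == "---":
            o, p = "=9999", f"={high.coordinate}-{meas.coordinate}"
        else:
            o = f"={meas.coordinate}-{low.coordinate}"
            p = f"={high.coordinate}-{meas.coordinate}"

        ws.cell(row=i, column=15, value=o)
        ws.cell(row=i, column=16, value=p)
        ws.cell(row=i, column=17, value=f"=MIN(O{i}:P{i})")
        ws.cell(row=i, column=18, value=f'=IF(AND(Q{i}<3,Q{i}>-3),"Marginal","")')


def process_workbook(src, dst):
    """Load src, add the formula columns to every sheet, and save to dst."""
    # Third-party import kept local so the helpers above only need a
    # worksheet-like object with cell(), max_row and max_column.
    import openpyxl

    wb = openpyxl.load_workbook(filename=src, data_only=True)
    for ws in wb.worksheets:
        add_margin_formulas(ws)
    wb.save(dst)
```

With this layout each sheet is walked once, so something like process_workbook('excel.xlsx', 'example.xlsx') does the same work as the original without repeating it for every cell.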

And if the nested loop is not a formatting error, then why is it nested?

You should also look at the profile module to find out what is slow. You might want to watch my talk on profiling openpyxl from last year's PyCon UK.
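As a rough illustration of the profiling step, the standard library's cProfile and pstats modules can wrap a single function call and report where the time goes. The helper name profile_call and the slow_example function are hypothetical stand-ins; in practice you would pass in whichever function does the workbook processing.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time and keep the ten most expensive entries.
    stats.sort_stats("cumulative").print_stats(10)
    return result, buffer.getvalue()


def slow_example(n):
    # Hypothetical stand-in for the workbook-processing code.
    return sum(i * i for i in range(n))
```

Calling result, report = profile_call(slow_example, 100000) and printing report lists the calls ordered by cumulative time, which should make an expensive call repeated inside a loop stand out.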

Good luck!

