
Python: how to optimize this code

I tried to optimize the code below, but I cannot figure out how to improve its computation speed. It takes almost 30 seconds to run, and the time seems to go into the bootsam and filedata matrices. Can someone help me optimize it? Is it possible to improve the performance?

import numpy as np
filedata=np.genfromtxt('monthlydata1970to2010.txt',dtype='str') # this will create a 980 * 7 matrix
nboot=5000
results=np.zeros((11,nboot))   # this will create an 11*5000 matrix
results[0,:]=600
horizon=360
balance=200
bootsam=np.random.randint(984, size=(984, nboot)) # this will create a 984*5000 matrix
for bs in range(0,nboot):
    for mn in range(1,horizon+1):
        if mn%12 ==1:
            bondbal = 24*balance
            sp500bal=34*balance
            russbal = 44*balance
            eafebal=55*balance
            cashbal =66*balance
            bondbal=bondbal*(1+float(filedata[bootsam[mn-1,bs]-1,2]))
            sp500bal=sp500bal*(1+float(filedata[bootsam[mn-1,bs]-1,3]))
            russbal=russbal*(1+float(filedata[bootsam[mn-1,bs]-1,4]))
            eafebal=eafebal*(1+float(filedata[bootsam[mn-1,bs]-1,5]))
            cashbal=cashbal*(1+float(filedata[bootsam[mn-1,bs]-1,6]))
            balance=bondbal + sp500bal + russbal + eafebal + cashbal
        else:
            bondbal=bondbal*(1+float(filedata[bootsam[mn-1,bs]-1,2]))
            sp500bal=sp500bal*(1+float(filedata[bootsam[mn-1,bs]-1,3]))
            russbal=russbal*(1+float(filedata[bootsam[mn-1,bs]-1,4]))
            eafebal=eafebal*(1+float(filedata[bootsam[mn-1,bs]-1,5]))
            cashbal=cashbal*(1+float(filedata[bootsam[mn-1,bs]-1,6]))
            balance=bondbal + sp500bal + russbal + eafebal + cashbal
            if mn == 60:
                results[1,bs]=balance
            if mn == 120:
                results[2,bs]=balance
            if mn == 180:
                results[3,bs]=balance
            if mn == 240:
                results[4,bs]=balance
            if mn == 300:
                results[5,bs]=balance

Basic algebra: executing x = x * 1.23 360 times can easily be converted to a single execution of

x = x * (1.23 ** 360)

Refactor your code and you'll see that the loops are not really needed.
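A minimal sketch of that idea, not the asker's actual computation. The first half uses the constant factor 1.23 from the line above. The second half assumes a made-up array, monthly_returns, because in the question the factor changes every month; in that case the single power becomes a product of factors, which NumPy computes without a Python loop.

```python
import numpy as np

# Constant factor: 360 repeated multiplications equal one power.
x_loop = 200.0
for _ in range(360):
    x_loop = x_loop * 1.23
x_power = 200.0 * (1.23 ** 360)

# Varying factors (made-up data): the loop collapses into a product.
rng = np.random.default_rng(0)
monthly_returns = rng.normal(0.005, 0.02, size=360)
y_loop = 200.0
for r in monthly_returns:
    y_loop = y_loop * (1 + r)
y_prod = 200.0 * np.prod(1 + monthly_returns)

# np.cumprod keeps every intermediate balance, e.g. the one after month 60.
balances = 200.0 * np.cumprod(1 + monthly_returns)
balance_after_60 = balances[59]
```

The yearly rebalancing in the question is not covered here; it would need the product to be taken separately within each 12-month block.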

It is difficult to answer without seeing the real code. I can't get your sample to work because balance overflows to inf early in the run, as noted in the comments on the question. Still, one obvious optimization is to stop reading the bootsam[mn-1,bs] element five times per iteration to compute the xxbal variables. All of those variables use the same bootsam element, so you should read it once and reuse it:

for bs in range(nboot):
    for mn in range(1, horizon+1):
        # Look up the sampled row once per iteration instead of five times.
        row = bootsam[mn-1,bs]-1
        if (mn % 12) == 1:
            # First month of a year: re-split the balance across the five assets.
            bondbal = 24*balance
            sp500bal = 34*balance
            russbal = 44*balance
            eafebal = 55*balance
            cashbal = 66*balance

        # Apply this month's return to each asset (same step in both branches).
        bondbal = bondbal*(1+float(filedata[row,2]))
        sp500bal = sp500bal*(1+float(filedata[row,3]))
        russbal = russbal*(1+float(filedata[row,4]))
        eafebal = eafebal*(1+float(filedata[row,5]))
        cashbal = cashbal*(1+float(filedata[row,6]))
        balance = bondbal + sp500bal + russbal + eafebal + cashbal

The optimized code (which uses a fake value for balance) runs nearly twice as fast as the original on my old Acer Aspire.

Update

If you need further optimizations you can do at least two more things:

  • Do not add 1 and convert to float at every accessed element of filedata. Instead, add 1 to the array once at creation time and give it a float datatype.
  • Do not use arithmetic expressions that mix NumPy scalars and built-in numbers, because that arithmetic runs slower in a Python loop (you can read more on this problem in this SO thread).
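Both points can be sketched on a small stand-in for filedata. The array contents below are made up, since the real file is not shown, and slicing to the five return columns is an assumption of this sketch (it shifts the column indices to 0..4). The conversion and the + 1 happen once, and tolist() yields plain Python floats so the arithmetic in the loop stays in built-in types.

```python
import numpy as np

# Stand-in for filedata: a string array with returns in columns 2..6 (made-up values).
filedata = np.array([
    ["1970", "01", "0.01", "0.02", "-0.01", "0.00", "0.003"],
    ["1970", "02", "-0.02", "0.01", "0.03", "0.01", "0.003"],
])

# Convert the return columns once and fold in the "+ 1".
growth = 1.0 + filedata[:, 2:7].astype(float)

# tolist() turns NumPy scalars into built-in Python floats.
growth_rows = growth.tolist()

balance = 200.0
for row in growth_rows:
    balance = balance * row[0]  # plain float * float, no NumPy scalar involved
```

How much this saves depends on the machine and the NumPy version, so it is worth timing on the real data.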

The following code follows this advice:

import numpy as np
filedata=np.genfromtxt('monthlydata1970to2010.txt',dtype='str') # this will create a 980 * 7 matrix
# Add 1 and convert to float once; tolist() turns the values into built-in Python floats.
my_list = (1.0 + filedata.astype(float)).tolist()
nboot=5000
results=np.zeros((11,nboot))   # this will create an 11*5000 matrix
results[0,:]=600
horizon=360
balance=200
# Values are drawn from 0..4 here; the question uses randint(984, ...).
bootsam=np.random.randint(5, size=(984, nboot)) # this will create a 984*5000 matrix
for bs in range(nboot):
    for mn in range(1, horizon+1):
        # int() converts the NumPy integer to a built-in int for list indexing.
        row = int(bootsam[mn-1,bs]-1)
        if (mn % 12) == 1:
            # First month of a year: re-split the balance across the five assets.
            bondbal = 24*balance
            sp500bal = 34*balance
            russbal = 44*balance
            eafebal = 55*balance
            cashbal = 66*balance

        # The "+ 1" is already folded into my_list, so only a multiplication remains.
        bondbal = bondbal*my_list[row][2]
        sp500bal = sp500bal*my_list[row][3]
        russbal = russbal*my_list[row][4]
        eafebal = eafebal*my_list[row][5]
        cashbal = cashbal*my_list[row][6]
        balance = bondbal + sp500bal + russbal + eafebal + cashbal

With those changes the code runs nearly twice as fast as the previously optimized code.
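If the Python loops themselves remain the bottleneck, the whole simulation can in principle be expressed as array operations. The following is only a sketch under loud assumptions: the growth factors are random stand-ins for the file, the weights are a hypothetical allocation that sums to 1 (the 24/34/44/55/66 multipliers in the question make balance overflow), and nboot is reduced to keep memory modest. It relies on the fact that rebalancing at the start of each year makes a year's growth equal to the weighted sum of each asset's 12-month product.

```python
import numpy as np

rng = np.random.default_rng(0)
n_rows, n_assets = 984, 5
nboot, horizon = 1000, 360  # smaller nboot than the question, to limit memory

# Made-up monthly growth factors (1 + return), one column per asset.
growth = 1.0 + rng.normal(0.005, 0.03, size=(n_rows, n_assets))
# Hypothetical allocation summing to 1 (not the question's multipliers).
weights = np.array([0.2, 0.2, 0.2, 0.2, 0.2])

# One sampled row index per (month, bootstrap run).
bootsam = rng.integers(0, n_rows, size=(horizon, nboot))
sampled = growth[bootsam]  # shape (360, nboot, 5)

# Compound each asset within every 12-month block.
within_year = sampled.reshape(horizon // 12, 12, nboot, n_assets).prod(axis=1)
# Yearly rebalancing: portfolio growth per year is a weighted sum over assets.
year_growth = within_year @ weights  # shape (30, nboot)
balance_by_year = 200.0 * np.cumprod(year_growth, axis=0)

# Balances after months 60, 120, 180, 240 and 300 (years 5, 10, ..., 25).
results = balance_by_year[[4, 9, 14, 19, 24], :]
```

Here results has shape (5, nboot) rather than the 11-row matrix in the question; the rows would need to be copied into results[1:6] to match the original layout.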
