
Vectorize or optimize a loop where each iteration depends on the state of the previous iteration

I have an algorithm which I am implementing in Python. The algorithm might be executed 1,000,000 times, so I want to optimize it as much as possible. The algorithm is based on three lists ( energy , point and valList ) and two counters p and e .

The two lists energy and point contain numbers between 0 and 1 on which I base decisions. p is a point counter and e is an energy counter. I can trade points for energy, and the cost of each unit of energy is defined in valList (it is time dependent). I can also trade the other way. But I have to trade everything at once.

The outline of the algorithm:

  1. Get a boolean list that is true where the element in energy is above a threshold and the element in point is below another threshold. This is a decision to trade energy for points. Get a corresponding list for point, which gives the decision to trade points for energy.
  2. In each of the boolean lists, remove every true value that comes right after another true value (if I have traded all my points for energy, I am not allowed to do that again at the next step).
  3. For each item pair ( pB , point bool and eB , energy bool) from the two boolean lists: if pB is true and I have points, I want to trade all my points for energy. If eB is true and I have energy, I want to trade all my energy for points.

This is the implementation I have come up with:

import time
import numpy as np

start = time.time()

np.random.seed(2) #Seed for deterministic result, just for debugging

topLimit = 0.55
bottomLimit = 0.45

#Generate three random arrays, will not be random in the real world
res = np.random.rand(500,3) #Will probably not be much longer than 500
energy = res[:,0]        
point = res[:,1]
valList = res[:,2]

#Step 1:
#Generate two boolean arrays: energyListBool is True where energy is above the top
#threshold and point is below the bottom threshold. The opposite applies to pointListBool
energyListBool = ((energy > topLimit) & (point < bottomLimit))
pointListBool = ((point > topLimit) & (energy < bottomLimit))

#Step 2:
#Remove every True that directly follows another True, since this is not valid
energyListBool[1:] &= energyListBool[1:] ^ energyListBool[:-1]
pointListBool[1:] &= pointListBool[1:] ^ pointListBool[:-1]

p = 100
e = 0

#Step 3:
#Loop through the lists: if point is true, I lose all p but gain p/valList[i] for e
#If energy is true, I lose all e but gain valList[i]*e for p
for i in range(len(energyListBool)):
    if pointListBool[i] and e == 0:
        e = p/valList[i] #Trade all points to energy
        p = 0
    elif energyListBool[i] and p == 0:
        p = valList[i]*e #Trade all energy to points
        e = 0

print('p = {0} (correct for seed 2: 3.1108006690739174)'.format(p))
print('e = {0} (correct for seed 2: 0)'.format(e))

end = time.time()
print(end - start)

What I am struggling with is how (if it can be done at all) to vectorize the for-loop, since I assume a vectorized version would be faster than the loop.

Within the current problem setting that's not possible, since vectorization essentially requires that the n-th computation step does not depend on the previous n-1 steps. Sometimes, however, it's possible to find a so-called "closed form" of a recurrence f(n) = F(f(n-1), f(n-2), ..., f(n-k)) , i.e. an explicit expression for f(n) that doesn't depend on the previous values f(n-1), ..., f(n-k) , but that is a separate research problem.
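To illustrate what a "closed form" buys you, here is a minimal sketch on a much simpler recurrence than the one in the question: f(n) = a_n * f(n-1). Its closed form is f(n) = f(0) * a_1 * ... * a_n, which NumPy can evaluate for all n at once with np.cumprod. The function names and the sample factors are made up for this illustration; this is not a vectorization of the trading loop itself.

```python
import numpy as np

def recurrence_loop(f0, factors):
    """Evaluate f(n) = factors[n] * f(n-1) step by step."""
    out = np.empty(len(factors))
    value = f0
    for i, a in enumerate(factors):
        value = a * value  # each step needs the previous value
        out[i] = value
    return out

def recurrence_closed_form(f0, factors):
    """Same sequence via the closed form f(n) = f0 * prod(factors[:n+1])."""
    return f0 * np.cumprod(factors)

# Hypothetical sample data, chosen only for the illustration
factors = np.array([2.0, 0.5, 3.0, 4.0])
loop_result = recurrence_loop(1.0, factors)
vectorized_result = recurrence_closed_form(1.0, factors)
print(np.allclose(loop_result, vectorized_result))
```

The trading loop in the question also branches on the current state ( e == 0 or p == 0 ), which is what makes finding such an expression harder there.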

Moreover, from an algorithmic point of view such a vectorization wouldn't gain much, since the complexity of your algorithm would still be C*n = O(n) . However, since the "complexity constant" C does matter in practice, there are different ways to reduce it. For example, it shouldn't be a big problem to rewrite your critical loop in C/C++.
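Short of moving to C/C++, the constant can often be reduced while staying in Python. The sketch below is one such option and not the C/C++ rewrite suggested above: it visits only the indices where at least one flag is set, and converts the arrays to plain Python lists once, because indexing a NumPy array element by element inside a Python loop is comparatively slow. The function name trade_loop and its parameters are hypothetical, and it assumes every entry of values is strictly positive (a plain Python float division by zero raises an exception instead of returning inf).

```python
import numpy as np

def trade_loop(point_flags, energy_flags, values, p=100.0, e=0.0):
    """Run the trading loop, skipping steps where no trade can happen.

    Assumes `values` contains only strictly positive numbers.
    """
    # Only indices where at least one flag is True can change p or e
    candidates = np.flatnonzero(point_flags | energy_flags)
    # Convert once to plain Python lists for cheap iteration
    pt = point_flags[candidates].tolist()
    en = energy_flags[candidates].tolist()
    val = values[candidates].tolist()
    for is_point, is_energy, v in zip(pt, en, val):
        if is_point and e == 0:
            e = p / v  # trade all points for energy
            p = 0.0
        elif is_energy and p == 0:
            p = v * e  # trade all energy for points
            e = 0.0
    return p, e

# Tiny made-up example, only to show how the function is called
demo_point = np.array([True, False, False, True])
demo_energy = np.array([False, False, True, False])
demo_values = np.array([2.0, 0.5, 4.0, 5.0])
demo_p, demo_e = trade_loop(demo_point, demo_energy, demo_values)
print(demo_p, demo_e)
```

With the arrays from the question, the call would be trade_loop(pointListBool, energyListBool, valList) . Whether this is actually faster for 500 elements is worth measuring, for example with timeit , before adopting it.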
