Why does my script progressively slow down?

Question

I am working on my thesis, for which I have to write a "unit converter". It reads a long output file, stores the data in arrays, and then converts from Cartesian coordinates to celestial orbital elements. The output file I currently need to convert is 1.2 million lines long, which means about 137,000 measurement points.

The main function looks like this:

def desc_to_orb(file,n):
    k = 1#4*pi**2
    params, res, h_in, h_out, desc, R_in, R_out, V_in, V_out, c_in, c_out, C_out, C_in = ([], [], [], [], [], [], [], [], [], [], [], [], [])
    a_in, a_out, e_in, e_out, inc_in, inc_out, Omega_in, Omega_out, omega_in, omega_out, Lambda_in, Lambda_out = ([], [], [], [], [], [], [], [], [], [], [], []) #orbital elements
    body_array = read(file,n)
    for i in range(len(body_array)):
        params.append(body_param(body_array[i]))

    for i in range(0,n): #nth body loop
        desc.append([])
        for j in tqdm(range(0,int((line(output)-1)/(5+n))), desc="Appending elements..."): #time loop
            desc[i].append(body(params,i,j)) #[n][t]

    for i in range(0,2):
        for j in tqdm(range(0,int((line(output)-1)/(5+n))), desc="Converting to orbital elements..."):
            R_in.append(np.linalg.norm(np.subtract(desc[1][j].r,desc[2][j].r)))
            R_out.append(np.linalg.norm(np.subtract(np.subtract(desc[1][j].r,desc[2][j].r),desc[0][j].r)))
            V_in.append(np.linalg.norm(np.subtract(desc[1][j].v,desc[2][j].v)))
            V_out.append(np.linalg.norm(np.subtract(np.subtract(desc[1][j].v,desc[2][j].v),desc[0][j].v)))
            c_in.append(np.cross(np.subtract(desc[1][j].r,desc[2][j].r), np.subtract(desc[1][j].v,desc[2][j].v))) #[n][t][x,y,z]
            c_out.append(np.cross(np.subtract(np.subtract(desc[1][j].r,desc[2][j].r),desc[0][j].r), np.subtract(np.subtract(desc[1][j].v,desc[2][j].v),desc[0][j].v))) #[n][t][x,y,z]
            C_in.append(np.linalg.norm(c_in)) #[n][t]
            C_out.append(np.linalg.norm(c_out)) #[n][t]

            inc_in.append(np.arctan((sqrt(c_in[j][0]**2+c_in[j][1]**2))/(c_in[j][2])))
            inc_out.append(np.arctan((sqrt(c_out[j][0]**2+c_out[j][1]**2))/(c_out[j][2])))
            Omega_in.append(np.arctan(-c_in[j][0]/c_in[j][1]))
            Omega_out.append(np.arctan(-c_out[j][0]/c_out[j][1]))
            h_in.append(0.5*V_in[j]**2-(k**2*((desc[1][j].m+desc[2][j].m)/(R_in[j]))))
            a_in.append(-k**2*((desc[1][j].m+desc[2][j].m)/(2*h_in[j])))
            h_out.append(0.5*V_out[j]**2-(k**2*((((desc[1][j].m*a_in[j] + desc[2][j].m*a_in[j]) / (desc[1][j].m + desc[2][j].m))+desc[0][j].m)/R_out[j])))
            a_out.append(-k**2*((((desc[1][j].m*a_in[j] + desc[2][j].m*a_in[j]) / (desc[1][j].m + desc[2][j].m))+desc[0][j].m)/(2*h_out[j])))
            Lambda_in.append(((-k**2*(desc[1][j].m+desc[2][j].m))/(R_in[j]))*(desc[1][j].r-desc[2][j].r)+np.cross(desc[1][j].v-desc[2][j].v,c_in[j]))
            Lambda_out.append(-((k**2*(((desc[1][j].m*a_in[j]+desc[2][j].m*a_in[j])/(desc[1][j].m+desc[2][j].m))+desc[0][j].m))/(R_out[j]))*(np.subtract(np.subtract(desc[1][j].r,desc[2][j].r),desc[0][j].r))+np.cross((np.subtract(np.subtract(desc[1][j].v,desc[2][j].v),desc[0][j].v)),c_out[j]))
            e_in.append((np.linalg.norm(Lambda_in[j]))/(k**2*(desc[1][j].m+desc[2][j].m)))
            e_out.append((np.linalg.norm(Lambda_out[j]))/(k**2*(((desc[1][j].m*a_in[j] + desc[2][j].m*a_in[j]) / (desc[1][j].m + desc[2][j].m))+desc[0][j].m)))

            if np.cos(inc_in[j]) != 0:
                omega_in.append(np.arctan( (C_in[j]*Lambda_in[j][2]) / (c_in[j][0]*Lambda_in[j][1]+c_in[j][1]*Lambda_in[j][0]) ))
            elif np.cos(inc_in[j]) == 0:
                omega_in.append(np.arctan( (Lambda_in[j][2]/Lambda_in[j][1])*np.sin(Omega_in[j]) ))
            if np.cos(inc_out[j]) != 0:
                omega_out.append(np.arctan( (C_out[j]*Lambda_out[j][2]) / (c_out[j][0]*Lambda_out[j][1]+c_out[j][1]*Lambda_out[j][0]) ))
            elif np.cos(inc_out[j]) == 0:
                omega_out.append(np.arctan( (Lambda_out[j][2]/Lambda_out[j][1])*np.sin(Omega_out[j]) ))

    return [a_in,e_in,inc_in,Omega_in,omega_in],[a_out,e_out,inc_out,Omega_out,omega_out]

It works as it is supposed to, but with a file this large it is really slow. I added a progress bar to see how it is going: reading the input file takes a few seconds, but the conversion (the second nested for loop) takes a lot of time.

Appending elements...: 100%|██████████| 137863/137863 [00:00<00:00, 193583.82it/s]
Appending elements...: 100%|██████████| 137863/137863 [00:00<00:00, 204194.98it/s]
Appending elements...: 100%|██████████| 137863/137863 [00:00<00:00, 188552.53it/s]
Converting to orbital elements...:  49%|████▉     | 67259/137863 [47:28<1:33:29, 12.59it/s]

It started at a few hundred iterations per second and has progressively slowed down; it is now down to ~12 it/s and still has a long way to go. What am I doing wrong? Why is it slowing down?

Thanks.

Answer 1

Numpy can cast an operation onto every element of an array (its documentation calls this vectorization and, when the shapes differ, broadcasting). This is done at the C code level and is very fast and efficient. I would read through a guide or two; here's one I found: https://numpy.org/doc/stable/user/quickstart.html . Take, for example, finding the modulus (Euclidean norm) of 1000 vectors, each 100 long.

import numpy as np

#EXPLICIT
#%%timeit - 89.7 ms ± 1.39 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
X = np.random.random((1000,100)) #1000, 100-long vectors
M = np.empty((1000,))            #Where we will save each modulus
for i,x in enumerate(X):
    m = 0
    for j in x:
        m += j**2
    M[i] = np.sqrt(m)

#CASTING
#%%timeit - 1.34 ms ± 8.16 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
X = np.random.random((1000,100))
M = np.sqrt((X**2).sum(1))

About 67 times faster with the timings above (so instead of 67 computers, you need just one). When we squared the array X, numpy assumed we wanted every entry squared, so it cast the operation to all the entries in X in a highly optimized way. Likewise, we wanted to add up all the entries along one axis, so we called .sum() and specified the axis to sum along; again numpy does this in a highly optimized way. Finally, we want the square root of every entry, so we wrap the whole array in np.sqrt, and again numpy casts that operation to every entry in the array.
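Since the question's code already uses np.linalg.norm, it may help to see that the same row-wise modulus is available through its axis argument. This is a minimal sketch; the seeded generator and the names M_manual and M_norm are only illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)            # seeded only so the sketch is repeatable
X = rng.random((1000, 100))               # 1000 vectors, each 100 long

M_manual = np.sqrt((X**2).sum(axis=1))    # the expression used above
M_norm = np.linalg.norm(X, axis=1)        # built-in norm, taken row by row
```

Both give one value per row, so either form can replace a Python loop over the vectors.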

This can get considerably more complicated. In particular, numpy adjusts how it casts arrays and operations together based on the shapes involved. For example:

Here we add a scalar to an array; again numpy assumes we wanted the scalar added to every entry.

X = np.array([[1,2],[3,4]])
print(X+1)

Here we add two 2x2 arrays; because the shapes match, numpy adds them entry by entry.

X = np.array([[1,2],[3,4]])
Y = np.array([[1,2],[3,4]])
print(X+Y)

Here we add a 3x1 column and a 3-long vector. Because the shapes do not match, numpy assumes you wanted each entry of the column to be added to the whole 3-long vector. So it adds [1] to the 3-long vector, then [2], then [3]. Each of those single entries does not match the 3-long vector either, so numpy adds it to every entry of that vector. This results in a 3x3 answer.

X = np.array([[1],[2],[3]])
Y = np.array([4,5,6])
print(X+Y)
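To make the shapes in this last case explicit, here is a small sketch that prints them alongside the result; the name Z is only illustrative.

```python
import numpy as np

X = np.array([[1], [2], [3]])   # shape (3, 1): a column
Y = np.array([4, 5, 6])         # shape (3,): treated like a row of shape (1, 3)

Z = X + Y                       # both operands are stretched to shape (3, 3)
print(X.shape, Y.shape, Z.shape)
print(Z)
# [[5 6 7]
#  [6 7 8]
#  [7 8 9]]
```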

Suppose I want the sum of the absolute values of the diagonal of a large matrix:

X = np.random.random((1000,1000)) - 0.5
print(np.abs(X*np.identity(1000)).sum())
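Multiplying by the identity works, but it builds a full 1000x1000 temporary just to zero out the off-diagonal entries. As a hedged alternative, np.diag extracts the diagonal directly; the variable names below are only illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((1000, 1000)) - 0.5

via_identity = np.abs(X * np.identity(1000)).sum()  # masks with a full identity matrix
via_diag = np.abs(np.diag(X)).sum()                 # reads only the 1000 diagonal entries
```

The two results agree up to floating-point rounding; the second simply does far less work.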

We could keep going all day. In general, if you want to use numpy you should avoid "for i in X" statements as much as possible. Aim for no for loops (use one only if it is absolutely required) and instead let numpy do the work of applying the operations efficiently. It may take a while to work out exactly how to write the code, but in general numpy code should look clean. Here is an example from a Markov model I wrote, performing a gamma pass; each line took a while to figure out, but in the end there are no for loops:

pi = g[0]
A  = di_g.sum(0)/g[:-1,:].sum(0)[:,None]
B  = (g[:,:,None]*O_m).sum(0)
B  = B/B.sum(1)[:,None]
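Applied to the question's loop, one line stands out: C_in.append(np.linalg.norm(c_in)) passes the whole list c_in collected so far rather than c_in[j] (and likewise for c_out). Every iteration therefore converts and reduces a list that keeps growing, so the work per iteration grows with j, which is consistent with the slowdown in the progress bar. It also stores the norm of the whole history instead of the norm of the current vector. The sketch below is a minimal vectorized version of a few of those quantities, assuming the positions and velocities are already stacked into arrays of shape (T, 3). The names r1, r2, v1, v2 and the random data are hypothetical stand-ins for desc[1][j].r and so on, not the asker's actual data.

```python
import numpy as np

T = 1000                                   # number of time steps (placeholder value)
rng = np.random.default_rng(0)
# Hypothetical stand-ins for desc[1][j].r, desc[2][j].r, desc[1][j].v, desc[2][j].v
r1, r2 = rng.random((T, 3)), rng.random((T, 3))
v1, v2 = rng.random((T, 3)), rng.random((T, 3))

r_rel = r1 - r2                            # relative positions, shape (T, 3)
v_rel = v1 - v2                            # relative velocities, shape (T, 3)

R_in = np.linalg.norm(r_rel, axis=1)       # one distance per time step
V_in = np.linalg.norm(v_rel, axis=1)       # one speed per time step
c_in = np.cross(r_rel, v_rel)              # row-wise cross product, shape (T, 3)
C_in = np.linalg.norm(c_in, axis=1)        # per-step norm, not the norm of the whole history

# Same formulas as in the question's loop, applied to whole columns at once
inc_in = np.arctan(np.sqrt(c_in[:, 0]**2 + c_in[:, 1]**2) / c_in[:, 2])
Omega_in = np.arctan(-c_in[:, 0] / c_in[:, 1])
```

If keeping the loop is preferred, changing those two calls to np.linalg.norm(c_in[j]) and np.linalg.norm(c_out[j]) should already remove the growing per-iteration cost.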

solution1
1 2021-02-13 20:18:59
