Python: Improve the speed of Euclidean distance calculation in a class

Question

I have a class that calculates the Euclidean distance between the last elements of the arrays stored in two dictionaries. One dictionary contains the tracked trajectories of blobs (r), and the other has the updated values of the blobs (b). The methods of the class find the trajectories that appeared or disappeared based on Euclidean distance. Finally, they reorder the r dictionary according to its best match with the b dictionary.

I tested the functionality in this Colab notebook and it works as expected, but when I use it in my code the program gets slow.

Is there a way I can improve the speed of this class?
Is there a better approach to solve this problem? What is it?

Thank you.

from scipy.spatial import distance as dist

class finder:

    def disappeared(self,r,b):
        values = {}
        indexes = {}
        diss = {}
        new_results = {}
        new_positions = {}

        le = len(r) - len(b)       

        for i in r:
            xr = r[i]["x"][-1]
            yr = r[i]["y"][-1]
            
            for k in b:
                xb = b[k]["x"][-1]
                yb = b[k]["y"][-1]
              
                D = dist.cdist([(xb,yb)],[(xr,yr)])
               
                values[str(i) +"/" + str(k)] = D
                indexes[str(i) +"/" + str(k)] = (i,k)

            if le > 0:
                le -= 1
                  
                maxval = max(values,key=values.get)
        
                r_ind = indexes[maxval][0]
                b_ind = indexes[maxval][1]

                print("Found Disappeared", maxval) 
  
                diss[r_ind] = r[r_ind]
            
            else:
                minval = min(values,key=values.get)
                r_ind = indexes[minval][0]
                b_ind = indexes[minval][1]
                new_positions[b_ind] = r[r_ind]
                
                del values[minval]
         
        for m,n in enumerate(new_positions):
            new_results[m] = new_positions[n]

        return(new_results,diss)

    def appeared(self,r,b):
        values = {}
        indexes = {}
        appr = {}
        new_results = {}
        new_positions = {}

        le = len(b) - len(r)       

        for i in b:

            xb = b[i]["x"][-1]
            yb = b[i]["y"][-1]

            for k in r:

                xr = r[k]["x"][-1]
                yr = r[k]["y"][-1]
              
                D = dist.cdist([(xr,yr)],[(xb,yb)])
               
                values[str(k) +"/" + str(i)] = D
                indexes[str(k) +"/" + str(i)] = (k,i)

            if le > 0:
                le -= 1
                  
                maxval = max(values,key=values.get)
        
                r_ind = indexes[maxval][0]
                b_ind = indexes[maxval][1]

                print("Found Appeared", maxval) 
  
                appr[b_ind] = b[b_ind]
                new_positions[r_ind] = b[b_ind]
            
            else:
                minval = min(values,key=values.get)
                r_ind = indexes[minval][0]
                b_ind = indexes[minval][1]
                new_positions[b_ind] = r[r_ind]
                
                del values[minval]
         
        for m,n in enumerate(new_positions):
            new_results[m] = new_positions[n]

        return(new_results)

Answer 1

Most of the time is probably spent accessing dictionaries and formatting strings.

Here are a few things you could do to optimize disappeared():

Access values of b only once:

 # at start of function ...
 lastB = [ (k,v["x"][-1],v["y"][-1]) for k,v in b.items() ]

 ...

 for k,xb,yb in lastB:  # replaces for k in b: and the assignments of xb,yb
     
     ...

Obtain values along with keys when accessing r:

 for i,v in r.items():
     xr = v["x"][-1]
     yr = v["y"][-1]

Use tuples instead of strings as the keys of values and you won't need indexes at all:

 # index it with a tuple (r key first, b key second)
 values[(i,k)] = D

 ...

 # replace the whole maxval logic.
 # items() yields ((r_key, b_key), distance), so unpack the key pair first
 (r_ind,b_ind),_ = max(values.items(),key=lambda kv:kv[1])

 ...

 # replace the whole minval logic.
 (r_ind,b_ind),_ = min(values.items(),key=lambda kv:kv[1])
 ...
 del values[r_ind,b_ind]
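To make the shape of the tuple-keyed dictionary concrete, here is a small standalone check. The distances are made-up floats, not output from the question's code, and the variable names r_far, b_far, r_near and b_near are only for illustration.

```python
# Made-up distances keyed by (r_key, b_key); no separate indexes dict is needed.
values = {(0, 0): 0.5, (0, 1): 13.4, (1, 0): 12.1, (1, 1): 1.0}

# values.items() yields ((r_key, b_key), distance) pairs,
# so the key tuple is unpacked first and the distance second.
(r_far, b_far), d_max = max(values.items(), key=lambda kv: kv[1])
(r_near, b_near), d_min = min(values.items(), key=lambda kv: kv[1])

# Remove the matched pair using the same tuple as the key.
del values[(r_near, b_near)]
```

The key function compares only the distance (kv[1]), so the tuples are never compared with each other.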

Generate new result without re-accessing each key:

 new_results = dict(enumerate(new_positions.values()))

The same improvements can be made to appeared(), as it is almost identical.
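Putting these changes together, here is a minimal sketch of disappeared() written as a standalone function. It keeps the question's control flow as it is and only applies the edits listed in this answer. Two things are assumptions and not part of the original code: math.hypot stands in for dist.cdist on a single pair of points (so the distances are plain floats), and the print call is dropped. The sample dictionaries at the end are made up.

```python
import math

def disappeared(r, b):
    """Sketch of the question's disappeared() with the changes above applied."""
    # Read the last point of every entry in b once, not once per entry in r.
    last_b = [(k, v["x"][-1], v["y"][-1]) for k, v in b.items()]

    values = {}         # (r_key, b_key) -> distance
    diss = {}           # trajectories reported as disappeared
    new_positions = {}  # b_key -> matched trajectory from r
    le = len(r) - len(b)

    for i, v in r.items():
        xr, yr = v["x"][-1], v["y"][-1]
        for k, xb, yb in last_b:
            # Assumption: math.hypot replaces dist.cdist for one pair of points.
            values[(i, k)] = math.hypot(xb - xr, yb - yr)

        if le > 0:
            le -= 1
            # Largest distance seen so far marks a disappeared trajectory.
            (r_ind, b_ind), _ = max(values.items(), key=lambda kv: kv[1])
            diss[r_ind] = r[r_ind]
        else:
            # Smallest remaining distance is taken as the best match.
            (r_ind, b_ind), _ = min(values.items(), key=lambda kv: kv[1])
            new_positions[b_ind] = r[r_ind]
            del values[(r_ind, b_ind)]

    # Renumber the matched trajectories 0..n-1 in insertion order.
    new_results = dict(enumerate(new_positions.values()))
    return new_results, diss

# Made-up sample: three tracked trajectories, two updated blobs.
r = {0: {"x": [50], "y": [50]}, 1: {"x": [0, 1], "y": [0, 1]}, 2: {"x": [10], "y": [10]}}
b = {0: {"x": [1.5], "y": [1.0]}, 1: {"x": [10], "y": [11]}}
matched, gone = disappeared(r, b)
```

With this sample, gone contains trajectory 0 (the one far from both blobs) and matched holds the other two trajectories under keys 0 and 1.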

Answer 2

Does this code actually work? These lines look totally wrong:

    for i in r:
        xr = r[i]["x"][-1]
        yr = r[i]["y"][-1]

i is an element of r here. You wouldn't use that as an index into r. Surely it's supposed to be:

    for i in r:
        xr = i["x"][-1]
        yr = i["y"][-1]

and the same for the for k in b loop.
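Which of the two forms is right depends on the type of r. A quick check, assuming r is a dict keyed by trajectory id as the question describes (the sample data and the names keys_seen, last_x and last_x_values are made up): iterating over a dict yields its keys, so r[i] is a valid lookup in that case, while the i["x"] form fits when r is a list of dicts.

```python
# Hypothetical sample data in the shape the question describes.
r = {7: {"x": [1, 2], "y": [3, 4]}, 9: {"x": [5, 6], "y": [7, 8]}}

# Iterating over a dict yields its keys, not its values.
keys_seen = [i for i in r]

# With keys in hand, r[i] is the lookup that reaches the trajectory.
last_x = [r[i]["x"][-1] for i in r]

# Iterating over the values gives the same result without the lookup.
last_x_values = [v["x"][-1] for v in r.values()]
```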

solution1
1 ACCEPTED 2021-03-01 20:47:44

solution2
-1 2021-02-26 07:02:07