
How can I refactor my Python code to decrease the time complexity?

This code takes 9 seconds, which is far too long. I suspect the problem is the two nested loops in my code:


for symptom in symptoms:
    # check if the symptom is mentioned in the user text
    norm_symptom = symptom.replace("_", " ")
    for combin in list_of_combinations:
        print(getSimilarity([combin, norm_symptom]))
        if getSimilarity([combin, norm_symptom]) > 0.25:
            if symptom not in extracted_symptoms:
                extracted_symptoms.append(symptom)

I tried to use zip like this:

for symptom, combin in zip(symptoms,list_of_combinations):
  norm_symptom = symptom.replace("_"," ")
  if (getSimilarity([combin, norm_symptom]) > 0.25 and symptom not in extracted_symptoms):
    extracted_symptoms.append(symptom)
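
One thing worth noting about the zip attempt: zip pairs the two lists element by element and stops at the shorter one, so each symptom is compared with only one combination instead of all of them. It is faster because it does less work, not because it does the same work more efficiently. A small sketch with made-up lists (the sample values are placeholders, not the real data):

```python
from itertools import product

# Invented sample data, only to show how the pairings differ.
symptoms = ["head_ache", "sore_throat", "fever"]
list_of_combinations = ["my head", "head hurts"]

# zip: one pair per position, truncated to the shorter list.
zipped_pairs = list(zip(symptoms, list_of_combinations))

# product: every symptom with every combination, like the nested loops.
all_pairs = list(product(symptoms, list_of_combinations))

print(len(zipped_pairs))  # 2
print(len(all_pairs))     # 6
```

So the zip version can miss symptoms that the nested loops would have found.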

Use a dict to store the results of getSimilarity for each combination and symptom. That way you avoid calling getSimilarity more than once for the same pair, which saves time whenever the same pair comes up again (for example, if the lists contain duplicates or the loop is run repeatedly).

import collections

# similarity_results[symptom][combin] caches the score for each pair
similarity_results = collections.defaultdict(dict)

for symptom in symptoms:
    norm_symptom = symptom.replace("_", " ")
    for combin in list_of_combinations:
        # Check if the similarity has already been computed
        if combin in similarity_results[symptom]:
            similarity = similarity_results[symptom][combin]
        else:
            similarity = getSimilarity([combin, norm_symptom])
            similarity_results[symptom][combin] = similarity
        # Test every combination, not only the last one
        if similarity > 0.25:
            if symptom not in extracted_symptoms:
                extracted_symptoms.append(symptom)
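
If repeated pairs are rare, the cache will not save much. Another option, sketched here, is to stop checking combinations as soon as one of them matches, since a symptom only needs to be added once. The getSimilarity below is a crude word-overlap stand-in so the snippet runs on its own; it is not the real function, and the sample lists are invented.

```python
def getSimilarity(pair):
    # Hypothetical stand-in: Jaccard overlap of the words in the two strings.
    a, b = (set(text.split()) for text in pair)
    return len(a & b) / len(a | b) if a | b else 0.0

# Invented sample data.
symptoms = ["head_ache", "sore_throat", "high_fever"]
list_of_combinations = ["my head", "head ache", "sore", "throat hurts"]

extracted_symptoms = []
for symptom in symptoms:
    norm_symptom = symptom.replace("_", " ")
    # any() stops at the first combination above the threshold,
    # so the remaining combinations are never scored for this symptom.
    if any(getSimilarity([combin, norm_symptom]) > 0.25
           for combin in list_of_combinations):
        extracted_symptoms.append(symptom)

print(extracted_symptoms)  # ['head_ache', 'sore_throat']
```

This assumes symptoms has no duplicate entries. The worst case is still N*M calls (when nothing matches), but every symptom that does match skips the rest of its inner loop.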

Indeed, your algorithm is slow because of the two nested loops.

It runs in O(N*M) time (see more here https://www.freecodecamp.org/news/big-o-notation-why-it-matters-and-why-it-doesnt-1674cfa8a23c/ ), where N is the length of symptoms and M is the length of list_of_combinations.

The getSimilarity computation itself may also be taking a lot of the time. What does this operation do?
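
To see where the 9 seconds go, it can help to count the calls and time them. A rough sketch, with a placeholder getSimilarity and invented list sizes standing in for the real ones:

```python
import time

stats = {"calls": 0}

def getSimilarity(pair):
    # Hypothetical placeholder; swap in the real function to measure it.
    stats["calls"] += 1
    a, b = (set(text.split()) for text in pair)
    return len(a & b) / len(a | b) if a | b else 0.0

# Invented sizes: N = 50 symptoms, M = 40 combinations.
symptoms = ["symptom_%d" % i for i in range(50)]
list_of_combinations = ["text %d" % i for i in range(40)]

start = time.perf_counter()
for symptom in symptoms:
    norm_symptom = symptom.replace("_", " ")
    for combin in list_of_combinations:
        getSimilarity([combin, norm_symptom])
elapsed = time.perf_counter() - start

print(stats["calls"])            # N * M = 2000 calls
print(elapsed / stats["calls"])  # average seconds per call
```

If the average time per call multiplied by N*M is close to the total runtime, then getSimilarity is the bottleneck and the loops are only multiplying its cost.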
