How can I refactor my Python code to decrease the time complexity?

This code takes 9 seconds, which is a very long time. I guess the problem is the two nested loops in my code:


for symptom in symptoms:
    # check if the symptom is mentioned in the user text
    norm_symptom = symptom.replace("_", " ")
    for combin in list_of_combinations:
        print(getSimilarity([combin, norm_symptom]))
        if getSimilarity([combin, norm_symptom]) > 0.25:
            if symptom not in extracted_symptoms:
                extracted_symptoms.append(symptom)

I tried to use zip like this:

for symptom, combin in zip(symptoms, list_of_combinations):
    norm_symptom = symptom.replace("_", " ")
    if getSimilarity([combin, norm_symptom]) > 0.25 and symptom not in extracted_symptoms:
        extracted_symptoms.append(symptom)

Use a dict to store the results of getSimilarity for each combination and symptom. This way, you avoid calling getSimilarity multiple times for the same combination and symptom, which makes the code more efficient and therefore faster.

import collections

# Cache the similarity score for each (symptom, combination) pair
similarity_results = collections.defaultdict(dict)

for symptom in symptoms:
    norm_symptom = symptom.replace("_", " ")
    for combin in list_of_combinations:
        # Check if the similarity has already been computed
        if combin in similarity_results[symptom]:
            similarity = similarity_results[symptom][combin]
        else:
            similarity = getSimilarity([combin, norm_symptom])
            similarity_results[symptom][combin] = similarity
        # The threshold check must stay inside the inner loop,
        # otherwise only the last combination is tested
        if similarity > 0.25 and symptom not in extracted_symptoms:
            extracted_symptoms.append(symptom)
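On top of the caching, two smaller changes also help: the original loop calls getSimilarity twice per pair (once for the print and once in the condition), and it keeps comparing combinations even after a symptom has already matched. Below is a minimal sketch of the same logic with a single call per pair and an early break, assuming symptoms, list_of_combinations, extracted_symptoms and getSimilarity are defined as in the question:

for symptom in symptoms:
    norm_symptom = symptom.replace("_", " ")
    for combin in list_of_combinations:
        # Call getSimilarity only once per (combination, symptom) pair
        if getSimilarity([combin, norm_symptom]) > 0.25:
            if symptom not in extracted_symptoms:
                extracted_symptoms.append(symptom)
            # Stop comparing this symptom as soon as one combination matches
            break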

Indeed, your algorithm is slow because of the two nested loops.

It runs in O(N*M) time (see more here: https://www.freecodecamp.org/news/big-o-notation-why-it-matters-and-why-it-doesnt-1674cfa8a23c/),

N being the length of symptoms and M being the length of list_of_combinations.

What can also take time is the computation done by getSimilarity; what is this operation?
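If you are not sure whether the nested loops or getSimilarity itself dominates the 9 seconds, a quick check is to time a single call and multiply by N*M. A minimal sketch, assuming getSimilarity and list_of_combinations are defined as in the question:

import time

start = time.perf_counter()
# Time one representative call; roughly N*M of these are made in total
getSimilarity([list_of_combinations[0], "sample symptom"])
elapsed = time.perf_counter() - start
print(f"one getSimilarity call took {elapsed:.4f} s")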
