How can I refactor my Python code to reduce its time complexity?

Question

This code takes 9 seconds to run, which is a long time. I suspect the problem is in the two loops:


for symptom in symptoms:
    # check if the symptom is mentioned in the user text
    norm_symptom = symptom.replace("_", " ")
    for combin in list_of_combinations:
        print(getSimilarity([combin, norm_symptom]))
        if getSimilarity([combin, norm_symptom]) > 0.25:
            if symptom not in extracted_symptoms:
                extracted_symptoms.append(symptom)

I tried using zip like this:

for symptom, combin in zip(symptoms,list_of_combinations):
  norm_symptom = symptom.replace("_"," ")
  if (getSimilarity([combin, norm_symptom]) > 0.25 and symptom not in extracted_symptoms):
    extracted_symptoms.append(symptom)

Answer 1

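Use a dictionary to store the getSimilarity result for each combination and symptom. That way you avoid calling getSimilarity more than once for the same combination and symptom, which makes the code more efficient and therefore faster.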

import collections

# Cache of scores: similarity_results[symptom][combin] -> similarity
similarity_results = collections.defaultdict(dict)

for symptom in symptoms:
    norm_symptom = symptom.replace("_", " ")
    for combin in list_of_combinations:
        # Check if the similarity has already been computed
        if combin in similarity_results[symptom]:
            similarity = similarity_results[symptom][combin]
        else:
            similarity = getSimilarity([combin, norm_symptom])
            similarity_results[symptom][combin] = similarity
        # Test the threshold for every combination, not only the last one
        if similarity > 0.25:
            if symptom not in extracted_symptoms:
                extracted_symptoms.append(symptom)
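
The same caching idea can also be sketched with functools.lru_cache, combined with any() so the inner loop stops at the first combination above the threshold. This is only a sketch under assumptions: getSimilarity below is a stand-in built on difflib, since the real function isn't shown in the question, and cached_similarity and extract_symptoms are names made up for illustration.

```python
from difflib import SequenceMatcher
from functools import lru_cache


def getSimilarity(pair):
    # Hypothetical stand-in for the asker's function, which is not shown.
    a, b = pair
    return SequenceMatcher(None, a, b).ratio()


@lru_cache(maxsize=None)
def cached_similarity(combin, norm_symptom):
    # lru_cache needs hashable arguments, so the pair is passed as two strings.
    return getSimilarity([combin, norm_symptom])


def extract_symptoms(symptoms, list_of_combinations, threshold=0.25):
    extracted = []
    seen = set()  # set membership is O(1), unlike "in" on a list
    for symptom in symptoms:
        if symptom in seen:
            continue
        norm_symptom = symptom.replace("_", " ")
        # any() stops at the first combination above the threshold.
        if any(cached_similarity(combin, norm_symptom) > threshold
               for combin in list_of_combinations):
            extracted.append(symptom)
            seen.add(symptom)
    return extracted
```

Note that caching only pays off when the same (combination, symptom) pair comes up more than once, for example across repeated runs. Stopping at the first match saves calls even when no pair repeats.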

Answer 2

Indeed, your algorithm is slow because of the two nested loops.

It runs in O(N*M) time (see more here: https://www.freecodecamp.org/news/big-o-notation-why-it-matters-and-why-it-doesnt-1674cfa8a23c/ )

N is the length of symptoms and M is the length of list_of_combinations.
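
To make the N*M count concrete, here is a small sketch with made-up sample data (the lists below are not from the question). It also shows why the zip attempt is not an equivalent shortcut: zip only pairs items that share an index, so it checks min(N, M) pairs and skips most combinations.

```python
from itertools import product

# Made-up sample data, for illustration only.
symptoms = ["sore_throat", "high_fever", "skin_rash"]
list_of_combinations = ["sore", "throat", "sore throat", "fever"]

# The nested loops visit every (symptom, combination) pair.
nested_pairs = list(product(symptoms, list_of_combinations))

# zip visits only the pairs that share an index.
zipped_pairs = list(zip(symptoms, list_of_combinations))

print(len(nested_pairs))  # 12, i.e. N * M
print(len(zipped_pairs))  # 3, i.e. min(N, M)

# The pair that actually matters here is never compared by zip.
print(("sore_throat", "sore throat") in zipped_pairs)  # False
```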

The getSimilarity computation itself may also be taking up time. What does this operation do?
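
One way to check is to time a single call and multiply by N*M. The sketch below assumes a stand-in getSimilarity built on difflib (swap in the real function when measuring), and the list sizes are invented numbers, not figures from the question. average_call_seconds is a helper name made up for this example.

```python
import time
from difflib import SequenceMatcher


def getSimilarity(pair):
    # Hypothetical stand-in; replace with the real function when measuring.
    a, b = pair
    return SequenceMatcher(None, a, b).ratio()


def average_call_seconds(func, arg, repeats=100):
    """Return the mean wall-clock time of func(arg) over `repeats` calls."""
    start = time.perf_counter()
    for _ in range(repeats):
        func(arg)
    return (time.perf_counter() - start) / repeats


per_call = average_call_seconds(getSimilarity, ["sore throat", "sore throat"])

# Invented sizes for N and M, for illustration only.
n_symptoms, n_combinations = 130, 50
estimated_total = per_call * n_symptoms * n_combinations
print(f"{per_call:.6f} s per call, about {estimated_total:.2f} s for all pairs")
```

If the estimate is close to the observed 9 seconds, the cost is dominated by getSimilarity itself, and reducing the number of calls or speeding up that function matters more than restructuring the loops.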

2 answers

Answer 1
0 2022-12-21 11:26:26

Answer 2
0 2022-12-21 11:28:33
