How can I refactor my Python code to reduce its time complexity?

Question

This code takes 9 seconds to run, which is a long time. I suspect the problem is in the two loops in my code:


for symptom in symptoms:
  # check if the symptom is mentioned in the user text
  norm_symptom = symptom.replace("_"," ")
  for combin in list_of_combinations:
    print(getSimilarity([combin, norm_symptom]))
    if getSimilarity([combin, norm_symptom]) > 0.25:
      if symptom not in extracted_symptoms:
        extracted_symptoms.append(symptom)

I tried using zip like this:

for symptom, combin in zip(symptoms,list_of_combinations):
  norm_symptom = symptom.replace("_"," ")
  if (getSimilarity([combin, norm_symptom]) > 0.25 and symptom not in extracted_symptoms):
    extracted_symptoms.append(symptom)

Answer 1

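Use a dictionary to store the getSimilarity result for each combination and symptom. That way you avoid calling getSimilarity more than once for the same combination and symptom, which makes the code more efficient and therefore faster.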

import collections

# Cache of results: similarity_results[symptom][combin] -> similarity score
similarity_results = collections.defaultdict(dict)

for symptom in symptoms:
    norm_symptom = symptom.replace("_", " ")
    for combin in list_of_combinations:
        # Reuse the similarity if it has already been computed
        if combin in similarity_results[symptom]:
            similarity = similarity_results[symptom][combin]
        else:
            similarity = getSimilarity([combin, norm_symptom])
            similarity_results[symptom][combin] = similarity
        # Check inside the inner loop so every combination is tested
        if similarity > 0.25 and symptom not in extracted_symptoms:
            extracted_symptoms.append(symptom)
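
Since a symptom is added as soon as one combination clears the threshold, the inner loop can also stop at the first match instead of testing every remaining combination. The following is only a sketch under assumptions: the question does not show getSimilarity, so get_similarity_stub is a hypothetical word-overlap stand-in, extract_symptoms is an illustrative helper name, and the sample lists are made up.

```python
def get_similarity_stub(pair):
    """Hypothetical stand-in for getSimilarity: Jaccard overlap of words."""
    a, b = (set(text.lower().split()) for text in pair)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def extract_symptoms(symptoms, combinations,
                     similarity=get_similarity_stub, threshold=0.25):
    extracted = []
    seen = set()  # set membership is O(1); "in" on a list is O(len(list))
    for symptom in symptoms:
        if symptom in seen:
            continue  # duplicate symptom, already extracted
        norm_symptom = symptom.replace("_", " ")
        # any() stops consuming the generator at the first True value,
        # so no further similarity calls are made once a match is found.
        if any(similarity([combin, norm_symptom]) > threshold
               for combin in combinations):
            seen.add(symptom)
            extracted.append(symptom)
    return extracted


# Made-up sample data for illustration only
symptoms = ["skin_rash", "high_fever", "joint_pain"]
list_of_combinations = ["skin", "rash", "skin rash", "fever"]
extracted_symptoms = extract_symptoms(symptoms, list_of_combinations)
print(extracted_symptoms)
```

The worst case is still one call per (symptom, combination) pair when nothing matches, so this reduces the average work rather than the complexity class.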

Answer 2

Indeed, your algorithm is slow because of the two nested loops.

It runs in O(N*M) time (see more here: https://www.freecodecamp.org/news/big-o-notation-why-it-matters-and-why-it-doesnt-1674cfa8a23c/ ),

where N is the length of symptoms and M is the length of list_of_combinations.
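
To make the N*M count concrete, the short sketch below counts the pairs that the nested loops visit and compares them with the zip attempt from the question. The sample lists are made up for illustration; only the names symptoms and list_of_combinations come from the question.

```python
from itertools import product

# Made-up sample data: N = 3 symptoms, M = 4 combinations
symptoms = ["skin_rash", "high_fever", "joint_pain"]
list_of_combinations = ["skin", "rash", "skin rash", "fever"]

# Nested loops visit every (symptom, combination) pair: N * M of them
nested_pairs = list(product(symptoms, list_of_combinations))

# zip pairs items by position and stops at the shorter list: min(N, M) pairs
zipped_pairs = list(zip(symptoms, list_of_combinations))

print(len(nested_pairs))  # 12
print(len(zipped_pairs))  # 3
```

The zip version is faster only because it skips most comparisons: here ("skin_rash", "skin rash") is never checked, so it is not equivalent to the nested loops.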

Computing getSimilarity will also take time. What does that operation do?
