
How can I refactor my Python code to decrease the time complexity?

This code takes 9 seconds, which is far too long. I guess the problem is in the 2 nested loops in my code:


for symptom in symptoms:
    # check if the symptom is mentioned in the user text
    norm_symptom = symptom.replace("_", " ")
    for combin in list_of_combinations:
        print(getSimilarity([combin, norm_symptom]))
        if getSimilarity([combin, norm_symptom]) > 0.25:
            if symptom not in extracted_symptoms:
                extracted_symptoms.append(symptom)

I tried to use zip like this:

for symptom, combin in zip(symptoms, list_of_combinations):
    norm_symptom = symptom.replace("_", " ")
    if getSimilarity([combin, norm_symptom]) > 0.25 and symptom not in extracted_symptoms:
        extracted_symptoms.append(symptom)
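
Note that zip pairs the two lists positionally, so this version only compares the i-th symptom with the i-th combination instead of every symptom with every combination; a small illustration with made-up lists (not from the question) shows the difference:

symptoms_demo = ["chest_pain", "headache"]
combinations_demo = ["pain in chest", "strong headache"]

# zip produces only the aligned pairs...
print(list(zip(symptoms_demo, combinations_demo)))
# [('chest_pain', 'pain in chest'), ('headache', 'strong headache')]

# ...while the nested loops check all len(symptoms) * len(combinations) pairs.
print([(s, c) for s in symptoms_demo for c in combinations_demo])
# 4 pairs: ('chest_pain', 'pain in chest'), ('chest_pain', 'strong headache'), ...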

Use a dictionary to store the getSimilarity result for each combination and symptom. That way you avoid calling getSimilarity more than once for the same combination and symptom, which makes the code more efficient and therefore faster.

import collections

similarity_results = collections.defaultdict(dict)

for symptom in symptoms:
    norm_symptom = symptom.replace("_", " ")
    for combin in list_of_combinations:
        # Check if the similarity has already been computed
        if combin in similarity_results[symptom]:
            similarity = similarity_results[symptom][combin]
        else:
            similarity = getSimilarity([combin, norm_symptom])
            similarity_results[symptom][combin] = similarity
        # Record the symptom as soon as any combination passes the threshold
        if similarity > 0.25:
            if symptom not in extracted_symptoms:
                extracted_symptoms.append(symptom)
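
If getSimilarity is a pure function of its two strings, the same memoization idea can be written more compactly with functools.lru_cache, a set for the extracted symptoms, and an early break once a symptom matches. This is a minimal sketch under that assumption, not the answer's original code (and it loses the list's insertion order by using a set):

from functools import lru_cache

@lru_cache(maxsize=None)
def cached_similarity(combin, norm_symptom):
    # Wrap getSimilarity so each (combin, norm_symptom) pair is computed only once.
    return getSimilarity([combin, norm_symptom])

extracted_symptoms = set()
for symptom in symptoms:
    norm_symptom = symptom.replace("_", " ")
    for combin in list_of_combinations:
        if cached_similarity(combin, norm_symptom) > 0.25:
            extracted_symptoms.add(symptom)
            break  # one match is enough; skip the remaining combinations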

Indeed, your algorithm is slow because of the 2 nested loops.

It runs in O(N*M), where N is the length of symptoms and M is the length of list_of_combinations (see https://www.freecodecamp.org/news/big-o-notation-why-it-matters-and-why-it-doesnt-1674cfa8a23c/ for more on big-O notation); for example, 300 symptoms and 300 combinations would already mean 90,000 calls to getSimilarity.

What also takes time is computing getSimilarity itself; what does that operation do?
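
One way to check whether getSimilarity dominates the 9-second runtime is to time a single call and extrapolate. A minimal sketch, assuming getSimilarity takes a list of two strings and returns a float as in the question (the sample strings here are hypothetical):

import time

start = time.perf_counter()
score = getSimilarity(["chest pain", "chest pain"])
elapsed = time.perf_counter() - start

total_calls = len(symptoms) * len(list_of_combinations)
print(f"one call: {elapsed * 1000:.2f} ms -> "
      f"{total_calls} calls ≈ {elapsed * total_calls:.1f} s total")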
