How can I refactor my Python code to decrease the time complexity?
This code takes 9 seconds to run, which is a long time. I suspect the problem lies in the two loops in my code:
for symptom in symptoms:
    # check if the symptom is mentioned in the user text
    norm_symptom = symptom.replace("_", " ")
    for combin in list_of_combinations:
        print(getSimilarity([combin, norm_symptom]))
        if getSimilarity([combin, norm_symptom]) > 0.25:
            if symptom not in extracted_symptoms:
                extracted_symptoms.append(symptom)
I tried using zip like this:
for symptom, combin in zip(symptoms, list_of_combinations):
    norm_symptom = symptom.replace("_", " ")
    if getSimilarity([combin, norm_symptom]) > 0.25 and symptom not in extracted_symptoms:
        extracted_symptoms.append(symptom)
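One thing worth checking about the zip attempt: zip pairs the two lists element by element and stops at the shorter one, so it does not visit the same pairs as the nested loops. A minimal sketch of the difference, using made-up sample lists (the data below is an assumption, not taken from the question):

```python
from itertools import product

# Hypothetical sample data, only for illustration.
symptoms = ["head_ache", "sore_throat", "fever"]
list_of_combinations = ["head ache", "high fever"]

# zip pairs items by position and stops at the shorter list.
zip_pairs = list(zip(symptoms, list_of_combinations))

# product yields every (symptom, combination) pair, like the nested loops.
all_pairs = list(product(symptoms, list_of_combinations))

print(len(zip_pairs))  # number of position-wise pairs
print(len(all_pairs))  # number of pairs the nested loops visit
```

So the zip version is faster only because it compares far fewer pairs, and it would likely miss symptoms that the nested loops find.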
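Use a dictionary to store the getSimilarity result for each combination and symptom. That way you avoid calling getSimilarity more than once for the same combination and symptom, which makes the code more efficient and therefore faster.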
import collections

similarity_results = collections.defaultdict(dict)
for symptom in symptoms:
    norm_symptom = symptom.replace("_", " ")
    for combin in list_of_combinations:
        # Check if the similarity has already been computed
        if combin in similarity_results[symptom]:
            similarity = similarity_results[symptom][combin]
        else:
            similarity = getSimilarity([combin, norm_symptom])
            similarity_results[symptom][combin] = similarity
        if similarity > 0.25:
            if symptom not in extracted_symptoms:
                extracted_symptoms.append(symptom)
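A complementary idea, sketched here under assumptions: once one combination exceeds the threshold for a symptom, the remaining combinations for that symptom do not need to be checked, so the inner loop can stop early. The function get_similarity_stub below is a hypothetical stand-in (a simple word-overlap score) because the real getSimilarity is not shown in the question; extract_symptoms is likewise an illustrative name.

```python
def get_similarity_stub(pair):
    """Hypothetical stand-in for getSimilarity: Jaccard overlap of word sets."""
    a, b = (set(text.lower().split()) for text in pair)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def extract_symptoms(symptoms, list_of_combinations,
                     similarity=get_similarity_stub, threshold=0.25):
    """Return symptoms whose normalized name matches at least one combination."""
    extracted = []
    seen = set()  # set membership is O(1), unlike `in` on a list
    for symptom in symptoms:
        if symptom in seen:
            continue  # already extracted, no need to compare again
        norm_symptom = symptom.replace("_", " ")
        # any() stops at the first combination above the threshold,
        # and each pair is scored once (no separate print call).
        if any(similarity([combin, norm_symptom]) > threshold
               for combin in list_of_combinations):
            seen.add(symptom)
            extracted.append(symptom)
    return extracted


print(extract_symptoms(["head_ache", "sore_throat", "head_ache"],
                       ["head ache", "mild fever"]))
```

The worst case is still one call per pair, but symptoms that match early cost only a few calls, and dropping the extra print call alone roughly halves the number of getSimilarity invocations in the original loop.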
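In fact, your algorithm is slow because of the two nested loops.
It runs in O(N*M) time (see more here: https://www.freecodecamp.org/news/big-o-notation-why-it-matters-and-why-it-doesnt-1674cfa8a23c/), where N is the length of symptoms and M is the length of list_of_combinations.
The computation of getSimilarity itself may also be taking time. What does this operation do?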
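To see how much of the run time comes from the number of calls and how much from getSimilarity itself, one option is to count the calls and time the loop. This is only a rough sketch: get_similarity_stub is a hypothetical placeholder for the real getSimilarity, and count_and_time is an illustrative helper name.

```python
import time


def get_similarity_stub(pair):
    """Hypothetical placeholder for getSimilarity (word-overlap score)."""
    a, b = (set(text.lower().split()) for text in pair)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def count_and_time(symptoms, list_of_combinations, similarity=get_similarity_stub):
    """Run the nested loops once; return (number of calls, elapsed seconds)."""
    calls = 0
    start = time.perf_counter()
    for symptom in symptoms:
        norm_symptom = symptom.replace("_", " ")
        for combin in list_of_combinations:
            similarity([combin, norm_symptom])
            calls += 1
    elapsed = time.perf_counter() - start
    return calls, elapsed


calls, elapsed = count_and_time(["head_ache", "sore_throat", "fever"],
                                ["head ache", "high fever"])
print(f"{calls} calls, {elapsed / calls:.6f} s per call on average")
```

With the real function passed as the similarity argument, the average time per call indicates whether the loops or getSimilarity dominate the 9 seconds.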