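Python - reading text file into dictionary
I have a long list of terms that I want to extract from a text file and sort into one of the following groups: Animal, Art, Buildings, Vehicle, Person, People, Food, Glass, Bottle, Signage, Slogan, DJ, Party. I currently have four words in my tester2 file:
plate
pizza
fearns
mixer
Here is my code: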
keyword_dictionary = {
    'Animal' : ['animal', 'dog', 'cat'],
    'Art' : ['art', 'sculpture', 'fearns'],
    'Buildings' : ['building', 'architecture', 'gothic', 'skyscraper'],
    'Vehicle' : ['car','formula','f-1','f1','f 1','f one','f-one','moped','mo ped','mo-ped','scooter'],
    'Person' : ['person','dress','shirt','woman','man','attractive','adult','smiling','sleeveless','halter','spectacles','button','bodycon'],
    'People' : ['people','women','men','attractive','adults','smiling','group','two','three','four','five','six','seven','eight','nine','ten','2','3','4','5','6','7','8','9','10'],
    'Food' : ['food','plate','chicken','steak','pizza','pasta','meal','asian','beef','cake','candy','food pyramid','spaghetti','curry','lamb','sushi','meatballs','biscuit','apples','meat','mushroom','jelly', 'sorbet','nacho','burrito','taco','cheese'],
    'Glass' : ['glass','drink','container','glasses','cup'],
    'Bottle' : ['bottle','drink'],
    'Signage' : ['sign','martini','ad','advert','card','bottles','logo','mat','chalkboard','blackboard'],
    'Slogan' : ['Luck is overrated'],
    'DJ' : ['dj','disc','jockey','mixer','instrument','turntable'],
    'Party' : ['party']
}
y = 0
while (y < 1):
    try:
        def search(keywords, searchFor):
            for item in keywords:
                for terms in keywords[item]:
                    if searchFor in terms:
                        print item

        with open("C:/Users/USERNAME/Desktop/tester2.txt") as termsdesk:
            for line in termsdesk:
                this = search (keyword_dictionary, line)
                this2 = str(this)
                #print this2
                #print item
    except KeyError:
        break
    y = y+1
My results should look like this:
Food
Food
Art
DJ
But I get this instead:
DJ
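I guess this is because something is wrong with my loop. Does anyone know what I need to change? I have tried moving the "while (y < 1)", but I could not get the result I wanted.
Strip leading and trailing whitespace from the search term. Each line read from the file keeps its trailing newline, so a term such as 'plate\n' never matches 'plate'; that is presumably why only the last line, which has no newline after it, produced a match. The following works as expected:
def search(keywords, searchFor):
    for key, words in keywords.iteritems():
        if searchFor in words:
            print key

with open("tester2.txt") as termsdesk:
    for line in termsdesk:
        this = search(keyword_dictionary, line.strip())
        this2 = str(this)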
$ cat tester2.txt
plate
pizza
fearns
mixer
$ python test4.py
Food
Food
Art
DJ
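To see why stripping matters, here is a minimal sketch that simulates the file in memory. It is written for Python 3 (so print is a function, unlike the Python 2 code in this thread), and the names fake_file and known_words are only illustrative, not part of the original code. It assumes the file has no newline after the last word, as the observed output suggests.

```python
import io

# Stand-in for tester2.txt (assumption: no newline after the last word).
fake_file = io.StringIO("plate\npizza\nfearns\nmixer")

# Iterating over a file yields each line with its trailing newline kept.
raw_lines = list(fake_file)
print(raw_lines)

known_words = ['plate', 'pizza', 'fearns', 'mixer']

# Without strip(), only the last line (no trailing newline) matches.
matches_raw = [line in known_words for line in raw_lines]

# With strip(), every line matches.
matches_stripped = [line.strip() in known_words for line in raw_lines]

print(matches_raw)
print(matches_stripped)
```

Comparing the two result lists shows the same pattern as in the question: the raw lines match only for the final word, while the stripped lines match for all four.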
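Also, if you expect the number of search terms to be large relative to the size of the dictionary, you may want to improve performance: you can build a reverse mapping from each word to its category. For example, convert:
keyword_dict = {'DJ': ['mixer', 'speakers']}
into
category_dict = {
    'mixer': 'DJ',
    'speakers': 'DJ'
}
The reverse mapping can be built once at the start and then reused for every query, so your search function becomes just category_dict[term]. Lookups are then faster (O(1) on average) and the code is easier to write.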
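As a rough sketch of that idea, the reverse mapping can be built with a plain loop and queried with dict.get so that unknown terms do not raise KeyError. This is Python 3, and build_category_dict and lookup are hypothetical helper names introduced here for illustration; the 'Food' entry is just a small sample taken from the question's dictionary.

```python
# Small sample of the category dictionary (illustrative subset).
keyword_dict = {
    'DJ': ['mixer', 'speakers'],
    'Food': ['plate', 'pizza'],
}

def build_category_dict(keywords):
    """Invert {category: [words]} into {word: category}."""
    category_dict = {}
    for category, words in keywords.items():
        for word in words:
            # If a word appears under several categories, the last one wins.
            category_dict[word] = category
    return category_dict

# Build the reverse mapping once, up front.
category_dict = build_category_dict(keyword_dict)

def lookup(term, default=None):
    """Return the category for a term, or `default` if it is unknown."""
    return category_dict.get(term.strip(), default)

print(lookup('mixer\n'))
print(lookup('unknown'))
```

Using get with a default is a design choice: category_dict[term] is shorter, but it raises KeyError for any word that is not in the dictionary.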
The following approach makes more sense:
keyword_dictionary = {
    'Animal' : ['animal', 'dog', 'cat'],
    'Art' : ['art', 'sculpture', 'fearns'],
    'Buildings' : ['building', 'architecture', 'gothic', 'skyscraper'],
    'Vehicle' : ['car','formula','f-1','f1','f 1','f one','f-one','moped','mo ped','mo-ped','scooter'],
    'Person' : ['person','dress','shirt','woman','man','attractive','adult','smiling','sleeveless','halter','spectacles','button','bodycon'],
    'People' : ['people','women','men','attractive','adults','smiling','group','two','three','four','five','six','seven','eight','nine','ten','2','3','4','5','6','7','8','9','10'],
    'Food' : ['food','plate','chicken','steak','pizza','pasta','meal','asian','beef','cake','candy','food pyramid','spaghetti','curry','lamb','sushi','meatballs','biscuit','apples','meat','mushroom','jelly', 'sorbet','nacho','burrito','taco','cheese'],
    'Glass' : ['glass','drink','container','glasses','cup'],
    'Bottle' : ['bottle','drink'],
    'Signage' : ['sign','martini','ad','advert','card','bottles','logo','mat','chalkboard','blackboard'],
    'Slogan' : ['Luck is overrated'],
    'DJ' : ['dj','disc','jockey','mixer','instrument','turntable'],
    'Party' : ['party']
}
terms = {v2 : k for k, v in keyword_dictionary.items() for v2 in v}

with open('input.txt', 'r') as f_input:
    for word in f_input:
        print terms[word.strip()]
First, you take your existing dictionary and invert it, so that each word can be looked up directly.
This gives you the following output:
Food
Food
Art
DJ
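One caveat with inverting the dictionary this way: some words in the question's dictionary appear under more than one category (for example 'drink' is listed under both Glass and Bottle), and a plain word-to-category mapping keeps only one of them. If all categories are wanted, a possible variation is to map each word to a list of categories. The sketch below is Python 3; invert_keep_all is a hypothetical helper name and sample is a small subset of the original dictionary.

```python
from collections import defaultdict

# Subset of the question's dictionary where 'drink' appears twice.
sample = {
    'Glass': ['glass', 'drink', 'cup'],
    'Bottle': ['bottle', 'drink'],
}

def invert_keep_all(keywords):
    """Invert {category: [words]} into {word: [categories]}."""
    word_to_categories = defaultdict(list)
    for category, words in keywords.items():
        for word in words:
            word_to_categories[word].append(category)
    # Convert to a plain dict so missing words raise KeyError
    # instead of silently creating empty entries.
    return dict(word_to_categories)

word_to_categories = invert_keep_all(sample)

# 'drink' keeps both of its categories; 'cup' has just one.
print(word_to_categories['drink'])
print(word_to_categories['cup'])
```

Whether a word should report one category or several depends on what the results are used for, so treat this as an option rather than a required change.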