How to search through a nested list with a dictionary in Python?
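In Python 3.6 I have a list like the one below, and I can't work out how to search its values correctly. Given the search string below, I need to search the values of title and tags and return the id of whichever image has the most matches. If several different images (ids) have the same number of matches, the one whose title comes first alphabetically should be returned. The search should also be case-insensitive. In the code, search holds the terms to look for; it should return the first id value, but it returns a different value.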
image_info = [
    {
        "id" : "34694102243_3370955cf9_z",
        "title" : "Eastern",
        "flickr_user" : "Sean Davis",
        "tags" : ["Los Angeles", "California", "building"]
    },
    {
        "id" : "37198655640_b64940bd52_z",
        "title" : "Spreetunnel",
        "flickr_user" : "Jens-Olaf Walter",
        "tags" : ["Berlin", "Germany", "tunnel", "ceiling"]
    },
    {
        "id" : "34944112220_de5c2684e7_z",
        "title" : "View from our rental",
        "flickr_user" : "Doug Finney",
        "tags" : ["Mexico", "ocean", "beach", "palm"]
    },
    {
        "id" : "36140096743_df8ef41874_z",
        "title" : "Someday",
        "flickr_user" : "Thomas Hawk",
        "tags" : ["Los Angeles", "Hollywood", "California", "Volkswagen", "Beatle", "car"]
    }
]
my_counter = 0
search = "CAT IN BUILding"
search = search.lower().split()
matches = {}
for image in image_info:
    for word in search:
        word = word.lower()
        if word in image["title"].lower().split(" "):
            my_counter += 1
            print(my_counter)
        if word in image["tags"]:
            my_counter += 1
            print(my_counter)
    if my_counter > 0:
        matches[image["id"]] = my_counter
    my_counter = 0
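You are creating a new entry in the dictionary with matches[image["id"]] = my_counter. If you want to keep only one entry in the dictionary for that search term, holding both the image id and the count, I have modified your dictionary and the condition accordingly. Hope this helps.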
my_counter = 0
search_term = "CAT IN BUILding"
search = search_term.lower().split()
matches = {}
matches[search_term] = {}
for image in image_info:
    for word in search:
        if word in image["title"].lower().split(" "):
            my_counter += 1
            print(my_counter)
        # Lowercase the tags so the comparison is case-insensitive
        if word in [tag.lower() for tag in image["tags"]]:
            my_counter += 1
            print(my_counter)
    if my_counter > 0:
        best = matches[search_term]
        # dict.values() is not subscriptable in Python 3, so compare against max()
        if not best or my_counter > max(best.values()):
            # Replace the entry rather than add to it, so only the best image is kept
            matches[search_term] = {image["id"]: my_counter}
    my_counter = 0
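For reference, the full requirement from the question (most matches wins, ties go to the alphabetically first title, comparison is case-insensitive) could be sketched as a single function. The names `score` and `best_match` are hypothetical and not from the original post, the data is a shortened copy of `image_info` without `flickr_user`, and tags are compared as whole lowercase strings, which is an assumption about what should count as a tag match.

```python
image_info = [
    {"id": "34694102243_3370955cf9_z", "title": "Eastern",
     "tags": ["Los Angeles", "California", "building"]},
    {"id": "37198655640_b64940bd52_z", "title": "Spreetunnel",
     "tags": ["Berlin", "Germany", "tunnel", "ceiling"]},
    {"id": "34944112220_de5c2684e7_z", "title": "View from our rental",
     "tags": ["Mexico", "ocean", "beach", "palm"]},
    {"id": "36140096743_df8ef41874_z", "title": "Someday",
     "tags": ["Los Angeles", "Hollywood", "California", "Volkswagen", "Beatle", "car"]},
]


def score(image, words):
    """Count how many search words appear among the title words or the tags."""
    title_words = image["title"].lower().split()
    tags = [tag.lower() for tag in image["tags"]]  # assumption: whole-tag match
    return sum((word in title_words) + (word in tags) for word in words)


def best_match(images, search_term):
    """Return the id of the best-scoring image, or None if nothing matches."""
    words = search_term.lower().split()
    scored = [(score(image, words), image) for image in images]
    scored = [(count, image) for count, image in scored if count > 0]
    if not scored:
        return None
    # Highest count first; ties are broken by the alphabetically first title.
    _, image = min(scored, key=lambda pair: (-pair[0], pair[1]["title"].lower()))
    return image["id"]


print(best_match(image_info, "CAT IN BUILding"))
```

Sorting on the pair (negative count, lowercase title) lets a single `min` call handle both the ranking and the tie-break, so no separate counter or dictionary needs to be reset between images.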
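Here is a variation of the code in which I try to pre-index the data before doing the search. It is a very basic implementation of how CloudSearch or ElasticSearch index and search.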
import itertools
from collections import Counter
image_info = [
    {
        "id" : "34694102243_3370955cf9_z",
        "title" : "Eastern",
        "flickr_user" : "Sean Davis",
        "tags" : ["Los Angeles", "California", "building"]
    },
    {
        "id" : "37198655640_b64940bd52_z",
        "title" : "Spreetunnel",
        "flickr_user" : "Jens-Olaf Walter",
        "tags" : ["Berlin", "Germany", "tunnel", "ceiling"]
    },
    {
        "id" : "34944112220_de5c2684e7_z",
        "title" : "View from our rental",
        "flickr_user" : "Doug Finney",
        "tags" : ["Mexico", "ocean", "beach", "palm"]
    },
    {
        "id" : "36140096743_df8ef41874_z",
        "title" : "Someday",
        "flickr_user" : "Thomas Hawk",
        "tags" : ["Los Angeles", "Hollywood", "California", "Volkswagen", "Beatle", "car"]
    }
]
my_counter = 0
search = "CAT IN BUILding california"
search = set(search.lower().split())
matches = {}
index = {}
# Building a rudimentary search index
for info in image_info:
    bag = info["title"].lower().split(" ")
    # we want to be able to hit "los angeles" as well as "los" and "angeles"
    tags = [t.lower().split(" ") for t in info["tags"]]
    tags = list(itertools.chain.from_iterable(tags))
    for k in (bag + tags):
        if k in index:
            index[k].append(info["id"])
        else:
            index[k] = [info["id"]]
#print(index)
hits = []
for s in search:
    if s in index:
        hits += index[s]
print(Counter(hits).most_common(1)[0][0])
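The indexing approach above can also be wrapped in reusable functions, so the index is built once and queried many times. The sketch below is one possible arrangement, not the answerer's code: `build_index`, `search_index`, and the `titles` lookup are hypothetical names, the data is a shortened copy of `image_info`, and the tie-break by title and the `None` result for an empty search are additions taken from the question's requirements.

```python
from collections import Counter, defaultdict

image_info = [
    {"id": "34694102243_3370955cf9_z", "title": "Eastern",
     "tags": ["Los Angeles", "California", "building"]},
    {"id": "37198655640_b64940bd52_z", "title": "Spreetunnel",
     "tags": ["Berlin", "Germany", "tunnel", "ceiling"]},
    {"id": "34944112220_de5c2684e7_z", "title": "View from our rental",
     "tags": ["Mexico", "ocean", "beach", "palm"]},
    {"id": "36140096743_df8ef41874_z", "title": "Someday",
     "tags": ["Los Angeles", "Hollywood", "California", "Volkswagen", "Beatle", "car"]},
]


def build_index(images):
    """Map each lowercase word from titles and tags to the ids that contain it."""
    index = defaultdict(list)
    for info in images:
        words = info["title"].lower().split()
        for tag in info["tags"]:
            words.extend(tag.lower().split())  # "Los Angeles" -> "los", "angeles"
        for word in words:
            index[word].append(info["id"])
    return index


def search_index(index, titles, search_term):
    """Return the id with the most hits; ties go to the first title alphabetically."""
    hits = Counter()
    for word in set(search_term.lower().split()):
        hits.update(index.get(word, []))  # .get avoids creating empty index entries
    if not hits:
        return None  # most_common(1)[0] would raise IndexError here
    return min(hits, key=lambda image_id: (-hits[image_id], titles[image_id].lower()))


index = build_index(image_info)
titles = {info["id"]: info["title"] for info in image_info}
print(search_index(index, titles, "CAT IN BUILding california"))
```

Unlike `most_common(1)`, which on a tie returns whichever id was counted first, the `min` key here makes the tie-break explicit.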