
How to search through a nested list with a dictionary in Python?

In Python 3.6, I have a list like the one below and can't figure out how to search through its values properly. Given the search string below, I need to check each word against both the title and the tags of every image and return the id of the image with the most matches. If several images (ids) have the same number of matches, the one whose title comes first alphabetically should be returned. The search should also be case-insensitive. In the code below, search holds the search term, and it should return the first id value, but it returns different values instead.

image_info = [
{
    "id" : "34694102243_3370955cf9_z",
    "title" : "Eastern",
    "flickr_user" : "Sean Davis",
    "tags" : ["Los Angeles", "California", "building"]
},
{
    "id" : "37198655640_b64940bd52_z",
    "title" : "Spreetunnel",
    "flickr_user" : "Jens-Olaf Walter",
    "tags" : ["Berlin", "Germany", "tunnel", "ceiling"]
},
{
    "id" : "34944112220_de5c2684e7_z",
    "title" : "View from our rental",
    "flickr_user" : "Doug Finney",
    "tags" : ["Mexico", "ocean", "beach", "palm"]
},
{
    "id" : "36140096743_df8ef41874_z",
    "title" : "Someday",
    "flickr_user" : "Thomas Hawk",
    "tags" : ["Los Angeles", "Hollywood", "California", "Volkswagen", "Beatle", "car"]
}

]

my_counter = 0
search = "CAT IN BUILding"
search = search.lower().split()
matches = {}

for image in image_info:
    for word in search:
        word = word.lower()
        if word in image["title"].lower().split(" "):
            my_counter += 1
            print(my_counter)
        if word in image["tags"]:
            my_counter +=1
            print(my_counter)
    if my_counter > 0:
        matches[image["id"]] = my_counter
        my_counter = 0

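You are creating a new entry in the dictionary with matches[image["id"]] = my_counter for every image that matches. If you want to keep only one entry per search term, holding the image id and its count, nest a dictionary under the search term and update it only when the count is higher. I have modified your dict and condition accordingly. Hope it helps.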

my_counter = 0
search_term = "CAT IN BUILding"
search = search_term.lower().split()
matches = {}
matches[search_term] = {}

for image in image_info:
    title_words = image["title"].lower().split()
    tags = [tag.lower() for tag in image["tags"]]  # compare tags case-insensitively too
    for word in search:
        if word in title_words:
            my_counter += 1
        if word in tags:
            my_counter += 1
    if my_counter > 0:
        # dict.values() is a view in Python 3 and cannot be indexed, so use max()
        best_so_far = max(matches[search_term].values(), default=0)
        if my_counter > best_so_far:
            # replace the previous entry so only the best image is kept
            matches[search_term] = {image["id"]: my_counter}
    my_counter = 0
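
The question also asks for ties to be broken by title, alphabetically, which the loop above does not cover. Here is a minimal sketch of one way to do it. The function name best_match and the short sample ids are hypothetical and used only for illustration. Like the question's code, it treats each tag as a whole string, so "los" alone does not match the tag "Los Angeles".

```python
def best_match(images, query):
    """Return the id of the image whose title and tags match the most query words.

    Ties are broken by title, alphabetically and case-insensitively.
    Returns None when nothing matches.
    """
    words = query.lower().split()
    best = None  # (negative count, lowercased title, id) of the best image so far
    for image in images:
        title_words = image["title"].lower().split()
        tags = [tag.lower() for tag in image["tags"]]
        # A word scores once for a title hit and once for a tag hit, as in the question.
        count = sum((w in title_words) + (w in tags) for w in words)
        if count == 0:
            continue
        # Smaller key wins: higher count first, then alphabetically earlier title.
        key = (-count, image["title"].lower(), image["id"])
        if best is None or key < best:
            best = key
    return best[2] if best else None


# Hypothetical sample data with shortened ids.
sample = [
    {"id": "a1", "title": "Eastern", "tags": ["Los Angeles", "California", "building"]},
    {"id": "b2", "title": "Someday", "tags": ["Hollywood", "California", "car"]},
    {"id": "c3", "title": "Canyon", "tags": ["California"]},
]

print(best_match(sample, "CAT IN BUILding"))  # a1: only image with a "building" tag
print(best_match(sample, "california"))       # c3: three-way tie, "Canyon" sorts first
```

Building a tuple key keeps the "most matches, then alphabetical title" rule in one comparison instead of nested if statements.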

This is a variation of the code in which I pre-index the data before searching. It is a very rudimentary implementation of how CloudSearch or ElasticSearch would index and search.

import itertools
from collections import Counter
image_info = [
{
    "id" : "34694102243_3370955cf9_z",
    "title" : "Eastern",
    "flickr_user" : "Sean Davis",
    "tags" : ["Los Angeles", "California", "building"]
},
{
    "id" : "37198655640_b64940bd52_z",
    "title" : "Spreetunnel",
    "flickr_user" : "Jens-Olaf Walter",
    "tags" : ["Berlin", "Germany", "tunnel", "ceiling"]
},
{
    "id" : "34944112220_de5c2684e7_z",
    "title" : "View from our rental",
    "flickr_user" : "Doug Finney",
    "tags" : ["Mexico", "ocean", "beach", "palm"]
},
{
    "id" : "36140096743_df8ef41874_z",
    "title" : "Someday",
    "flickr_user" : "Thomas Hawk",
    "tags" : ["Los Angeles", "Hollywood", "California", "Volkswagen", "Beatle", "car"]
}
]

search = "CAT IN BUILding california"
search = set(search.lower().split())

index = {}


# Building a rudimentary search index
for info in image_info:
    bag = info["title"].lower().split(" ")
    tags = [t.lower().split(" ") for t in info["tags"]] # we want to be able to hit "los angeles" as well as "los" and "angeles"
    tags = list(itertools.chain.from_iterable(tags))
    for k in (bag + tags):
        if k in index:
            index[k].append(info["id"])
        else:
            index[k] = [info["id"]]

#print(index)

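hits = []

for s in search:
    if s in index:
        hits += index[s]

# most_common(1) returns an empty list when nothing matched, so guard the lookup
best = Counter(hits).most_common(1)
print(best[0][0] if best else None)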
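
The same pre-indexing idea can be wrapped in two small functions so the index is built once and reused across searches. This is only a sketch: build_index and search_index are hypothetical names, and the sample ids are shortened for illustration. Unlike the code above, it stores ids in sets, so a word that appears in both the title and a tag of one image counts once. It does not resolve ties in any particular order.

```python
from collections import Counter, defaultdict


def build_index(images):
    """Map each lowercased word from titles and tags to the set of image ids containing it."""
    index = defaultdict(set)
    for image in images:
        words = image["title"].lower().split()
        for tag in image["tags"]:
            words.extend(tag.lower().split())  # "Los Angeles" -> "los", "angeles"
        for word in words:
            index[word].add(image["id"])
    return index


def search_index(index, query):
    """Return the id matching the most distinct query words, or None if nothing matches."""
    hits = Counter()
    for word in set(query.lower().split()):
        # .get() avoids creating empty entries in the defaultdict for unknown words
        hits.update(index.get(word, ()))
    if not hits:
        return None
    return hits.most_common(1)[0][0]


# Hypothetical sample data with shortened ids.
sample = [
    {"id": "a1", "title": "Eastern", "tags": ["Los Angeles", "California", "building"]},
    {"id": "b2", "title": "Someday", "tags": ["Los Angeles", "Hollywood", "car"]},
]

index = build_index(sample)
print(search_index(index, "CAT IN BUILding california"))  # a1
print(search_index(index, "hollywood angeles"))           # b2
```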


 