How can I implement a fuzzy search across every value of each dictionary in a list of dictionaries?

I have a list of dictionaries. I am trying to implement a 'fuzzy' search of said dictionary values and have the full dictionary returned.

Therefore, if I have a list of dicts as follows:

[
{"Name":"Arnold", "Age":"52", "Height":"160"}, 
{"Name":"Donald", "Age":"52", "Height":"161"}, 
{"Name":"Trevor", "Age":"22", "Height":"150"}
]

A search term of "nol" should return:

{"Name":"Arnold", "Age":"52", "Height":"160"} 

While a search term of "52" should return:

{"Name":"Arnold", "Age":"52", "Height":"160"} 
{"Name":"Donald", "Age":"52", "Height":"161"}

I understand that I can search for values at a particular key using iteritems. What I'm not clear on is how to search across all the values in a dictionary (without knowing the key names) and then return that dictionary if any value matches. Is this possible in Python?

You can use a list comprehension with any(), something like:

>>> l = [
... {"Name":"Arnold", "Age":"52", "Height":"160"}, 
... {"Name":"Donald", "Age":"52", "Height":"161"}, 
... {"Name":"Trevor", "Age":"22", "Height":"150"}
... ]
>>>
>>> [d for d in l if any("nol" in v for v in d.values())]
[{'Age': '52', 'Name': 'Arnold', 'Height': '160'}]
>>>
>>> [d for d in l if any("52" in v for v in d.values())]
[{'Age': '52', 'Name': 'Arnold', 'Height': '160'}, {'Age': '52', 'Name': 'Donald', 'Height': '161'}]
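
The comprehension above assumes every value is a string; with an integer value such as 52, the `in` test would raise a TypeError. A minimal sketch of the same idea wrapped in a reusable function follows. The names search_dicts, people and ignore_case are hypothetical and not part of the original answer, and the str() conversion is an added assumption about how non-string values should be handled.

```python
def search_dicts(records, term, ignore_case=True):
    """Return every dict in records where term is a substring of any value."""
    if ignore_case:
        term = term.lower()
    matches = []
    for record in records:
        for value in record.values():
            # Assumption: non-string values are compared by their str() form.
            text = str(value)
            if ignore_case:
                text = text.lower()
            if term in text:
                matches.append(record)
                break  # one matching value is enough for this record
    return matches


# Hypothetical data: same people as above, but with integer ages and heights.
people = [
    {"Name": "Arnold", "Age": 52, "Height": 160},
    {"Name": "Donald", "Age": 52, "Height": 161},
    {"Name": "Trevor", "Age": 22, "Height": 150},
]

print(search_dicts(people, "NOL"))  # matches Arnold only
print(search_dicts(people, "52"))   # matches Arnold and Donald
```

The whole dictionary is returned for each match, which is what the question asks for.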

Another slightly different option:

searchTerm = "nol"
unusedCharacter = "\n"  # Separator: this should be a character that will never appear in your search string.
# A generator expression, so the list l (defined above) is searched lazily instead of all at once:
results = (d for d in l if searchTerm in unusedCharacter.join(d.values()))

# Produce a limited number of results:
limitedResults = []
maxResults = 5
for k, result in enumerate(results):
    if k == maxResults:
        break
    limitedResults.append(result)
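
To illustrate why the separator matters, here is a small sketch under a few assumptions: the function name joined_search and the sample term "d52" are hypothetical, and the false match relies on dictionaries keeping insertion order (Python 3.7 and later). With an empty separator, a term can match across the boundary between two values; with "\n" it cannot. itertools.islice is shown as one possible shorter way to cap the number of results.

```python
import itertools

people = [
    {"Name": "Arnold", "Age": "52", "Height": "160"},
    {"Name": "Donald", "Age": "52", "Height": "161"},
    {"Name": "Trevor", "Age": "22", "Height": "150"},
]


def joined_search(records, term, separator="\n"):
    """Lazily yield records whose joined values contain term."""
    return (r for r in records if term in separator.join(r.values()))


# With no separator, "Arnold" + "52" becomes "Arnold52...", so "d52" matches
# even though no single value contains it.
false_hits = list(joined_search(people, "d52", separator=""))

# With the separator in place, the same term matches nothing.
real_hits = list(joined_search(people, "d52"))

# islice stops after two results without building the full list first.
limited = list(itertools.islice(joined_search(people, "5"), 2))

print(len(false_hits), len(real_hits), len(limited))
```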

Here is my version, which doesn't save all the results at the same time in a list, but instead generates them as needed.

import itertools

database = [
    {"Name":"Arnold", "Age":"52", "Height":"160"}, 
    {"Name":"Donald", "Age":"52", "Height":"161"}, 
    {"Name":"Trevor", "Age":"22", "Height":"150"},
]

def search(s):
    s = s.lower() # it is a nice feature to ignore case
    for item in database:
        if any(s in v.lower() for v in item.values()): # if any value contains s
            yield item # spit out the item — this is a generator function

# iterate over at most the first 5 results
for result in itertools.islice(search("52"), 5):
    print(result)

Output:

{'Height': '160', 'Age': '52', 'Name': 'Arnold'}
{'Height': '161', 'Age': '52', 'Name': 'Donald'}
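
To make the "generates them as needed" point concrete, the sketch below records which entries the generator actually inspects. The examined list and the records parameter are hypothetical additions for illustration only; the matching logic is otherwise the same as in the answer above.

```python
database = [
    {"Name": "Arnold", "Age": "52", "Height": "160"},
    {"Name": "Donald", "Age": "52", "Height": "161"},
    {"Name": "Trevor", "Age": "22", "Height": "150"},
]

examined = []  # hypothetical log of the names the generator has looked at


def search(records, s):
    s = s.lower()
    for item in records:
        examined.append(item["Name"])  # note that this record was inspected
        if any(s in v.lower() for v in item.values()):
            yield item


# Ask for the first match only; None is returned if nothing matches.
first = next(search(database, "52"), None)

print(first["Name"])  # the first matching record
print(examined)       # only the records inspected before the first match
```

Because the generator pauses at each yield, requesting a single result stops the scan at the first match, and the remaining records are never touched.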
