Multiple criteria search in a list in Python

Question

dataset:

data = [{'name':'kelly', 'attack':5, 'defense':10, 'country':'Germany'}, 
        {'name':'louis', 'attack':21, 'defense': 12, 'country':'france'}, 
        {'name':'ann', 'attack':43, 'defense':9, 'country':'Germany'}]

header = ['name', 'attack', 'defense', 'country']

filter_options = {'attack':4, 'defense':7, 'country':'Germany'}

I would like to write a function that takes data and filter_options as its arguments, i.e. func(data, filter_options).

filter_options should filter by exact match for string values, and for numeric values it should keep only the entries whose value is greater than or equal to the value given in filter_options. For the data above, my answer should be:

answer = [{'name':'kelly', 'attack':5, 'defense':10, 'country':'Germany'},
          {'name':'ann', 'attack':43, 'defense':9, 'country':'Germany'}]

my current code:

search_key_list = [key for key in filter_options.keys()]
header_index_list = [header.index(i) for i in search_key_list if i in header]

answer = []
for i in header_index_list:
    for d in data:
        if type(filter_options[header[i]]) == int or type(filter_options[header[i]]) == float:
            if data[header[i]]>filter_options[header[i]]:
                answer.append(d)
        elif type((filter_options[header[i]])) == str:
            if data[header[i]] == filter_options[header[i]]:
                answer.append(d)

The code is wrong because it does not consider the criteria together. It looks at one criterion, checks which dictionaries fit it, appends them to the answer list, and then moves on to the next criterion.

How can I correct this? Or what other approach would work?

Answer 1

You need to check all of the "filters" and append a dictionary only if every filter matches it:

data = [{'name':'kelly', 'attack':5, 'defense':10, 'country':'Germany'}, 
        {'name':'louis', 'attack':21, 'defense': 12, 'country':'france'}, 
        {'name':'ann', 'attack':43, 'defense':9, 'country':'Germany'}]

header = ['name', 'attack', 'defense', 'country']

filter_options = {'attack':4, 'defense':7, 'country':'Germany'}


def filter_data(data, filter_options):
    answer = []
    for data_dict in data:
        for attr, value in filter_options.items():  # or iteritems
            if isinstance(value, (int, float)):     # isinstance is better than "type(x) == int"!
                if data_dict[attr] < value:         # check if it's NOT a match
                    break                           # stop comparing that dictionary
            elif isinstance(value, str):
                if data_dict[attr] != value:
                    break
        # If there was no "break" during the loop the "else" of the loop will
        # be executed
        else:
            answer.append(data_dict)
    return answer


>>> filter_data(data, filter_options)
[{'attack': 5, 'country': 'Germany', 'defense': 10, 'name': 'kelly'},
 {'attack': 43, 'country': 'Germany', 'defense': 9, 'name': 'ann'}]

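The trick here is that the inner loop checks for a mismatch: a value that is smaller than the filter value (for numbers) or unequal to it (for strings). On the first mismatch it breaks and stops comparing that dictionary. The else clause of the loop runs only when the loop finished without a break, and only then is the dictionary appended.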
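For readers who have not met the for/else construct, here is a minimal standalone sketch of how it behaves. The helper name all_at_least is made up for this illustration and is not part of the answer's code.

```python
def all_at_least(values, threshold):
    """Return True if every value is >= threshold (illustrative helper)."""
    for v in values:
        if v < threshold:
            break          # a failing value was found, so the else is skipped
    else:
        return True        # the loop ran to completion without a break
    return False


print(all_at_least([5, 10], 4))   # True
print(all_at_least([5, 3], 4))    # False
```

Note that the else belongs to the for statement, not to the if inside it, which is why it is indented at the same level as the for.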

Another way without using an else clause for the loop would be:

def is_match(single_data, filter_options):
    for attr, value in filter_options.items():
        if isinstance(value, (int, float)):
            if single_data[attr] < value:
                return False
        elif isinstance(value, str):
            if single_data[attr] != value:
                return False
    return True

def filter_data(data, filter_options):
    answer = []
    for data_dict in data:
        if is_match(data_dict, filter_options):
            answer.append(data_dict)
    return answer

filter_data(data, filter_options)
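The same idea can probably be written more compactly with the built-in all() and a list comprehension. This is only a sketch: the names matches_all and filter_data_all are invented here, and it assumes every filter key exists in every dictionary. Unlike the functions above, it compares any non-numeric filter value by equality instead of ignoring types other than str.

```python
data = [{'name': 'kelly', 'attack': 5, 'defense': 10, 'country': 'Germany'},
        {'name': 'louis', 'attack': 21, 'defense': 12, 'country': 'france'},
        {'name': 'ann', 'attack': 43, 'defense': 9, 'country': 'Germany'}]

filter_options = {'attack': 4, 'defense': 7, 'country': 'Germany'}


def matches_all(record, filter_options):
    """True only if the record satisfies every filter."""
    return all(
        # numbers: record value must be >= filter value; otherwise: exact match
        (record[attr] >= value) if isinstance(value, (int, float))
        else (record[attr] == value)
        for attr, value in filter_options.items()
    )


def filter_data_all(data, filter_options):
    return [record for record in data if matches_all(record, filter_options)]


result = filter_data_all(data, filter_options)
print([d['name'] for d in result])   # ['kelly', 'ann']
```

all() stops at the first filter that fails, so it short-circuits in the same way as the explicit return False.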

You could also use a generator function instead of manual appends (based on the first approach):

def filter_data(data, filter_options):
    for data_dict in data:
        for attr, value in filter_options.items():
            if isinstance(value, (int, float)):
                if data_dict[attr] < value: 
                    break          
            elif isinstance(value, str):
                if data_dict[attr] != value:
                    break
        else:
            yield data_dict

However, that requires converting the result to a list afterwards:

>>> list(filter_data(data, filter_options))
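The upside of a generator is that it is lazy, so you do not have to build the whole list if you only need some of the matches. A small sketch, using the hypothetical name iter_matches for the generator so it does not clash with the list-returning versions:

```python
data = [{'name': 'kelly', 'attack': 5, 'defense': 10, 'country': 'Germany'},
        {'name': 'louis', 'attack': 21, 'defense': 12, 'country': 'france'},
        {'name': 'ann', 'attack': 43, 'defense': 9, 'country': 'Germany'}]

filter_options = {'attack': 4, 'defense': 7, 'country': 'Germany'}


def iter_matches(data, filter_options):
    """Yield each dictionary that passes every filter."""
    for data_dict in data:
        for attr, value in filter_options.items():
            if isinstance(value, (int, float)):
                if data_dict[attr] < value:
                    break
            elif isinstance(value, str):
                if data_dict[attr] != value:
                    break
        else:
            yield data_dict


# Only the first match is computed; None is returned if nothing matches.
first = next(iter_matches(data, filter_options), None)
print(first['name'])   # kelly

# The generator can also be consumed directly in a loop or comprehension.
names = [d['name'] for d in iter_matches(data, filter_options)]
print(names)           # ['kelly', 'ann']
```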

solution1
0 ACCEPTED 2017-09-02 18:53:53
