How to filter and remove dict elements based on a value threshold?

I have several dictionaries in a list with this structure:

[{'store': 'walmart',
  'store_id': 0,
  'store_info': {'grapes': {'availability': {'No': 1, 'Yes': 1}},
   'tomatoes': {'availability': {'No': 5, 'Yes': 6}},
   'oranges': {'availability': {'No': 2, 'Yes': 2}},
   'bottled water': {'availability': {'No': 10, 'Yes': 5}},
   "india's mangos": {'availability': {'No': 3, 'Yes': 5}},
   'water melon': {'availability': {'No': 2, 'Yes': 2}},
   'lemons': {'availability': {'No': 2, 'Yes': 3}},
   'kiwifruit': {'availability': {'No': 4, 'Yes': 2}},
   'pineapple': {'availability': {'No': 5, 'Yes': 20}},
   'total_yes': 23,
   'total_no': 23,
   'total': 46,
   'id': [3, 4, 6, 2, 1, 6, 1, 4, 2]}},
{'store': 'Costco',
  'store_id': 24,
  'store_info': {'papaya': {'availability': {'No': 1, 'Yes': 1}},
   'lychee': {'availability': {'No': 5, 'Yes': 1}},
   'fig': {'availability': {'No': 2, 'Yes': 2}},
   'blackberry': {'availability': {'No': 2, 'Yes': 5}},
   "india's mangos": {'availability': {'No': 3, 'Yes': 5}},
   'plum': {'availability': {'No': 1, 'Yes': 2}},
   'total_yes': 43,
   'total_no': 3,
   'total': 46,
   'id': [3, 4, 36, 2, 1, 1, 2, 4, 2]}}  
]

How can I keep only the items whose Yes and No values are both greater than or equal to 5? For example, given the list above, the expected output should look like this:

[
{'store': 'walmart',
  'store_id': 0,
  'store_info': {
  'tomatoes': {'availability': {'No': 5, 'Yes': 6}},
  'bottled water': {'availability': {'No': 10, 'Yes': 5}},
  'pineapple': {'availability': {'No': 5, 'Yes': 20}},
  'total_yes': 23,
  'total_no': 23,
  'total': 46,
  'id': [3, 4, 6, 2, 1, 6, 1, 4, 2]}
  }
]

In the above example, "india's mangos": {'availability': {'No': 3, 'Yes': 5}} should be removed because, although its Yes value of 5 meets the threshold, its No value of 3 does not. In contrast, 'pineapple': {'availability': {'No': 5, 'Yes': 20}} should remain in the dict, because both its No value (5) and its Yes value (20) meet the threshold. Finally, the second dict (Costco) should be removed entirely, because none of its items has both values at or above 5.

So far I have tried iterating over the structure, but I end up with too many loops. Is there a more compact way of getting the expected output?

a_lis = []
for e in list_dict:
    try:
        l = list(e['store_info'].keys())
        for i in l:
            #print(e['store_info'][i]['availability'])
            if e['store_info'][i]['availability']['No'] >= 5 and e['store_info'][i]['availability']['Yes'] >= 5:
                a_lis.append(e['store_info'][i]['availability'])
                print(a_lis)
            else:
                pass
    except TypeError:
        pass

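This is not difficult. I would recommend building a new list of the stores to keep and removing the unwanted items from each kept dictionary directly (so the dictionaries in lst are modified in place).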

lst = [{'store': 'walmart',
        'store_id': 0,
        'store_info': {'grapes': {'availability': {'No': 1, 'Yes': 1}},
                       'tomatoes': {'availability': {'No': 5, 'Yes': 6}},
                       'oranges': {'availability': {'No': 2, 'Yes': 2}},
                       'bottled water': {'availability': {'No': 10, 'Yes': 5}},
                       'india\'s mangos': {'availability': {'No': 3, 'Yes': 5}},
                       'water melon': {'availability': {'No': 2, 'Yes': 2}},
                       'lemons': {'availability': {'No': 2, 'Yes': 3}},
                       'kiwifruit': {'availability': {'No': 4, 'Yes': 2}},
                       'pineapple': {'availability': {'No': 5, 'Yes': 20}},
                       'total_yes': 23,
                       'total_no': 23,
                       'total': 46,
                       'id': [3, 4, 6, 2, 1, 6, 1, 4, 2]}},
       {'store': 'Costco',
        'store_id': 24,
        'store_info': {
            'papaya': {'availability': {'No': 1, 'Yes': 1}},
                       'lychee': {'availability': {'No': 5, 'Yes': 1}},
                       'fig': {'availability': {'No': 2, 'Yes': 2}},
                       'blackberry': {'availability': {'No': 2, 'Yes': 5}},
                       'india\'s mangos': {'availability': {'No': 3, 'Yes': 5}},
                       'plum': {'availability': {'No': 1, 'Yes': 2}},
                       'total_yes': 43,
                       'total_no': 3,
                       'total': 46,
                       'id': [3, 4, 36, 2, 1, 1, 2, 4, 2]}}
       ]

result_list = []
for sub_dict in lst:
    info = sub_dict['store_info']
    # Keep a store only if both of its totals reach the threshold.
    if info['total_yes'] >= 5 and info['total_no'] >= 5:
        result_list.append(sub_dict)
        # Collect the item keys where either count is below 5.
        keys_to_remove = [
            k for k, v in info.items()
            if isinstance(v, dict)
            and (v['availability']['Yes'] < 5 or v['availability']['No'] < 5)
        ]
        # Delete them from sub_dict['store_info'] (this modifies lst in place).
        for k in keys_to_remove:
            del info[k]

print(result_list)

Result:

[{
    'store': 'walmart',
    'store_id': 0,
    'store_info': {
        'tomatoes': {
            'availability': {
                'No': 5,
                'Yes': 6
            }
        },
        'bottled water': {
            'availability': {
                'No': 10,
                'Yes': 5
            }
        },
        'pineapple': {
            'availability': {
                'No': 5,
                'Yes': 20
            }
        },
        'total_yes': 23,
        'total_no': 23,
        'total': 46,
        'id': [3, 4, 6, 2, 1, 6, 1, 4, 2]
    }
}]
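
Two things are worth noting about the loop above: it deletes keys from the dictionaries inside lst itself, and it decides which stores to keep from total_yes and total_no rather than from the individual items. If the original data has to stay intact, one option is to work on a deep copy. The following is only a sketch under those assumptions: the function name filter_on_copy and the threshold parameter are hypothetical, and it keeps a store when at least one item survives the filter (which is how the question describes dropping Costco), not by checking the totals.

```python
import copy


def filter_on_copy(stores, threshold=5):
    """Filter items on a deep copy so that `stores` is left untouched.

    Hypothetical helper: name and `threshold` parameter are illustrative.
    """
    result = []
    for store in copy.deepcopy(stores):
        info = store['store_info']
        # Item keys where either count is below the threshold.
        keys_to_remove = [
            k for k, v in info.items()
            if isinstance(v, dict)
            and (v['availability']['Yes'] < threshold
                 or v['availability']['No'] < threshold)
        ]
        for k in keys_to_remove:
            del info[k]
        # Keep the store only if at least one item entry survived.
        if any(isinstance(v, dict) for v in info.values()):
            result.append(store)
    return result
```

Calling filter_on_copy(lst) returns the filtered stores while lst keeps all of its original items, at the cost of copying the whole structure first.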

Here is another approach:

# where data is the input
filtered = []

for store in data:
    avail_dict = {}
    extra_dict = {}
    for item, value in store['store_info'].items():
        if isinstance(value, dict):
            okay = value['availability'].get('No',0) >= 5 and value['availability'].get('Yes',0) >= 5
            if okay:
                avail_dict[item] = value
        else:
            extra_dict[item] = value
    if avail_dict:
        avail_dict.update(extra_dict)
        new_store = dict(store)
        new_store['store_info'] = avail_dict
        filtered.append(new_store)

Result for filtered (input data is unchanged):

[{'store': 'walmart',
  'store_id': 0,
  'store_info': {'tomatoes': {'availability': {'No': 5, 'Yes': 6}},
   'bottled water': {'availability': {'No': 10, 'Yes': 5}},
   'pineapple': {'availability': {'No': 5, 'Yes': 20}},
   'total_yes': 23,
   'total_no': 23,
   'total': 46,
   'id': [3, 4, 6, 2, 1, 6, 1, 4, 2]}}]
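
Since the question asks for something more compact, the same idea can also be written with a dict comprehension. This is a minimal sketch, not a definitive implementation: the names filter_stores and passes and the threshold parameter are hypothetical, and it assumes (as in the sample data) that every dict value under 'store_info' has an 'availability' key.

```python
def filter_stores(data, threshold=5):
    """Return new store dicts containing only items that meet the threshold.

    Hypothetical helper: names and `threshold` parameter are illustrative.
    """
    def passes(value):
        avail = value['availability']
        return (avail.get('No', 0) >= threshold
                and avail.get('Yes', 0) >= threshold)

    filtered = []
    for store in data:
        # Keep non-dict entries (totals, id) and the item dicts that pass.
        kept = {
            k: v for k, v in store['store_info'].items()
            if not isinstance(v, dict) or passes(v)
        }
        # Skip stores where no item dict survived the filter.
        if any(isinstance(v, dict) for v in kept.values()):
            filtered.append({**store, 'store_info': kept})
    return filtered
```

The input list is not modified, but the copies are shallow: the surviving item dicts are the same objects as in the input, so changing them later would affect both.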
