简体   繁体   中英

Find value in the dictionary that matches the condition

I am trying to set a value from the dict if the condition is met. Basically I iterate over the values of my dictionary and check if they fit my condition (then I break to loop which is not the best practice but saves some iterations)

Here is the code I am using:

for (key,value) in zip(mn_zone_dict.keys(), mn_zone_dict.values()):
    if cost < value:
        zone = key
        break

I does it's job but it is relatively slow while I have to check > 10k records so I am looking for some smarter (and maybe more pythonic) way of solving that task. I have seen a function any() but it only returns if there is such entry which matches the conditions without telling which.

I'd be happy to hear your ideas and suggestions.

If you have the data directly as-is, with only a dictionary structure, you will have to iterate over it every time. The best speedup you can get would be to use a comprehension instead of a loop, and dict.items instead of zip :

zones = [k for k, v in my_zone_dict.items() if cost < v]

On the one hand this iterates over the whole dict. On the other, it tells you immediately how many values met the criterion, if any.

The problem here is that no matter how much less overhead a comprehension has than an explicit loop, this is still O(n) for every lookup. The correct solution is to use a different, or complimentary data structure. Since you want value to be greater than something, I'd recommend a max-heap.

Python implements heaps in the heapq module. It is a bit unusual because it does not provide a heap object, just functions to heapify and maintain lists as heaps. Also, only min-heaps are supported, but that's OK because you can always just negate your values:

my_zone_list = [(-v, k) for k, v in my_zone_dict.items()]
heapq.heapify(my_zone_list)

This is a one-time O(n) penalty, which you never have to repeat. Your whole loop now becomes an O(1) operation:

if cost < -my_zone_list[0][0]:
    zone = my_zone_list[0][1]

Inserting new elements has an O(log(n)) cost:

heapq.heappush(my_zone_list, (-new_value, new_key))

As a side note, if you can't introduce new data structures, you might get better performance with

v, zone = max((v, k) for k, v in my_zone_dict.items())
if cost < v: ...

Instead of iterating a zip object, you can iterate directly over dict.items :

for k, v in my_zon_dict.items():
    if cost < v:
        zone = k
        break

This will be more efficient than creating a custom zipped iterable.

You can do it like this:

for k in my_zon_dict.keys():
   if cost < my_zon_dict[k]:
      zone = k
      break

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM