Faster method to conditionally search nested dictionary Python

I have a nested dictionary that I am searching conditionally to grab some values. I iterate through another file and read val1, val2, and val3 from each row. From there I search the dictionary to find an ID based on some conditions.

However, for 55M rows of data this is very expensive. I cannot find a faster way to do this anywhere, and I am running it in a Spark job. I tried to make the search stop as soon as an ID is found, but I'm unsure whether I did this correctly.

It appears that I go through every key in the dictionary to find values, and I am not sure how to optimize this. Any help is appreciated. Here is the code:

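ID = None  # initialize so the check at the bottom cannot raise NameError
for key, val in dict[val1].items():
    # key is a tuple of two names; val2 may match either one
    if key[0] == val2 or key[1] == val2:
        if len(val3) == 1:
            if val3[0] % 2 == 0:
                # even score: compare against the third and fourth key fields
                for key2, ids in val.items():  # renamed so val2 is not overwritten
                    if key2[2] <= val3[0] <= key2[3]:
                        ID = ids[0]
                        break  # stop at the first matching range
            else:
                # odd score: compare against the first and second key fields
                for key2, ids in val.items():
                    if key2[0] <= val3[0] <= key2[1]:
                        ID = ids[0]
                        break
    if ID is not None:
        break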

edit: Input values are like this

val1 = zone#
val2 = 'name'
val3 = score in tuple form like (2,)

and the nested dictionary looks something like this:

{3: defaultdict(<function __main__.<lambda>.<locals>.<lambda>()>,
                         {('jeff', 'jeff A'): defaultdict(list,
                                      {(23,
                                        41,
                                        28,
                                        40,): [61814],

@Gal posted the answer, but here is what the set-up looks like now. It runs over 10 times faster than iterating with the for loop.

# a direct key lookup replaces the loop over every key in dict[val1]
if val2 in dict[val1]:
    if len(val3) == 1:
        if val3[0] % 2 == 0:
            for key2, val5 in dict[val1][val2].items():
                if key2[2] <= val3[0] <= key2[3]:
                    ID = val5[0]

Because val2 can match either of the two names in the key, we create two dicts and run this check once against each. It is much faster now.
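How the two dicts are built is not shown in the post. As a rough sketch, they could be derived once from the original nested dictionary, with one index keyed by the first name and one by the second. Everything named here (`build_name_indexes`, `lookup_id`, `by_first`, `by_second`, `zones`) is hypothetical. The sketch also assumes that each name appears in at most one key per zone; if a name repeats, later entries overwrite earlier ones.

```python
from collections import defaultdict


def build_name_indexes(zones):
    """Split {zone: {(first, second): ranges}} into two per-name indexes."""
    by_first = defaultdict(dict)
    by_second = defaultdict(dict)
    for zone, names_to_ranges in zones.items():
        for (first, second), ranges in names_to_ranges.items():
            by_first[zone][first] = ranges
            by_second[zone][second] = ranges
    return by_first, by_second


def lookup_id(by_first, by_second, val1, val2, val3):
    """Run the same range check against each index; return the first ID found."""
    if len(val3) != 1:
        return None
    score = val3[0]
    # Even scores use key fields 2 and 3, odd scores use fields 0 and 1.
    lo, hi = (2, 3) if score % 2 == 0 else (0, 1)
    for index in (by_first, by_second):
        # .get avoids creating empty entries in the defaultdict on a miss.
        ranges = index.get(val1, {}).get(val2)
        if ranges is None:
            continue
        for range_key, ids in ranges.items():
            if range_key[lo] <= score <= range_key[hi]:
                return ids[0]
    return None


# Sample data shaped like the dictionary in the question.
zones = {3: {("jeff", "jeff A"): {(23, 41, 28, 40): [61814]}}}
by_first, by_second = build_name_indexes(zones)
print(lookup_id(by_first, by_second, 3, "jeff A", (30,)))  # 61814
```

The name match becomes a hash lookup instead of a scan over every key, while the inner loop over the score ranges stays linear.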
