Python: Fast way to update value in dict based on length of text?

I have a list of tuples like the one below. I want to convert it to a dictionary keyed on the second element of each tuple, and when a key appears more than once, keep the text value that is longest:

[('hong kong', 'state'),
 ('hong kong', 'city'),
 ('hong', 'country'),
 ('kong', 'city'),
 ('hong kong', 'country')]

So the desired result would be:

{'state': 'hong kong',
 'city': 'hong kong',
 'country': 'hong kong'}

I have a function that does this, but I'm sure there's a more efficient and Pythonic way. Here's what I've done:

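from collections import defaultdict

def create_dict(l):
    d = defaultdict(list)
    for s in l:
        key = s[1]
        val = s[0]

        # d[key] is an empty list (falsy) the first time a key is seen
        if d[key]:
            if len(val) > len(d[key]):
                d[key] = val
        else:
            d[key] = val

    return d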

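Here is how you can use the built-in sorted() function with a custom key. Sorting by text length puts the longest text for each key last, and because a dict comprehension keeps the last value seen for a duplicate key, the longest one wins: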

lst = [('hong kong', 'state'),
       ('hong kong', 'city'),
       ('hong', 'country'),
       ('kong', 'city'),
       ('hong kong', 'country')]

def create_dict(l):
    sorted_lst = sorted(l, key=lambda x: len(x[0]))
    return {k: v for v, k in sorted_lst}

print(create_dict(lst))

Output:

{'country': 'hong kong', 'city': 'hong kong', 'state': 'hong kong'}
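
The overwrite behavior this relies on is easier to see on a smaller input. A minimal sketch, using made-up `pairs` data that is not from the question:

```python
# Hypothetical sample data, not from the question.
pairs = [('abc', 'x'), ('a', 'x'), ('ab', 'y')]

# Without sorting, the last occurrence of each key wins, whatever its length.
unsorted_result = {k: v for v, k in pairs}
print(unsorted_result)  # {'x': 'a', 'y': 'ab'}

# After sorting by text length, the longest text for each key is seen last.
sorted_result = {k: v for v, k in sorted(pairs, key=lambda p: len(p[0]))}
print(sorted_result)  # {'x': 'abc', 'y': 'ab'}
```

Two things to keep in mind: sorting costs O(n log n) where a single pass over the list is O(n), and because sorted() is stable, when two texts of equal length share a key the one that appears later in the input is kept, whereas the function in the question keeps the earlier one.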

How's this?

lst = [('hong kong', 'state'),
 ('hong kong', 'city'),
 ('hong', 'country'),
 ('kong', 'city'),
 ('hong kong', 'country')]

output = {}
for value, key in lst:
    if len(output.setdefault(key, value)) < len(value):
        output[key] = value
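
The loop leans on dict.setdefault(key, value), which stores value only when key is missing and returns whatever is stored for key afterward. As a sketch, here is the same loop wrapped in a function so the result can be checked; the name `longest_by_key` is my own and not part of the answer:

```python
def longest_by_key(pairs):
    """Map each key to the longest text paired with it (first one wins ties)."""
    output = {}
    for value, key in pairs:
        # setdefault inserts value when key is new and returns the stored text.
        if len(output.setdefault(key, value)) < len(value):
            output[key] = value
    return output

lst = [('hong kong', 'state'),
       ('hong kong', 'city'),
       ('hong', 'country'),
       ('kong', 'city'),
       ('hong kong', 'country')]

print(longest_by_key(lst))
# {'state': 'hong kong', 'city': 'hong kong', 'country': 'hong kong'}
```

This makes a single pass over the list and needs no imports.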

The sorted() approach above by @Ann Zen is cleaner because you don't have to import defaultdict from collections, but this is a somewhat more Pythonic version of your original code:

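from collections import defaultdict

def create_dict(l):
    d = defaultdict(list)
    # Group every text value under its key
    for value, k in l:
        d[k].append(value)
    # Keep only the longest text in each group
    return {k: max(values, key=len) for k, values in d.items()}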

Here we unpack each tuple in the passed list as value, k to build the defaultdict(list), rather than assigning explicitly by index. Then, instead of looping to find the longest string in each list and building the dict in an if/else statement, we pull out the longest string with the max() function, keyed on string length, and wrap it all in a dictionary comprehension that is returned directly. This returns:

{'state': 'hong kong', 'city': 'hong kong', 'country': 'hong kong'}
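
Since the question asks for a fast way, one option is to time the three approaches on your own data with the standard timeit module. The sketch below is only a starting point: the function names `via_sorted`, `via_setdefault`, and `via_max` are labels I made up for the approaches above, and no timings are claimed here because they depend on the input size and the machine.

```python
import timeit
from collections import defaultdict

lst = [('hong kong', 'state'),
       ('hong kong', 'city'),
       ('hong', 'country'),
       ('kong', 'city'),
       ('hong kong', 'country')]

def via_sorted(pairs):
    # Sort by text length so the longest text for each key is assigned last.
    return {k: v for v, k in sorted(pairs, key=lambda p: len(p[0]))}

def via_setdefault(pairs):
    # Single pass, replacing the stored text only when a longer one shows up.
    output = {}
    for value, key in pairs:
        if len(output.setdefault(key, value)) < len(value):
            output[key] = value
    return output

def via_max(pairs):
    # Group texts by key, then pick the longest text in each group.
    groups = defaultdict(list)
    for value, key in pairs:
        groups[key].append(value)
    return {key: max(values, key=len) for key, values in groups.items()}

expected = {'state': 'hong kong', 'city': 'hong kong', 'country': 'hong kong'}

for fn in (via_sorted, via_setdefault, via_max):
    assert fn(lst) == expected  # all three agree on this input
    seconds = timeit.timeit(lambda: fn(lst), number=10_000)
    print(f"{fn.__name__}: {seconds:.4f} s for 10,000 calls")
```

A five-element list is too small to say much, so substitute a list that resembles your real data before drawing conclusions.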


 