
Convert every dictionary value to utf-8 (dictionary comprehension?)

I have a dictionary and I want to convert every value to utf-8. This works, but is there a "more pythonic" way?

for key in row.keys():
    row[key] = unicode(row[key]).encode("utf-8")

For a list I could do

[unicode(s).encode("utf-8") for s in row]

but I'm not sure how to do the equivalent thing for dictionaries.

This is different from Python Dictionary Comprehension because I'm building a new dictionary from an existing one, not from scratch. The answers to the linked question don't show how to loop over the key/value pairs of an existing dictionary and turn them into new k/v pairs. The accepted answer below does, and it is much clearer for someone with a task like mine than the more complex answers to the linked question.

Use a dictionary comprehension. It looks like you're starting with a dictionary so:

mydict = {k: unicode(v).encode("utf-8") for k,v in mydict.iteritems()}

The example for dictionary comprehensions is near the end of the block in the link.

As I had this problem as well, I built a very simple function that encodes the values of any dict to UTF-8, including nested dicts and lists (the problem with the current answer is that it only handles a flat dict).

In case it helps anyone, here is the function:

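def utfy_dict(dic):
    # Encode unicode strings; recurse into dicts and lists; leave other types alone.
    if isinstance(dic, unicode):
        return dic.encode("utf-8")
    elif isinstance(dic, dict):
        # Only values are encoded; the dict is updated in place and returned.
        for key in dic:
            dic[key] = utfy_dict(dic[key])
        return dic
    elif isinstance(dic, list):
        # Lists are rebuilt rather than modified in place.
        return [utfy_dict(e) for e in dic]
    else:
        return dic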
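
The function above targets Python 2 and updates dictionaries in place. As a rough sketch of the same recursive idea in Python 3, the variant below returns new containers instead of mutating its input. The name `utfy` and the sample data are made up for illustration; they are not part of the original answer.

```python
def utfy(obj):
    """Recursively encode every str found in dicts and lists to UTF-8 bytes."""
    if isinstance(obj, str):
        return obj.encode("utf-8")
    if isinstance(obj, dict):
        # Build a new dict; keys are left as they are, only values are encoded.
        return {key: utfy(value) for key, value in obj.items()}
    if isinstance(obj, list):
        return [utfy(item) for item in obj]
    # Numbers, None, bytes, etc. are returned unchanged.
    return obj


# Hypothetical nested input
nested = {"a": "é", "b": ["x", {"c": "y"}], "n": 1}
result = utfy(nested)
print(result)
```

Because new containers are returned, `nested` still holds its original strings afterwards, which can be easier to reason about than in-place updates.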

A Python 3 version of the answer given by That1Guy:

{k: str(v).encode("utf-8") for k,v in mydict.items()}
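
One thing to watch with the Python 3 comprehension: calling `str()` on a value that is already `bytes` gives its repr (for example `"b'abc'"`), not the decoded text. The sketch below is one possible way to guard against that; the helper `to_utf8` and the sample `mydict` are assumptions made up for this example.

```python
# Hypothetical sample data with mixed value types
mydict = {"name": "café", "count": 3, "raw": b"abc"}


def to_utf8(value):
    """Return value as UTF-8 bytes; values that are already bytes pass through."""
    if isinstance(value, bytes):
        return value
    return str(value).encode("utf-8")


# Build a new dict from the existing one; mydict itself is not modified.
encoded = {k: to_utf8(v) for k, v in mydict.items()}
print(encoded)
```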

It depends why you're implicitly encoding to UTF-8. If it's because you're writing to a file, the pythonic way is to leave your strings as Unicode and encode on output:

import io

with io.open("myfile.txt", "w", encoding="UTF-8") as my_file:
    for key, value in row.items():
        # Build a Unicode string (one entry per line); io.open encodes it on write.
        my_string = u"{key}: {value}\n".format(key=key, value=value)
        my_file.write(my_string)
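
To see that the encoding really happens at the file boundary, here is a small self-contained sketch (written for Python 3) that writes a made-up `row` to a temporary file and then reads the raw bytes back. The sample data and the temporary path are assumptions for illustration only.

```python
import io
import os
import tempfile

row = {"city": "Zürich", "count": 2}  # hypothetical sample data

path = os.path.join(tempfile.mkdtemp(), "myfile.txt")

# The text stays as str in memory; io.open encodes it while writing.
with io.open(path, "w", encoding="UTF-8") as my_file:
    for key, value in row.items():
        my_file.write(u"{key}: {value}\n".format(key=key, value=value))

# Reading in binary mode shows the bytes that actually reached the disk.
with io.open(path, "rb") as raw_file:
    raw = raw_file.read()

print(raw)
```

The dictionary itself never holds encoded bytes here, so the rest of the program can keep working with ordinary text.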

If you prefer, you can iterate over just the keys:

{x:unicode(a[x]).encode("utf-8") for x in a.keys()}

One way to convert dictionary values to ASCII, silently dropping any non-ASCII characters (this assumes byte-string values in Python 2), is

mydict = {k: unicode(v, errors='ignore').encode('ascii','ignore') for k,v in mydict.iteritems()} 

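The same pattern with UTF-8 as the target encoding is shown next. Note that unicode(v, errors='ignore') decodes with Python 2's default codec (normally ASCII), so non-ASCII bytes in a byte-string value are dropped before the encode step: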

mydict = {k: unicode(v, errors='ignore').encode('utf-8','ignore') for k,v in mydict.iteritems()}
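
The effect of the error handlers is easier to see on a single string. The following is a minimal Python 3 sketch (the sample text is made up) comparing `'ignore'` and `'replace'` when encoding to ASCII with plain UTF-8 encoding, which needs no error handler because it can represent every character.

```python
text = "café"  # hypothetical sample value

ignored = text.encode("ascii", "ignore")    # unencodable characters are dropped
replaced = text.encode("ascii", "replace")  # unencodable characters become "?"
utf8 = text.encode("utf-8")                 # every character is kept

print(ignored, replaced, utf8)
```

Both ASCII variants lose information, so they are only a fit when dropping or masking characters is acceptable.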

For more details, see the Python Unicode documentation.


 