How to remove unicode characters from Dictionary data in python

After using the requests library, I am getting the dict below from response.json():

{u'xyz': {u'key1': None, u'key2': u'Value2'}}

I want to remove all the unicode markers and print only the key-value pairs, without the u prefixes.

I have tried the method below to remove them, but it reports a malformed string:

>>> import json, ast
>>> c = {u'xyz': {u'key1': None,u'key2': u'Value2'}}
>>> ast.literal_eval(json.dumps(c))

This raises 'ValueError: malformed string'.

Any suggestions on how to do it?

Change your None to 'None':

 c = {u'xyz': {u'key1': 'None', u'key2': u'Value2'}}

It is a conversion issue: json.dumps() writes None as null, which ast.literal_eval() cannot parse, whereas the string 'None' survives the round trip.

Also, maybe you want to change all None values to an empty string or to the string 'None'. See this thread: Python: most idiomatic way to convert None to empty string? With this code, I've changed the empty string to 'None':

def xstr(s):
    if s is None:
        return 'None'
    return str(s)
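
To apply a helper like xstr to every value in a nested dict such as the one in the question, one option is a small recursive walk. This is only a minimal sketch: the name replace_none and the default 'None' placeholder are illustrative assumptions, not part of the original answer.

```python
def replace_none(obj, placeholder='None'):
    """Return a copy of obj with every None value replaced by placeholder."""
    if obj is None:
        return placeholder
    if isinstance(obj, dict):
        # Rebuild the dict, recursing into each value; keys are left untouched.
        return dict((k, replace_none(v, placeholder)) for k, v in obj.items())
    if isinstance(obj, list):
        return [replace_none(v, placeholder) for v in obj]
    return obj  # strings, numbers, etc. pass through unchanged


c = {u'xyz': {u'key1': None, u'key2': u'Value2'}}
cleaned = replace_none(c)
```

The original dict is not modified; cleaned is a new structure in which key1 holds the string 'None'.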

This snippet will help you preserve the data without the unicode prefix notation u:

>>> import json
>>> c = {u'xyz': {u'key1': u'Value1',u'key2': u'Value2'}}
>>> print c
{u'xyz': {u'key2': u'Value2', u'key1': u'Value1'}}
>>> d = eval(json.dumps(c))
>>> print d
{'xyz': {'key2': 'Value2', 'key1': 'Value1'}}

json.dumps() will convert the dict to a string, and eval() will reverse it.

Note: the key1 value has been changed from None to 'Value1' for testing purposes.

You can use unicodestring.encode("ascii", "replace"):

>>> ustr=u'apple'
>>> ustr
u'apple'
>>> astr=ustr.encode("ascii","replace")
>>> astr
'apple'
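
The "replace" error handler only matters when the text contains non-ASCII characters. A short sketch with a made-up sample string (not from the question) shows what happens in that case:

```python
ustr = u'caf\xe9'  # hypothetical sample containing one non-ASCII character

# "replace" substitutes a question mark for each character ASCII cannot encode
replaced = ustr.encode("ascii", "replace")

# "ignore" drops such characters instead
ignored = ustr.encode("ascii", "ignore")
```

Either way the result is a bytestring and the original character is lost, so this approach is only safe when the data is known to be plain ASCII.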

I don't really understand why you want this. Your variable is a normal Python dict with normal Unicode strings, and they happen to be printed as u'' to distinguish them from bytestrings, but that shouldn't matter for using them.

If you want to save them as strings to read them as data later, JSON is a fine format for that. So there is no need to call the response's .json() method at all, just use the response's .text attribute; it's already JSON, after all.
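
If the goal is only to display the data without the u prefixes, printing a JSON rendering of the dict is one way to do it. A minimal sketch using the dict from the question; sort_keys and indent are optional json.dumps arguments chosen here just for stable, readable output:

```python
import json

c = {u'xyz': {u'key1': None, u'key2': u'Value2'}}

# json.dumps returns a string; None is written as JSON null
text = json.dumps(c, sort_keys=True)
print(text)

# indent=2 gives a multi-line, human-readable rendering
print(json.dumps(c, sort_keys=True, indent=2))
```

The dict itself is unchanged; only its printed form differs.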

Your attempt

>>> ast.literal_eval(json.dumps(c))

fails because you first turn c into JSON again, and then try to parse it as a Python literal. That doesn't work because Python isn't JSON; in particular, one has null and the other has None.
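
The mismatch is easy to see by inspecting the intermediate string. A minimal sketch using the dict from the question; json.loads is the parser that matches json.dumps, while ast.literal_eval only understands Python literals:

```python
import ast
import json

c = {u'xyz': {u'key1': None, u'key2': u'Value2'}}

dumped = json.dumps(c)  # JSON text: None has become null

try:
    ast.literal_eval(dumped)
    literal_eval_ok = True
except ValueError:
    # "null" is not a Python literal, so literal_eval rejects the string
    literal_eval_ok = False

# json.loads understands null and turns it back into None
round_tripped = json.loads(dumped)
```

On Python 2 the strings returned by json.loads are still unicode objects, so this round trip does not remove the u prefix; it only illustrates why the literal_eval attempt fails.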

So maybe you want to change the Unicode strings into bytestrings? Encoding them as UTF-8, for example, might work:

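def to_utf8(d):
    # Recursively encode unicode keys and values as UTF-8 bytestrings (Python 2)
    if isinstance(d, dict):
        result = {}
        for key, value in d.items():
            result[to_utf8(key)] = to_utf8(value)
        return result  # the dict branch must return the rebuilt dict
    elif isinstance(d, unicode):
        return d.encode('utf8')
    else:
        return d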

Or something like that, but I don't know why you would need it.
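
If the response also contains lists, the same idea extends with one more branch. This is a sketch, assuming only dicts, lists, and text strings need handling; the name encode_strings is hypothetical, and type(u'') is used so the snippet refers to unicode on Python 2 and str on Python 3:

```python
text_type = type(u'')  # unicode on Python 2, str on Python 3


def encode_strings(obj, encoding='utf8'):
    """Recursively encode every text string in obj to a bytestring."""
    if isinstance(obj, dict):
        return dict((encode_strings(k, encoding), encode_strings(v, encoding))
                    for k, v in obj.items())
    if isinstance(obj, list):
        return [encode_strings(item, encoding) for item in obj]
    if isinstance(obj, text_type):
        return obj.encode(encoding)
    return obj  # None, numbers, booleans, and bytestrings pass through


data = {u'xyz': {u'key1': None, u'key2': [u'Value2', 3]}}
encoded = encode_strings(data)
```

On Python 3 the resulting bytestrings print with a b'' prefix instead, which is usually less convenient than leaving the strings as they are.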
