ast.literal_eval converts a unicode code point from \uxxxx to \\uxxxx; how can I avoid this?

Question

For example, here is the code that parses this JSON input:

json.loads(u"\"{\\\"title\\\": \\\"\\\\u5927\\\"}\"")

json.loads converts it to a unicode string:

{"title": "\u5927"}

Here is the code that handles that unicode string:

ast.literal_eval(json.loads(u"\"{\\\"title\\\": \\\"\\\\u5927\\\"}\""))

ast.literal_eval converts it to a dictionary:

{'title': '\\u5927'}

But what I want is a dictionary with the following content:

{'title': '\u5927'}

Answer 1

json.loads("{\"title\": \"\\u5927\"}") returns a dictionary directly, so you don't need ast.literal_eval at all.

import json

d = json.loads("{\"title\": \"\\u5927\"}")

print d
{u'title': u'\u5927'}

type(d)
Out[2]: dict

For the full JSON-to-Python conversion performed by json.loads(), see the conversion table in the json module documentation.
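As a rough illustration of that conversion, the sketch below decodes one value of each JSON type and records the resulting Python type name. It is written for Python 3, where JSON strings decode to str rather than unicode; the sample document and its keys are made up for illustration.

```python
import json

# Made-up sample document with one value of each JSON type.
sample = ('{"obj": {}, "arr": [], "str": "\\u5927", '
          '"int": 1, "real": 1.5, "flag": true, "nothing": null}')

decoded = json.loads(sample)

# Map each key to the name of the Python type its value was decoded to.
types = {key: type(value).__name__ for key, value in decoded.items()}
print(types)
```

Note that the \u5927 escape in the JSON text is decoded into the actual character, which is the behaviour the question is asking for.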

If you're trying to parse a file, use json.load() without the s like this:

with open('your-file.json') as f:
    # you can change the encoding to the one you need
    print json.load(f, encoding='utf-8')

Test:

from io import StringIO

s = StringIO(u"{\"title\": \"\\u5927\"}")

print json.load(s)
{u'title': u'\u5927'}
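On Python 3 the same idea might look like the sketch below. It assumes Python 3.9 or later, where json.load() no longer accepts an encoding argument, so the encoding is passed to open() instead. The file name title.json and the temporary directory are placeholders used only to keep the example self-contained.

```python
import json
import os
import tempfile

# Write a small sample file so the example can run on its own.
path = os.path.join(tempfile.mkdtemp(), "title.json")
with open(path, "w", encoding="utf-8") as f:
    f.write('{"title": "\\u5927"}')

# The encoding is chosen when opening the file, not in json.load().
with open(path, encoding="utf-8") as f:
    loaded = json.load(f)

# Compare instead of printing the character, to avoid console encoding issues.
print(loaded == {"title": "\u5927"})
```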

Update

The OP has since changed the input that should be parsed. For the new input, here is another solution: parse the JSON twice.

json.loads(json.loads(u"\"{\\\"title\\\": \\\"\\\\u5927\\\"}\""))
Out[6]: {u'title': u'\u5927'}

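This works because the input is a JSON document whose top-level value is a string, and that string itself contains JSON text. The first json.loads returns the inner JSON text as a Python string; parsing that result again with json.loads deserializes it into a dictionary.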
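If the number of encoding layers is not known in advance, the double parse can be generalized. The following is a minimal Python 3 sketch, not part of the json module: loads_nested and its max_depth parameter are hypothetical names introduced here for illustration.

```python
import json


def loads_nested(text, max_depth=5):
    """Call json.loads repeatedly while the result is still a string.

    Hypothetical helper: stops once a non-string value is reached, when
    the remaining string is not valid JSON, or after max_depth rounds.
    """
    value = text
    for _ in range(max_depth):
        if not isinstance(value, str):
            break
        try:
            value = json.loads(value)
        except ValueError:
            # A plain string that is not JSON text: keep it as it is.
            break
    return value


# The doubly encoded input from the question.
raw = '"{\\"title\\": \\"\\\\u5927\\"}"'

inner = json.loads(raw)      # first pass: still a string holding JSON text
result = loads_nested(raw)   # decoded until a dictionary is reached
print(type(inner).__name__, type(result).__name__)
```

One caveat with this approach: a string value that happens to look like JSON (for example "123") is decoded one more time as well, so it is only appropriate when the data is known to be wrapped in extra layers.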

solution1
0 ACCEPTED 2015-04-20 02:33:36
