Unable to load JSON containing escape sequences

I'm being passed some JSON and am having trouble parsing it.

The object is currently simple, with a single key/value pair. The key works fine, but the value \d causes issues.

This is coming from an HTML form, via JavaScript. All of the below are literals.

  • HTML: \d
  • JavaScript: {'Key': '\d'}
  • JSON: {"Key": "\\d"}

json.loads() doesn't seem to like JSON in this format. A quick sanity check that I'm not doing anything silly works fine:

>>> import json
>>> json.loads('{"key":"value"}')
{'key': 'value'}

Since I'm declaring this string in Python, Python should escape it down to a literal of va\\lue, which, when parsed as JSON, should be va\lue.

>>> json.loads('{"key":"va\\\\lue"}')
{'key': 'va\\lue'}
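
One way to keep the two escaping layers apart, as a minimal sketch: a raw string literal (r'...') switches off Python's own backslash processing, so the text between the quotes is exactly the JSON document that json.loads() receives. The variable names are illustrative only.

```python
import json

# Raw literal: Python leaves the backslashes alone, so this *is* the JSON text.
json_text = r'{"key":"va\\lue"}'

# The same JSON text written as a normal literal needs every backslash doubled.
assert json_text == '{"key":"va\\\\lue"}'

parsed = json.loads(json_text)

# The decoded value holds a single backslash: v, a, \, l, u, e.
print(parsed["key"])       # prints the real characters, not the repr
print(len(parsed["key"]))  # counts the real characters
```

Printing the value (rather than echoing it at the >>> prompt) sidesteps the repr, which is what re-doubles the backslash in the session above.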

In case Python wasn't escaping the string on the way in, I thought I'd check without the doubling...

>>> json.loads('{"key":"va\\lue"}')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python33\lib\json\__init__.py", line 319, in loads
    return _default_decoder.decode(s)
  File "C:\Python33\lib\json\decoder.py", line 352, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Python33\lib\json\decoder.py", line 368, in raw_decode
    obj, end = self.scan_once(s, idx)
ValueError: Invalid \escape: line 1 column 11 (char 10)

but it fails, as expected.
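
The failure can be reproduced without relying on the traceback. This is a small sketch, assuming the only problem is that the text reaching the parser contains \l, which is not one of the escapes JSON defines. The name bad_text is made up for the example.

```python
import json

# JSON text with a lone backslash before "l" (raw literal, so no Python escaping).
bad_text = r'{"key":"va\lue"}'

error = None
try:
    json.loads(bad_text)
except ValueError as exc:
    # On Python 3.5+ this is json.JSONDecodeError, a subclass of ValueError.
    error = exc
    print("rejected:", exc)
```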

I can't see any way to parse a JSON field that should contain a single backslash after all the unescaping has taken place.

How can I get Python to deserialize this string literal {"a":"val\\ue"} (which is valid JSON) into the appropriate Python representation: {'a': 'val\ue'}?

As an aside, it doesn't help that PyDev is inconsistent about which representation of a string it uses. The watch window shows double backslashes, while the variable's tooltip shows quadruple backslashes. I assume that's the "if you were to type the string, this is what you'd have to use for it to escape to the original" representation, but it's by no means clear.
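
A quick way to check what a string really contains, independent of any debugger view, is to compare repr(), print() and len(). The sketch below also shows that applying repr() twice yields four backslashes. Whether PyDev's tooltip does exactly that is only a guess.

```python
value = 'val\\ue'          # source literal with a doubled backslash

print(repr(value))         # "as you would type it" form: backslash shown doubled
print(value)               # the actual characters: one backslash
print(len(value))          # number of actual characters

# repr of the repr doubles each backslash again.
print(repr(repr(value)))
```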

Edit to follow on from @twalberg's answer...

>>> input={'a':'val\ue'}
  File "<stdin>", line 1
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 3-5: truncated \uXXXX escape
>>> input={'a':'val\\ue'}
>>> input
{'a': 'val\\ue'}
>>> json.dumps(input)
'{"a": "val\\\\ue"}'
>>> json.loads(json.dumps(input))
{'a': 'val\\ue'}
>>> json.loads(json.dumps(input))['a']
'val\\ue'

Using json.dumps() to see how JSON would represent your target string:

>>> orig = { 'a' : 'val\ue' }
>>> jstring = json.dumps(orig)
>>> print jstring
{"a": "val\\ue"}
>>> extracted = json.loads(jstring)
>>> print extracted
{u'a': u'val\\ue'}
>>> print extracted['a']
val\ue
>>> 

This was in Python 2.7.3, though, so it may be only partially relevant to your Python 3.x environment. Still, I don't think JSON has changed that much...
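
For Python 3, a rough counterpart of the session above might look like the following. It is a sketch, not a tested port: the raw literal is used because 'val\ue' written as a normal Python 3 string literal is a syntax error (truncated \uXXXX escape), and print() is a function there.

```python
import json

orig = {"a": r"val\ue"}        # raw literal: one backslash, no \u escape handling
jstring = json.dumps(orig)
print(jstring)                 # the JSON text, backslash escaped once

extracted = json.loads(jstring)
print(extracted["a"])          # the decoded value, single backslash again

# The round trip returns the original data unchanged.
assert extracted == orig
```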
