Python write valid json with newlines to file
Valid json expects escaped newline characters to be encoded as '\\n', with two backslashes. I have data that contains newline characters that I want to save to a file. Here's a simplified version:
data = {'mystring': 'Line 1\nLine 2'}
I can encode it with json.dumps():
import json
json_data = json.dumps(data)
json_data
# -> '{"mystring": "Line 1\\nLine 2"}'
When I print it, the newline displays as '\n', not '\\n' (which I find odd but I can live with):
print(json_data)
# -> {"mystring": "Line 1\nLine 2"}
However (here's the problem), when I output it to a file, the content of the file no longer contains valid json:
f = open('mydata.json', 'w')
f.write(json_data)
f.close()
If I open the file and read it, it contains this:
{"mystring": "Line 1\nLine 2"}
but I was hoping for this:
{"mystring": "Line 1\\nLine 2"}
Oddly (I think), if I read the file using python's open(), the json data is considered valid:
f = open('mydata.json', 'r')
json_data = f.read()
f.close()
json_data
# -> '{"mystring": "Line 1\\nLine 2"}'
... and it decodes OK:
json.loads(json_data)
# -> {u'mystring': u'Line 1\nLine 2'}
My question is: why is the data in the file not valid json? If I need another (non-Python) application to read it, it would probably be incorrect. If I copy and paste the file contents and use json.loads() on it, it fails:
import json
json.loads('{"mystring": "Line 1\nLine 2"}')
# -> ValueError: Invalid control character at: line 1 column 21 (char 20)
Can anybody explain if this is the expected behaviour or am I doing something wrong?
You ran into the pitfall of neglecting the fact that the \ character is also an escape sequence character in Python. Try printing out the last example instead of calling json.loads:
>>> print('{"mystring": "Line 1\nLine 2"}')
{"mystring": "Line 1
Line 2"}
No way the above is valid JSON. What if the \ character is correctly encoded?
>>> print('{"mystring": "Line 1\\nLine 2"}')
{"mystring": "Line 1\nLine 2"}
Much better, you can then:
>>> json.loads('{"mystring": "Line 1\\nLine 2"}')
{'mystring': 'Line 1\nLine 2'}
Alternatively, if you really appreciate being able to copy some text from some other buffer and paste it into your live interpreter to decode it, you may consider using the r (raw) modifier for your string:
>>> print(r'{"mystring": "Line 1\nLine 2"}')
{"mystring": "Line 1\nLine 2"}
>>> json.loads(r'{"mystring": "Line 1\nLine 2"}')
{'mystring': 'Line 1\nLine 2'}
See that the \ is no longer automatically treated as an escape for the newline.
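A minimal sketch of the point above, comparing the three ways of typing the literal by what they contain rather than by how the interpreter displays them. The variable names are only illustrative.

```python
import json

# Three ways of typing "the same" text in Python source code.
plain = '{"mystring": "Line 1\nLine 2"}'      # \n becomes one real newline character
escaped = '{"mystring": "Line 1\\nLine 2"}'   # \\ becomes one backslash, followed by n
raw = r'{"mystring": "Line 1\nLine 2"}'       # raw literal: backslash and n kept as typed

# The escaped and raw literals are the same string;
# the plain one is one character shorter (newline vs. backslash + n).
print(escaped == raw)           # True
print(len(raw) - len(plain))    # 1

# Only the string holding a backslash followed by n is valid JSON.
print(json.loads(raw) == {'mystring': 'Line 1\nLine 2'})   # True
try:
    json.loads(plain)
except ValueError as exc:       # json.JSONDecodeError subclasses ValueError
    print('rejected:', exc)
```

Pasting file contents between plain quotes produces the `plain` case, which is why the copy-and-paste test in the question fails even though the file itself is fine.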
Also see: How do I handle newlines in JSON? and note how this is not a problem that exists strictly within Python.
The reason for this:
print(json_data)
# -> {"mystring": "Line 1\nLine 2"}
Is that \\ is a valid escape sequence that ends up as a single backslash \ when trying to print it.
The data in the json file is valid, as the parser is able to parse it :)
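One way to check this without relying on how anything is displayed is to inspect the text that actually lands in the file. This is only a sketch: it writes to a temporary directory (an assumption, so the working directory is left alone) and uses json.dump, which writes the same text that json.dumps returns.

```python
import json
import os
import tempfile

data = {'mystring': 'Line 1\nLine 2'}

# Write to a temporary location so the sketch does not touch the working directory.
path = os.path.join(tempfile.mkdtemp(), 'mydata.json')
with open(path, 'w') as f:
    json.dump(data, f)

# Read the raw text back without decoding it.
with open(path) as f:
    text = f.read()

print(text)                       # the text exactly as stored in the file
print('\n' in text)               # False: no real newline character was written
print('\\n' in text)              # True: a backslash followed by n, the JSON escape
print(json.loads(text) == data)   # True: decoding restores the newline
```

The file holds a backslash followed by the letter n, which is the form JSON requires for a newline inside a string, so a non-Python parser should accept it as well.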
The confusion stems from the fact that when you try to print a string with escape sequences those get interpreted. And the sequence \\n is interpreted as \n.
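A small sketch of the two displays side by side, assuming Python 3. The interactive prompt echoes repr() of a value, while print() and f.write() emit the string itself; the names shown_by_repl and shown_by_print are only illustrative.

```python
import json

json_data = json.dumps({'mystring': 'Line 1\nLine 2'})

shown_by_repl = repr(json_data)   # what the interactive prompt echoes
shown_by_print = str(json_data)   # what print() and f.write() emit

print(shown_by_repl)    # '{"mystring": "Line 1\\nLine 2"}'
print(shown_by_print)   # {"mystring": "Line 1\nLine 2"}

# The string itself holds exactly one backslash, whichever way it is displayed.
print(json_data.count('\\'))   # 1
```

The doubled backslash exists only in the repr() display, where it is Python's own escaping of the single backslash that the JSON text contains.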