Python 3 not reading a JSON file right
I have some JSON files created by PowerShell using the ConvertTo-Json command. The content of the JSON file looks like this:
{
"Key1": "Value1",
"Key2": "Value2"
}
I ran the Python interpreter to see if I could read the file, but I get this weird output:
>>> f=open('test.json', 'r')
>>> f.read()
'ÿ\xfe{\x00\n\x00\n\x00 \x00 \x00 \x00 \x00"\x00K\x00e\x00y\x001\x00"\x00:\x00 \x00 \x00"\x00V\x00a\x00l\x00u\x00e\x001\x00"\x00,\x00\n\x00\n\x00 \x00 \x00 \x00 \x00"\x00K\x00e\x00y\x002\x00"\x00:\x00 \x00 \x00"\x00V\x00a\x00l\x00u\x00e\x002\x00"\x00\n\x00\n\x00}\x00\n\x00\n\x00'
For some reason all the characters are escaped byte characters and there's the weird ÿ at the beginning (PowerShell error?).
The weird thing is this:
>>> f=open('test.json', 'r')
>>> str=f.read()
>>> type(str)
<class 'str'>
>>> json.loads(str)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\Rutvik_Choudhary\AppData\Local\Programs\Python\Python35-32\lib\json\__init__.py", line 319, in loads
    return _default_decoder.decode(s)
  File "C:\Users\Rutvik_Choudhary\AppData\Local\Programs\Python\Python35-32\lib\json\decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Users\Rutvik_Choudhary\AppData\Local\Programs\Python\Python35-32\lib\json\decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
So the input is a string, but the json module can't parse it (json.load(f) returns the same error). What is causing this error? Is it a Python thing, a PowerShell thing, a JSON thing?
As pointed out by jwodder, PowerShell has encoded your JSON using UTF-16LE. To parse this data correctly, you need to open the file using the correct encoding, e.g.:
import json

# "utf16" is an alias for the UTF-16 codec, which reads and strips the BOM
with open("test.json", "r", encoding="utf16") as f:
    json_string = f.read()
my_dict = json.loads(json_string)
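The same thing can probably be done in one step by handing the file object to json.load. The following is only a minimal sketch: the helper name load_utf16_json and the file name sample_utf16.json are made up for illustration, and the sample file is written by the script itself (BOM followed by UTF-16LE bytes, which is the layout of the file in the question) so that it runs on its own.

```python
import codecs
import json
import os
import tempfile


def load_utf16_json(path):
    """Parse a JSON file stored as UTF-16 with a leading BOM (hypothetical helper)."""
    # The "utf-16" codec reads the BOM, picks the byte order, and strips it.
    with open(path, "r", encoding="utf-16") as f:
        return json.load(f)


# Build a sample file with the same layout as the one in the question:
# a UTF-16LE byte order mark followed by UTF-16LE encoded text.
text = '{\n    "Key1": "Value1",\n    "Key2": "Value2"\n}\n'
sample_path = os.path.join(tempfile.mkdtemp(), "sample_utf16.json")
with open(sample_path, "wb") as f:
    f.write(codecs.BOM_UTF16_LE + text.encode("utf-16-le"))

data = load_utf16_json(sample_path)
print(data)
```

Passing the file object straight to json.load avoids keeping a separate string around, but the key point is unchanged: the encoding has to be given to open.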
You don't need to tell Python which variant of UTF-16 is being used. That is the purpose of the first two bytes of the text file, which are called a Byte Order Mark (BOM). The BOM lets a program know whether UTF-16LE or UTF-16BE has been used to encode the text file.
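To make the role of the BOM concrete, here is a small sketch that inspects the first two bytes by hand. The function name utf16_variant is hypothetical and the byte strings are built in the script rather than read from your file; in practice the "utf-16" codec does this check for you.

```python
import codecs


def utf16_variant(raw):
    """Return the UTF-16 variant signalled by a leading BOM, or None (illustrative helper)."""
    # Note: a UTF-32LE BOM also starts with FF FE, so this simple check
    # assumes the data is known to be some form of UTF-16.
    if raw.startswith(codecs.BOM_UTF16_LE):  # bytes FF FE
        return "utf-16-le"
    if raw.startswith(codecs.BOM_UTF16_BE):  # bytes FE FF
        return "utf-16-be"
    return None


text = '{"Key1": "Value1"}'
le_bytes = codecs.BOM_UTF16_LE + text.encode("utf-16-le")
be_bytes = codecs.BOM_UTF16_BE + text.encode("utf-16-be")

print(utf16_variant(le_bytes))
print(utf16_variant(be_bytes))

# The generic "utf-16" codec uses the BOM to decode either variant
# and removes it from the resulting string.
le_text = le_bytes.decode("utf-16")
be_text = be_bytes.decode("utf-16")
print(le_text == be_text == text)
```

The FF FE pair is also what shows up as the odd characters at the start of the output in the question, where the file was decoded with the wrong encoding.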
If you want to load text files with Unicode BOM headers, like yours, you can use the codecs.open function with an explicit encoding instead of open, because open called without an encoding argument does not interpret the BOM.
Or you can have a look at tendo.unicode, a small library that I wrote that can improve life for people who are not used to Unicode text.