Parse JSON file downloaded from Azure Data Lake
I downloaded a file from Azure Data Lake. It has the following format:
{"PartitionKey":"2020-10-05","value":"Resolved"...}
{"PartitionKey":"2020-10-06","value":"Resolved"...}
I just want to read and parse this in Python.

    import json

    def read_ods_file():
        file_path = 'temp.json'
        data = []
        with open(file_path) as f:
            for line in f:
                data.append(json.loads(line))
This gives me an exception:
        data.append(json.loads(line))
      File "C:\python3.6\lib\json\__init__.py", line 354, in loads
        return _default_decoder.decode(s)
      File "C:\python3.6\lib\json\decoder.py", line 339, in decode
        obj, end = self.raw_decode(s, idx=_w(s, 0).end())
      File "C:\python3.6\lib\json\decoder.py", line 357, in raw_decode
        raise JSONDecodeError("Expecting value", s, err.value) from None
    json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
Printing a line shows extra characters added at the start. What are these added characters?
{"PartitionKey":"2020-10-05","value":"Resolved"...}
{"PartitionKey":"2020-10-06","value":"Resolved"...}
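Microsoft tools often add unusual characters to files. You can try filtering each line against string.printable so that only the normal ASCII characters remain.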
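A minimal sketch of the string.printable idea follows. The helper names keep_printable and parse_json_lines, and the sample lines, are made up for illustration and are not from the original answer. The sample assumes the stray prefix is a byte order mark, which the question does not confirm. Note that this filter also removes any legitimate non-ASCII characters inside the JSON values, so it is only reasonable when the data is known to be plain ASCII.

```python
import json
import string

# Every character Python considers printable ASCII (letters, digits,
# punctuation, and whitespace).
PRINTABLE = set(string.printable)


def keep_printable(text):
    """Return text with every character outside string.printable removed."""
    return "".join(ch for ch in text if ch in PRINTABLE)


def parse_json_lines(lines):
    """Parse an iterable of JSON lines, dropping non-printable characters first."""
    records = []
    for line in lines:
        cleaned = keep_printable(line).strip()
        if cleaned:  # skip lines that are empty after cleaning
            records.append(json.loads(cleaned))
    return records


# Hypothetical sample: the first line starts with U+FEFF (a byte order mark),
# which is an assumption about what the stray characters are.
sample = [
    '\ufeff{"PartitionKey":"2020-10-05","value":"Resolved"}\n',
    '{"PartitionKey":"2020-10-06","value":"Resolved"}\n',
]
records = parse_json_lines(sample)
```

With a real file, the lines would come from the open file object instead of the sample list, for example parse_json_lines(f) inside the with block.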
The f variable that you set in

    with open(file_path) as f:

is a Python file object (of type _io.TextIOWrapper). If you want to read each line as a JSON object, try something like this:
    with open(file_path) as f:
        # read the file contents into a string
        # strip off trailing whitespace
        # split string into list of strings on \n character
        for line in f.read().strip().splitlines():
            data.append(json.loads(line))
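If the added characters turn out to be a UTF-8 byte order mark (BOM), which is an assumption here because the thread does not show them, another option is to let Python strip the mark while decoding. The sketch below uses the standard utf-8-sig codec, which reads UTF-8 files with or without a BOM. The function name read_json_lines and the temporary demo file are hypothetical.

```python
import json
import os
import tempfile


def read_json_lines(file_path, encoding="utf-8-sig"):
    """Read a file with one JSON object per line.

    The utf-8-sig codec silently drops a leading UTF-8 BOM if one is present.
    Using it here assumes the stray characters are a BOM.
    """
    records = []
    with open(file_path, encoding=encoding) as f:
        for line in f:
            line = line.strip()
            if line:  # ignore blank lines
                records.append(json.loads(line))
    return records


# Demo with made-up data: write a small file that starts with a UTF-8 BOM.
with tempfile.NamedTemporaryFile("wb", suffix=".json", delete=False) as tmp:
    tmp.write(b"\xef\xbb\xbf")  # the UTF-8 BOM bytes
    tmp.write(b'{"PartitionKey":"2020-10-05","value":"Resolved"}\n')
    tmp.write(b'{"PartitionKey":"2020-10-06","value":"Resolved"}\n')
    demo_path = tmp.name

demo_records = read_json_lines(demo_path)
os.remove(demo_path)
```

Unlike filtering on string.printable, this keeps any non-ASCII characters inside the JSON values and removes only the leading mark.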