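Parsing HTTP array response with python
I am trying to parse an HTTP response as JSON, but it gives me a character error, and when I try to iterate over the response with a for loop, it splits everything into single characters. Is there a better way to parse this response?
Code: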
_url = self.MAIN_URL
try:
    _request = self.__webSession.get(_url, cookies=self.__cookies)
    if _request.status_code != 200:
        self.log("Request failed with code: {}. URL: {}".format(_request.status_code, _url))
        return
except Exception as err:
    self.log("[e4] Web-request error: {}. URL: {}".format(err, _url))
    return
_text = _request.json()
json.loads() fails with the following error:
Expecting value: line 1 column 110 (char 109)
The HTTP response that needs to be parsed:
[
    [
        9266939,
        'Value1',
        'Value2',
        'Value3',
        ,
        'Value4',
        [
            [
                'number',
                'number2',
                [
                    'value',
                    ,
                    'value2'
                ]
            ]
        ]
    ],
    [
        5987798,
        'Value1',
        'Value2',
        ,
        'Value3',
        'Value4',
        [
            [
                'number',
                'number2',
                [
                    'value',
                    'value2'
                ]
            ]
        ]
    ]
]
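The error message is confusing because of its line and column numbers, but JSON never accepts single quotes around strings, so the given HTTP response is not valid JSON. Strings must use double quotes.
So the input would have to look like this instead (if you have control over it):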
[
    [
        9266939,
        "Value1",
        "Value2",
        "Value3",
        "Value4",
        [
            [
                "number",
                "number2",
                [
                    "value",
                    "value2"
                ]
            ]
...
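As a quick check of the point about quotes, here is a minimal sketch (the sample strings are made up for illustration) showing that `json.loads` rejects the single-quoted form and accepts the double-quoted one. It also prints the offset stored on `json.JSONDecodeError`, which can help when the line and column in the message are hard to interpret.

```python
import json

single_quoted = "[9266939, 'Value1', 'Value2']"   # illustrative sample, not valid JSON
double_quoted = '[9266939, "Value1", "Value2"]'   # the same data as valid JSON


def try_parse(text):
    """Return (data, None) on success or (None, error) on failure."""
    try:
        return json.loads(text), None
    except json.JSONDecodeError as err:
        return None, err


data, err = try_parse(single_quoted)
if err is not None:
    # err.pos is the 0-based offset at which the parser gave up.
    print("failed at offset", err.pos, "near", repr(single_quoted[err.pos:err.pos + 10]))

data_ok, err_ok = try_parse(double_quoted)
print(data_ok)
```

Slicing the raw text at `err.pos` shows the exact character the parser stopped on, which is usually more useful than counting columns by hand.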
If you have no control over the HTTP response you are parsing, you can replace all single quotes with double quotes before parsing:
import json

http_response_string = _request.text  # raw body of the HTTP response
adjusted_http_response_string = http_response_string.replace("'", '"')
data = json.loads(adjusted_http_response_string)
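Of course, this carries the risk of replacing single quotes (or apostrophes) that are not string delimiters. Most of the time, though, it solves the problem well enough.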
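To illustrate that risk, here is a minimal sketch with a made-up payload in which a blanket replace breaks a string containing an apostrophe. If the response happens to be a valid Python literal (an assumption; nothing in the question guarantees it), the standard library's `ast.literal_eval` is one alternative that understands both quote styles. It would not cope with the empty elements (`, ,`) seen in the response above.

```python
import ast
import json

payload = """[1, "it's fine", 'Value2']"""  # hypothetical response body

# The blanket replacement also rewrites the apostrophe inside "it's fine".
naive = payload.replace("'", '"')
try:
    json.loads(naive)
    naive_ok = True
except json.JSONDecodeError:
    naive_ok = False

# literal_eval parses Python literals only; it does not execute code.
parsed = ast.literal_eval(payload)
print(naive_ok, parsed)
```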
Edit:
Further cleanup, as requested in the comments:
import json
import re

http_response_string = _request.text  # raw body of the HTTP response
# More advanced replacement of ' with ", expecting
# strings to always come after at least four spaces,
# and always end in either comma, colon, or newline.
adjusted_http_response_string = re.sub(
    r"( {4})'", r'\1"',
    re.sub(r"'([,:\n])", r'"\1', http_response_string))
# Replace faulty ", ," sequences with a single ",".
adjusted_http_response_string = re.sub(
    r",(\s*,)*", ",", adjusted_http_response_string)
data = json.loads(adjusted_http_response_string)
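The cleanup steps above can be wrapped in a small helper and tried on a shortened version of the response from the question. This is a sketch under the same assumptions as the comments in the code (strings are preceded by at least four spaces and followed by a comma, colon, or newline); the function name `parse_loose_response` is made up for illustration.

```python
import json
import re


def parse_loose_response(text):
    """Normalise quotes and stray commas, then parse the result as JSON."""
    # Closing quotes: a single quote followed by a comma, colon, or newline.
    text = re.sub(r"'([,:\n])", r'"\1', text)
    # Opening quotes: a single quote preceded by four spaces.
    text = re.sub(r"( {4})'", r'\1"', text)
    # Collapse runs of commas separated only by whitespace into one comma.
    text = re.sub(r",(\s*,)*", ",", text)
    return json.loads(text)


# Shortened, indented version of the response shown in the question.
sample = """[
    [
        9266939,
        'Value1',
        ,
        'Value4',
        [
            'value',
            ,
            'value2'
        ]
    ]
]
"""

data = parse_loose_response(sample)
print(data)
```

Note that collapsing the commas drops the empty slots entirely, so the positions of later elements in each list shift. If position matters, the empty slots would need to be replaced with a placeholder such as `null` instead.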