Trying to decode an HTTP response in Python. Can't figure out JSON decoding

Here's the basic request:

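import urllib.request as urllib2  # assumption: Python 3 (f-strings, getheaders()), so urllib2 must be an alias

req = urllib2.Request(f"https://www.voter.ie/api/search/name/{name}/surname/{surname}/eircode/{eircode}/lang/en")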

req.add_header("Connection", "keep-alive")
req.add_header("Accept", "application/json, text/plain, */*")
req.add_header("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36 OPR/62.0.3331.99")
req.add_header("Accept-Encoding", "gzip, deflate, br")
req.add_header("Accept-Language", "en-US,en;q=0.9")

response = urllib2.urlopen(req)

Here are the headers. I can see from Content-Type that the body is JSON and the encoding is utf-8:

response.getheaders()

[('Transfer-Encoding', 'chunked'),
 ('Content-Type', 'application/json; charset=utf-8'),
 ('Content-Encoding', 'gzip'),
 ('Vary', 'Accept-Encoding'),
 ('Server', 'Kestrel'),
 ('Request-Context', 'appId=cid-v1:25017a8d-4490-471a-a8d0-e9e17860f987'),
 ('Strict-Transport-Security', 'max-age=2592000'),
 ('X-Content-Type-Options', 'nosniff'),
 ('Referrer-Policy', 'no-referrer'),
 ('X-XSS-Protection', '1; mode=block'),
 ('X-Frame-Options', 'Deny'),
 ('X-Powered-By', 'ASP.NET'),
 ('Date', 'Fri, 02 Aug 2019 14:45:33 GMT'),
 ('Connection', 'close')]

When I try to read or decode it I get many errors, but first, this is what it looks like. I haven't posted the full string as it's too long, but this is a sample:

response.read()

b'\x1f\x8b\x08\x00\x00\x00\x00\x00\x04\x00\xed\xbd\x07`\x1cI\x96%&/m\xca{\x7fJ\xf5J\xd7\xe0t\xa1\x08\x80`\x13$\xd8\x90@\x10\xec\xc1\x88\xcd\xe6\x92\xec\x1diG#)\xab*\x81\xcaeVe]f\x16@\xcc\xed\x9d\xbc\xf7\xde{\xef\xbd\xf7\xde{\xef\xbd\xf7\xba;\x9dN\'\xf7\xdf\xff?\\fd\x01l\xf6\xceJ\xda\xc9\x9e!\x80\xa

What I've tried, using methods I found here on Stack Overflow:

response.read().decode('utf-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte


raw_data = response.read()
json.loads(raw_data.decode('utf-8'))
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte


string = response.read().decode('utf-8')
json_obj = json.loads(string)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte

What am I doing wrong?

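As the response headers hint (Content-Encoding: gzip), the data has been compressed with gzip. The leading bytes \x1f\x8b in your sample are the gzip magic number, which is why the UTF-8 decoder fails on byte 0x8b at position 1. You need to decompress the body before doing anything else.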

import gzip
import json

gz = response.read()                  # raw gzip-compressed bytes
j = gzip.decompress(gz)               # decompressed bytes, still UTF-8 encoded
data = json.loads(j.decode('utf-8'))  # parsed JSON
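If the server does not always compress its replies, it may help to decompress only when needed. The following is a minimal sketch, not part of the original answer: the helper name decode_json_body and the sample payload are hypothetical, and it handles gzip only.

```python
import gzip
import json


def decode_json_body(body: bytes, content_encoding: str = "") -> object:
    """Decompress the body if it is gzip-encoded, then parse it as UTF-8 JSON.

    Hypothetical helper for illustration; it handles gzip only.
    """
    # gzip streams always start with the two magic bytes 0x1f 0x8b,
    # which is why decoding the raw body as UTF-8 fails at position 1.
    if content_encoding.lower() == "gzip" or body[:2] == b"\x1f\x8b":
        body = gzip.decompress(body)
    return json.loads(body.decode("utf-8"))


# Offline demonstration with a made-up payload (not real data from the API).
payload = {"name": "example", "matches": 1}
compressed = gzip.compress(json.dumps(payload).encode("utf-8"))

print(compressed[:2])                         # the gzip magic number
print(decode_json_body(compressed, "gzip"))   # compressed body is decompressed first
print(decode_json_body(b'{"ok": true}'))      # uncompressed body passes straight through
```

With the response from the question, this could be called as decode_json_body(response.read(), response.getheader("Content-Encoding", "")). Note that the request advertises "gzip, deflate, br" in Accept-Encoding; the Python standard library has no Brotli decoder, so it is probably safer to send only "gzip" there, or to leave the header out so the server is less likely to compress the body at all.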
