
Can't correctly decode JSON from a URL in Python

I want to read some data from a JSON URL. However, my code does not give me a JSON structure; instead I get an undecoded string.

I've also tried reading the data directly from the URL with pandas read_json, but I get this error message: UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 136430: character maps to <undefined>.

This is the main code I'm using:

from urllib.request import urlopen
import json

response = urlopen("https://data.nasa.gov/resource/y77d-th95.json")
json_data = response.read().decode('utf8','replace')

And this is what I get in json_data:

'\x1f�\x08\x00\x00\x00\x00\x00\x00\x00̽K��ȕ&��_ልR����\n
i�MOfd*3+#��RI5]�F��:�:��%Ct�\x1b�ѨE�\x06�3jP��l\x06�\x02\n�A�\n
\x02��ƌ��<f4�c��Ũ^�P����^�����__u�S��o^}Y���\x15y�n��Gտ������\x0f�T��
\x1f�WCs\x7f��\x0f\x07�Gor��?��=�\x7f�m�߫\x7f��F�\x1f�ꥩ\x07�G�l�Q��\x7f�e
\x7f3��&˲��}}T\x7f%�6e�O\x7f�w\x0f�O�M&9��\x0f\x1f�~���Ƕ�^��
\x7f}u�I?�mwT��}�\x0f۶����%��"F�J��?���B\x00�aw:\x18�\x0cE�]1!,Yv\x13b�S
\x0cb��\x06�\x04�f\x1b�\x130\x1a9b�:0�ƀ,P��|\'&�4+Ͽ�\x16P�\x01\x15
\x1bF�����)��o���\x12�Z�\x18�\x0e����i\x7f�_wl�b5"��\x01�+*n#.
\x0b\x041-6r����TI�� 1J]Ļv��\x1b��8�7p\x0b���f�ʮ9ߨE�-m!6U�\x02t\x14$W�
\x0e���]��\x1f\'�ղ�,\x18�n��\x15���\r5�7�-�F�,��#Z�\x0b�î]\x7f�?l��
\x17�c�5���"��\r_��h`y\x05\x06X���m\\�Ӛ/\x01l�Q�\x00\x7f�\x1e\x1a^E\\���Y
膒T�@=7T�)h\x02\u038b\x18\x11�\x1b��To�\t�\\t\\i\x11zr8���9�\x14�-
�.�G�*HF�3���^}����+`AK\x1c0�+D\x00/��f\x1b鿟���)���E�
\x03��Oݪٯ�<\x0e��p��յ�b�����Wo���\x13|zC\x0fo�mk�b�\r�\x15�3��ˌ���Y�
\x18�\x0e����0!\x16��5\x13A�,ǀV�a8\x01�\x1b�b-^ĈQ�.�Ь\x0f�a���o^�
\x1a�(�?v�]�����,�\x0bY\x19��|;�\rO9�\x171b�:8\x1f���2�f���2�\u03a2�
\x0eS\x1bm~\x1b�|����Q�\x18�.Ȼzxw\x02�\x16��,�X��\x1d.q��\x1chY�\x19O�
\x1c1:]����aZO\x1e������̱���j�RnDʦ����N\x17���~�P����Cߑ��\x7f�rB�
\x07��Jb~\x1d�|x\x05��\x14k��\x11�ց�_��n�kH�\x07���ٺ�M\x11Z�\x12��b#<�
\x15\x1c�E��"G��\x19�\x7f���q7�0\'`P����,˰�U\x1ekQd�;��f,\x12\x0e�(G�N
\x17�������{:�,��p����r��\x1d�\x16|��\x11��ũn��xu�E���7�y�\x06\\�,
\x04�,"�\x16��$�t��l`#GF�3��p�.�<�H\x02k:Z\x18�o�\x14j\x01\x1b�M�Ģ�{�#l
\x06�e�l����ǿ\x0eu7!���7��A��OO��\x11,n�\x02�\x15�O�|#
\x19u�k��(O�"j1b��ط���p���""\\\x1ay5)[\x023�s�)=!\'��Blg9bԺ`�w�3[��\x04�
\x0e9\x0b!-9��\x16��\n^�rCS\x0c\n#G��\x19�á�C;��\x02��`p���m�
.
.
.

Any idea what I'm doing wrong?

Calling json.loads directly on the result of read(), decoded with the default encoding (UTF-8), gives me a valid list.

Please try this:

from urllib.request import urlopen
import json

response = urlopen("https://data.nasa.gov/resource/y77d-th95.json")
json_data = json.loads(response.read().decode())

print(json_data)
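
If the call above still returns unreadable bytes, one possible cause is compression. The dump in the question starts with '\x1f', an undecodable byte, and then '\x08', which is consistent with the gzip header bytes 1f 8b 08. That suggests the server may have sent a gzip-compressed body that was decoded as text without being decompressed first. The following is a minimal sketch under that assumption; the names parse_json_body and GZIP_MAGIC are hypothetical helpers introduced for illustration, not part of urllib or json.

```python
import gzip
import json

# Every gzip stream starts with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def parse_json_body(raw: bytes, encoding: str = "utf-8"):
    """Parse a raw HTTP body as JSON, decompressing it first if it is gzip.

    Hypothetical helper: assumes the body is either plain JSON text or a
    single gzip stream wrapping JSON text in the given encoding.
    """
    if raw[:2] == GZIP_MAGIC:
        raw = gzip.decompress(raw)
    return json.loads(raw.decode(encoding))


# Usage with the URL from the question (needs network access):
# from urllib.request import urlopen
# with urlopen("https://data.nasa.gov/resource/y77d-th95.json") as response:
#     json_data = parse_json_body(response.read())
```

Checking the leading bytes rather than relying only on response headers keeps the helper usable on bytes saved from an earlier request.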
