I am trying to write a Python script that takes a list of URLs one at a time, fetches the response content of each URL, and stores it as JSON files.
Here is what I wrote initially to get the response for a URL:
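import requests

def download_json(link):
    # link is passed in as an argument here; the original snippet used it without defining it
    # The id is quoted because a bare 00163E0BD0C1FA89 is not a valid Python literal
    params = {
        'id': '00163E0BD0C1FA89',
        'list': '141',
        'queue': 'gen',
        'type': 'abc_stat',
    }
    req_obj = requests.get(link, params=params)
    print(req_obj.url)          # final URL including the query string
    print(req_obj.status_code)  # HTTP status returned by the server
    return req_obj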
It builds the right URL: when I paste that URL directly into a browser, the output is shown in JSON format. Here is one row of the JSON output I see in the browser:
{
"DATA" : [
{
"SCHEMA" : "abc_4_QAATu2.",
"ID" : "QAATu2",
"IM_ID" : "22faba86_c9e0_4dbc",
"S_NUMBER" : "502379284",
"CONFIG_TYPE" : "las_home_type",
"CONFIG_KEY" : "las_home_key",
"CONFIG_LONG_V" : "1",
"CONFIG_STRING_V" : "https://abc-deg/development",
"MODIFIED_DATE" : "Unknown"
},
So this shows that the data is returned in JSON format when I enter the URL in the browser directly.
However, my response object has these headers:
Out[26]:
{'content-length': '15457', 'expires': '0', 'content-encoding': 'gzip', 'cache-control': 'no-cache, no-store, private', 'set-cookie': 'login-XSRF_RZA=2018051-axJnifQUpOnrS8WCFI; path=/abc/deo/cpo; secure; HttpOnly, usercontext=client=002; path=/', 'content-type': 'text/html; charset=utf-8', 'pragma': 'no-cache, no-store, private'}
Now when I call req_obj.json() to get the data as a Python object, I get the following error:
JSONDecodeError Traceback (most recent call last)
<ipython-input-28-4cfc1a694fcf> in <module>()
----> 1 req_obj.json()
/Users/anaconda/envs/dl/lib/python3.5/site-packages/requests/models.py in json(self, **kwargs)
890 # used.
891 pass
--> 892 return complexjson.loads(self.text, **kwargs)
893
894 @property
/Users/anaconda/envs/dl/lib/python3.5/json/__init__.py in loads(s, encoding, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
317 parse_int is None and parse_float is None and
318 parse_constant is None and object_pairs_hook is None and not kw):
--> 319 return _default_decoder.decode(s)
320 if cls is None:
321 cls = JSONDecoder
/Users/anaconda/envs/dl/lib/python3.5/json/decoder.py in decode(self, s, _w)
337
338 """
--> 339 obj, end = self.raw_decode(s, idx=_w(s, 0).end())
340 end = _w(s, end).end()
341 if end != len(s):
/Users/anaconda/envs/dl/lib/python3.5/json/decoder.py in raw_decode(self, s, idx)
355 obj, end = self.scan_once(s, idx)
356 except StopIteration as err:
--> 357 raise JSONDecodeError("Expecting value", s, err.value) from None
358 return obj, end
JSONDecodeError: Expecting value: line 2 column 1 (char 2)
EDIT:
As you can see in the headers above, the content-type is reported as text/html even though the browser shows JSON output.
However, when I evaluate
req_obj.json
<bound method Response.json of <Response [200]>>
I get the bound method, but calling req_obj.json() gives the error above.
Any idea why it cannot convert the data to JSON when the output is actually JSON, as shown above? Thanks
According to the documentation:
In case the JSON decoding fails, r.json() raises an exception. For example, if the response gets a 204 (No Content), or if the response contains invalid JSON, attempting r.json() raises ValueError: No JSON object could be decoded.
Although your error message is not the same, the cause appears to be the same: you are probably not getting JSON back, which would explain why the decoder raises JSONDecodeError. The text/html content-type in your response headers points the same way.
You should be able to confirm this by printing req_obj.text instead of calling req_obj.json().
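As a rough sketch of that check, the helper below summarises a response so that a non-JSON body (an HTML page, for example) is easy to tell apart from JSON. The name describe_response and the preview_chars parameter are made up for illustration; the function only relies on the status_code, headers and text attributes that a requests response provides.

```python
import json


def describe_response(resp, preview_chars=200):
    """Summarise a response-like object so a non-JSON body is easy to spot.

    Hypothetical helper: resp is anything with status_code, headers and text,
    such as the object returned by requests.get().
    """
    body = resp.text
    try:
        json.loads(body)
        is_json = True
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        is_json = False
    return {
        "status_code": resp.status_code,
        "content_type": resp.headers.get("content-type", ""),
        "is_json": is_json,
        # The first characters are usually enough to recognise HTML
        "preview": body[:preview_chars],
    }
```

With the object from the question, print(describe_response(req_obj)) would show the reported content type next to the first characters of the body.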
As for how to fix it, I suspect that there must be something different between the request you're making using the browser and the one you're making using Python (such as different parameters).
I suggest you read this to further investigate the source of the problem.
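One way to look for such a difference is to compare the request headers the browser sent (copied by hand from its developer tools) with the ones requests sent, which are available as req_obj.request.headers. The sketch below is only an illustration: diff_headers is a hypothetical helper, and the header values in the usage note are invented.

```python
def diff_headers(browser_headers, python_headers):
    """Compare two header mappings, ignoring the case of header names.

    Returns (missing, different):
      missing   - headers the browser sent that the Python request did not
      different - headers both sent, as (browser_value, python_value) pairs
    """
    browser = {k.lower(): v for k, v in browser_headers.items()}
    python = {k.lower(): v for k, v in python_headers.items()}
    missing = {k: v for k, v in browser.items() if k not in python}
    different = {
        k: (browser[k], python[k])
        for k in browser
        if k in python and browser[k] != python[k]
    }
    return missing, different
```

For example, diff_headers({"Cookie": "usercontext=client=002"}, dict(req_obj.request.headers)) would report the cookie as missing if the script never sent one. A missing cookie or a different Accept header is a plausible reason for a server to answer with HTML instead of JSON, though that is an assumption to verify against the real service.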
According to this document: http://docs.python-requests.org/en/master/
You could check req_obj.status_code and req_obj.headers['content-type']. If the status code is 200 and the content type is 'application/json; charset=utf8', then you can try calling req_obj.json().
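A minimal sketch of that guard is shown below. The function name parse_json_if_declared is hypothetical, and it decodes resp.text with the standard json module, which for a real requests response should give the same result as resp.json() in the ordinary case.

```python
import json


def parse_json_if_declared(resp):
    """Return the decoded body only when the response claims to be JSON.

    Hypothetical helper: returns None when the status is not 200, the
    content type is not application/json, or the body fails to decode.
    """
    content_type = resp.headers.get("content-type", "")
    # Drop parameters such as "; charset=utf8" before comparing
    media_type = content_type.split(";")[0].strip().lower()
    if resp.status_code != 200 or media_type != "application/json":
        return None
    try:
        return json.loads(resp.text)
    except ValueError:  # covers json.JSONDecodeError
        return None
```

For the headers shown in the question (text/html; charset=utf-8), this returns None instead of raising, which makes it easier to skip or log a bad URL when looping over many of them.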