
ValueError: No JSON object could be decoded using Python

I have valid JSON that I'm unable to read using Python. I get this error:

ValueError: No JSON object could be decoded

The code is as follows:

import json, requests

page = "http://www.zillow.com/search/GetResults.htm?spt=homes&status=110001&lt=001000&ht=111111&pr=,&mp=,&bd=2%2C&ba=0%2C&sf=,&lot=,&yr=,&pho=0&pets=0&parking=0&laundry=0&income-restricted=0&pnd=0&red=0&zso=0&days=any&ds=all&pmf=1&pf=1&zoom=3&rect=-134340820,16594081,-56469727,54952386&p=1&sort=globalrelevanceex&search=maplist&disp=1&listright=true&isMapSearch=true&zoom=3"
response = requests.get(page) # request the json file
json_response =  json.loads(response.text) # parse the json file

When I open the URL in the browser, I can see the JSON properly, and it validates on this website: http://codebeautify.org/jsonviewer. What's the issue here?

When I run print response.text, I get the following output:

u'<html><head><title>Zillow: Real Estate, Apartments, Mortgage &amp; Home Values in the US</title><meta http-equiv="X-UA-Compatible" content="IE=8, IE=9"/><meta name="ROBOTS" content="NOINDEX, NOFOLLOW"/><link href="//fonts.googleapis.com/css?family=Open+Sans:400&subset=latin" rel="stylesheet" type="text/css"/><link href="http://www.zillowstatic.com/vstatic/9520695/static/css/z-pages/captcha.css" type="text/css" rel="stylesheet" media="screen"/><script language="javascript">\n            function onReCaptchaLoad() {\n                window.reCaptchaLoaded = true;\n            }\n\n            window.setTimeout(function () {\n                if (!window.reCaptchaLoaded) {\n                   document.getElementById(\'norecaptcha\').value = true;\n                   document.getElementById(\'captcha-form\').submit();\n                }\n            }, 5000);\n        </script></head><body><main class="zsg-layout-content"><div class="error-content-block"><div class="error-text-content"><!-- <h1>Captcha</h1> --><h5>Enter the characters in the images to continue.</h5><div id="content" class="captcha-container"><form method="POST" action="" id="captcha-form"><script type="text/javascript">\r\nvar RecaptchaOptions = {"theme":"white","lang":"en-US"};\r\n</script>\r\n<script type="text/javascript" src="http://api.recaptcha.net/challenge?k=6Lf2nvMSAAAAAMQ5p6WlAfDEixMdOQgJsij-3_ud" onload="onReCaptchaLoad()"></script>\r\n<br/><input id="dest" name="dest" type="hidden" value="ognl:originalDestination"/><input id="norecaptcha" name="norecaptcha" type="hidden" value="false"/><button type="submit" class="zsg-button zsg-button_primary">Submit</button></form><img src="http://www.zillowstatic.com/static/logos/logo-65x14.png" width="65" alt="Zillow" height="14"></img></div></div></div></main></body></html><!-- H:049  T:0ms  S:1494  R:Thu May 26 23:12:41 PDT 2016  B:5.0.29554-release_20160512-lunar_lander.6d4c099~candidate.d23c8e0 -->'

So it seems that I'm not getting JSON from the server, even though the same link opens as JSON in the browser (Chrome).
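One way to confirm this from code rather than by eye is to check what came back before handing the body to json.loads. The sketch below is only an illustration: describe_body is a hypothetical helper (not part of requests), and it assumes that an HTML page either declares "html" in its Content-Type header or starts with a "<" character.

```python
import json


def describe_body(content_type, body):
    """Classify a response body as 'json', 'html', or 'unknown'.

    content_type: value of the Content-Type header (may be None).
    body: the response text.
    """
    declared = (content_type or "").lower()
    stripped = body.lstrip()
    # An HTML page (such as a captcha form) starts with a tag, not '{' or '['.
    if "html" in declared or stripped.startswith("<"):
        return "html"
    try:
        json.loads(body)
    except ValueError:  # json.JSONDecodeError subclasses ValueError on Python 3
        return "unknown"
    return "json"
```

With requests, this could be called as describe_body(response.headers.get("Content-Type"), response.text). For the output shown above, which begins with <html>, it would report 'html'.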

If you are using the requests library, you can let it decode the JSON for you:

import requests as req

page = "http://www.zillow.com/search/GetResults.htm?spt=homes&status=110001&lt=001000&ht=111111&pr=,&mp=,&bd=2%2C&ba=0%2C&sf=,&lot=,&yr=,&pho=0&pets=0&parking=0&laundry=0&income-restricted=0&pnd=0&red=0&zso=0&days=any&ds=all&pmf=1&pf=1&zoom=3&rect=-134340820,16594081,-56469727,54952386&p=1&sort=globalrelevanceex&search=maplist&disp=1&listright=true&isMapSearch=true&zoom=3"

json_response = req.get(page).json()

print type(json_response)

>> <type 'dict'>

Here you go!

EDIT: Death-Stalker is right: you hit the website too many times, which is why you're getting a captcha page instead of the page you requested. The code itself is fine; what I showed is just a way of simplifying it. Unless you query the website from another IP address, I can't see a solution.
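If the block really is triggered by request volume, spacing requests out and stopping as soon as a captcha page appears may at least avoid making things worse. This is a rough sketch under that assumption, and it will not lift a block that is already in place. The names fetch_politely, fetch and looks_blocked are hypothetical, and the 5-second delay is a guess, since nothing here says what rate the site tolerates.

```python
import time


def fetch_politely(urls, fetch, looks_blocked, delay_seconds=5.0, sleep=time.sleep):
    """Fetch urls one by one, pausing between requests.

    fetch(url) -> body text; looks_blocked(body) -> bool.
    Stops at the first blocked response and returns the bodies collected
    so far together with the URL that was blocked (or None if none was).
    """
    bodies = []
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)  # space the requests out
        body = fetch(url)
        if looks_blocked(body):
            return bodies, url  # give up instead of hammering the site
        bodies.append(body)
    return bodies, None
```

With requests, fetch could be lambda url: req.get(url).text, and looks_blocked could be something as simple as lambda body: body.lstrip().startswith("<").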

