Downloading images from Google Search using Python gives error?
Here is my code:
import os
import sys
import time
from urllib import FancyURLopener
import urllib2
import simplejson

# Define search term
searchTerm = "parrot"

# Replace spaces ' ' in search term for '%20' in order to comply with request
searchTerm = searchTerm.replace(' ','%20')

# Start FancyURLopener with defined version
class MyOpener(FancyURLopener):
    version = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; it; rv:1.8.1.11)Gecko/20071127 Firefox/2.0.0.11'
myopener = MyOpener()

# Set count to 0
count= 0

for i in range(0,10):
    # Notice that the start changes for each iteration in order to request a new set of images for each loop
    url = ('https://ajax.googleapis.com/ajax/services/search/images?' + 'v=1.0&q='+searchTerm+'&start='+str(i*10)+'&userip=MyIP')
    print url
    request = urllib2.Request(url, None, {'Referer': 'testing'})
    response = urllib2.urlopen(request)

    # Get results using JSON
    results = simplejson.load(response)
    data = results['responseData']
    dataInfo = data['results']

    # Iterate for each result and get unescaped url
    for myUrl in dataInfo:
        count = count + 1
        my_url = myUrl['unescapedUrl']
        myopener.retrieve(myUrl['unescapedUrl'],str(count)+'.jpg')
But after downloading some images I get the following error:
Traceback (most recent call last):
  File "C:\Python27\img_google3.py", line 37, in <module>
    dataInfo = data['results']
TypeError: 'NoneType' object has no attribute '__getitem__'
What could be causing this?
I have to download images from Google as part of training neural networks for image classification.
The error message tells you that results['responseData'] is None. You need to look at what you actually get in results (e.g. print(results)) to figure out how to access the data you want.
I get the following when your error occurs:
{u'responseData': None, # hence the error
u'responseDetails': u'out of range start', # what went wrong
u'responseStatus': 400} # http response code for "Bad request"
Eventually you load a URL (e.g. https://ajax.googleapis.com/ajax/services/search/images?v=1.0&q=parrot&start=90&userip=MyIP) where the search results simply don't go that high. I get sensible content in results for lower numbers: ...&start=0&...
You need to check whether you get anything back, e.g.:
if results["responseStatus"] == 200:
    # response was OK, do your thing
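To make that check concrete, here is a minimal sketch of a helper that returns the image URLs from one parsed response, or an empty list when the request failed. It is written in Python 3 syntax (the question uses Python 2), the function name extract_image_urls is hypothetical, and the "successful" sample payload is made up for illustration; only the failing payload matches the response shown above.

```python
def extract_image_urls(results):
    """Return the image URLs from one parsed response, or [] if the request failed."""
    # A failed request carries a non-200 status and responseData set to None.
    if results.get("responseStatus") != 200 or results.get("responseData") is None:
        return []
    return [item["unescapedUrl"] for item in results["responseData"]["results"]]


# The failing payload mirrors the response shown above.
failed = {
    "responseData": None,
    "responseDetails": "out of range start",
    "responseStatus": 400,
}

# Made-up successful payload, for illustration only.
ok = {
    "responseData": {"results": [{"unescapedUrl": "http://example.com/parrot.jpg"}]},
    "responseDetails": None,
    "responseStatus": 200,
}

print(extract_image_urls(failed))
print(extract_image_urls(ok))
```

In the download loop you could break out as soon as the helper returns an empty list, since later start values are unlikely to succeed either.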
Also, you could make your URL-building code simpler and avoid the string concatenation:
template = 'https://ajax.googleapis.com/ajax/services/search/images?v=1.0&q={}&start={}&userip=MyIP'
url = template.format(searchTerm, str(i * 10))
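Another option, sketched here in Python 3 as an assumption rather than as part of the original answer, is to let urllib.parse.urlencode build the query string. It escapes the search term for you (spaces become "+"), so you pass the raw term instead of replacing spaces with '%20' by hand. The names BASE_URL and build_search_url are hypothetical.

```python
from urllib.parse import urlencode

BASE_URL = "https://ajax.googleapis.com/ajax/services/search/images"


def build_search_url(search_term, page, page_size=10):
    """Build the request URL for one page of results."""
    params = {
        "v": "1.0",
        "q": search_term,            # raw term; urlencode handles the escaping
        "start": page * page_size,   # offset of the first result on this page
        "userip": "MyIP",
    }
    return BASE_URL + "?" + urlencode(params)


print(build_search_url("grey parrot", 2))
```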