
Why couldn't I download images from Google with Python?

The code below helped me download a bunch of images from Google. It used to work a few days back, and now all of a sudden it breaks.

Code:

# importing google_images_download module 
from google_images_download import google_images_download  

# creating object 
response = google_images_download.googleimagesdownload()  

search_queries = ['Apple', 'Orange', 'Grapes', 'water melon'] 


def downloadimages(query): 
    # keywords is the search query 
    # format is the image file format 
    # limit is the number of images to be downloaded 
    # print_urls is to print the image file URL 
    # size is the image size which can 
    # be specified manually ("large, medium, icon") 
    # aspect ratio denotes the height width ratio 
    # of images to download. ("tall, square, wide, panoramic") 
    arguments = {"keywords": query, 
                 "format": "jpg", 
                 "limit":4, 
                 "print_urls":True, 
                 "size": "medium", 
                 "aspect_ratio": "panoramic"} 
    try: 
        response.download(arguments) 

    # Handling File NotFound Error     
    except FileNotFoundError:  
        arguments = {"keywords": query, 
                     "format": "jpg", 
                     "limit":4, 
                     "print_urls":True,  
                     "size": "medium"} 

        # Providing arguments for the searched query 
        try: 
            # Downloading the photos based 
            # on the given arguments 
            response.download(arguments)  
        except: 
            pass

# Driver Code 
for query in search_queries: 
    downloadimages(query)  
    print()

Output log:

Item no.: 1 --> Item name = Apple Evaluating... Starting Download...

Unfortunately all 4 could not be downloaded because some images were not downloadable. 0 is all we got for this search filter!

Errors: 0

Item no.: 1 --> Item name = Orange Evaluating... Starting Download...

Unfortunately all 4 could not be downloaded because some images were not downloadable. 0 is all we got for this search filter!

Errors: 0

Item no.: 1 --> Item name = Grapes Evaluating... Starting Download...

Unfortunately all 4 could not be downloaded because some images were not downloadable. 0 is all we got for this search filter!

Errors: 0

Item no.: 1 --> Item name = water melon Evaluating... Starting Download...

Unfortunately all 4 could not be downloaded because some images were not downloadable. 0 is all we got for this search filter!

Errors: 0

This actually creates a folder, but there are no images in it.

The google_images_download project no longer seems to be compatible with the Google APIs.

As an alternative, you can try simple_image_download.
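For reference, here is a rough sketch of how simple_image_download is typically wired in for the same four queries; the import path and the download(keywords, limit) call follow the project's README at the time of writing and are assumptions to verify against the version you actually install.

# Rough sketch only: the import path and the download(keywords, limit)
# signature are taken from simple_image_download's README and may differ
# in newer releases of the package.
from simple_image_download import simple_image_download as simp

response = simp.simple_image_download()

for query in ['Apple', 'Orange', 'Grapes', 'water melon']:
    # Download up to 4 images per query into the package's default folder
    response.download(query, 4)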

It looks like there is an issue with the package itself. See these open PRs: PR1 and PR2.

I think Google has changed the DOM. The element class="rg_meta notranslate" no longer exists; it has been changed to class="rg_i ...".


import sys
import urllib.request

import wget
from bs4 import BeautifulSoup


def get_soup(url, header):
    # Fetch the results page and parse it with BeautifulSoup
    request = urllib.request.Request(url, headers=header)
    return BeautifulSoup(urllib.request.urlopen(request), 'html.parser')


def main(args):
    query = "typical face"
    query = '+'.join(query.split())
    url = "https://www.google.co.in/search?q=" + query + "&source=lnms&tbm=isch"
    headers = {'User-Agent': "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36"}
    soup = get_soup(url, headers)
    # Thumbnails now carry class "rg_i"; their URL and id live in the
    # data-iurl / data-iid attributes
    for a in soup.find_all("img", {"class": "rg_i"}):
        wget.download(a.attrs["data-iurl"], a.attrs["data-iid"])


if __name__ == '__main__':
    try:
        main(sys.argv)
    except KeyboardInterrupt:
        pass
    sys.exit()

Indeed, this issue appeared not long ago, and there are already a number of similar GitHub issues about it.

Unfortunately, there is no official solution for now; you could use the temporary workaround that was provided in those discussions.

The reason this doesn't work is that Google changed the way they do everything, so you now need an api_key included in the search string. As a result, packages such as google-images-download no longer work even if you use version 2.8.0, because they have no placeholder to insert the api_key string, which you must register with Google to get your 2,500 free downloads per day.

If you are willing to pay $50 per month or more for a service from serpapi.com, one way to do this is to use the pip package google-search-results and provide your api_key as part of the query params.

params = {
           "engine" : "google",
           ...
           "api_key" : "secret_api_key" 
}

where you provide your own API key, and then call:

from serpapi import GoogleSearchResults

client = GoogleSearchResults(params)
results = client.get_dict()

This returns a JSON response containing the links to all the image URLs, and you can then download them directly.
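As an illustration of that last step, here is a minimal sketch; the "images_results" list and its "original" field are assumptions based on the response format SerpApi documents for Google Images results, so check the actual JSON you get back.

import os
import urllib.request

# Hypothetical helper: save every original-size image URL found in the
# response to a local folder. Assumes `results` is the dict returned by
# client.get_dict() above and that it holds an "images_results" list
# whose entries carry an "original" URL (verify against your payload).
def save_images(results, out_dir="downloads"):
    os.makedirs(out_dir, exist_ok=True)
    for i, item in enumerate(results.get("images_results", [])):
        url = item.get("original")
        if not url:
            continue
        path = os.path.join(out_dir, f"image_{i}.jpg")
        try:
            urllib.request.urlretrieve(url, path)
        except Exception as exc:
            print(f"Skipping {url}: {exc}")

save_images(results)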
