
Why can't I download images from Google with Python?

This code used to download a bunch of images from Google for me. It worked a few days ago, and now all of a sudden it breaks.

Code:

# importing google_images_download module 
from google_images_download import google_images_download  

# creating object 
response = google_images_download.googleimagesdownload()  

search_queries = ['Apple', 'Orange', 'Grapes', 'water melon'] 


def downloadimages(query): 
    # keywords is the search query 
    # format is the image file format 
    # limit is the number of images to download 
    # print_urls prints the URL of each image file 
    # size is the image size, which can 
    # be specified manually ("large, medium, icon") 
    # aspect_ratio is the height-to-width ratio 
    # of the images to download ("tall, square, wide, panoramic") 
    arguments = {"keywords": query, 
                 "format": "jpg", 
                 "limit":4, 
                 "print_urls":True, 
                 "size": "medium", 
                 "aspect_ratio": "panoramic"} 
    try: 
        response.download(arguments) 

    # Handling File NotFound Error     
    except FileNotFoundError:  
        arguments = {"keywords": query, 
                     "format": "jpg", 
                     "limit":4, 
                     "print_urls":True,  
                     "size": "medium"} 

        # Providing arguments for the searched query 
        try: 
            # Downloading the photos based 
            # on the given arguments 
            response.download(arguments)  
        except: 
            pass

# Driver Code 
for query in search_queries: 
    downloadimages(query)  
    print()

Output log:

Item no.: 1 --> Item name = Apple Evaluating... Starting Download...

Unfortunately all 4 could not be downloaded because some images were not downloadable. 0 is all we got for this search filter!

Errors: 0

Item no.: 1 --> Item name = Orange Evaluating... Starting Download...

Unfortunately all 4 could not be downloaded because some images were not downloadable. 0 is all we got for this search filter!

Errors: 0

Item no.: 1 --> Item name = Grapes Evaluating... Starting Download...

Unfortunately all 4 could not be downloaded because some images were not downloadable. 0 is all we got for this search filter!

Errors: 0

Item no.: 1 --> Item name = water melon Evaluating... Starting Download...

Unfortunately all 4 could not be downloaded because some images were not downloadable. 0 is all we got for this search filter!

Errors: 0

This actually creates a folder, but there are no images in it.

The google_images_download project no longer seems to be compatible with the Google APIs.

As an alternative, you can try simple_image_download.

It looks like there is an issue with the package. See these open PRs: PR1 and PR2.

I think Google is changing the DOM. The element with class="rg_meta notranslate" no longer exists. It has been changed to class="rg_i ...".


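import sys
import urllib.request

import wget                    # third-party: pip install wget
from bs4 import BeautifulSoup  # third-party: pip install beautifulsoup4


def get_soup(url, header):
    # Fetch the page with a browser-like User-Agent and parse it.
    request = urllib.request.Request(url, headers=header)
    return BeautifulSoup(urllib.request.urlopen(request), 'html.parser')


def main(args):
    query = "typical face"
    # Join the words with '+' so they form a valid search query string.
    query = '+'.join(query.split())
    url = "https://www.google.co.in/search?q=" + query + "&source=lnms&tbm=isch"
    headers = {
        'User-Agent': "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36"
    }
    soup = get_soup(url, headers)
    # The image elements now carry the class "rg_i" instead of "rg_meta".
    for a in soup.find_all("img", {"class": "rg_i"}):
        wget.download(a.attrs["data-iurl"], a.attrs["data-iid"])


if __name__ == '__main__':
    try:
        main(sys.argv)
    except KeyboardInterrupt:
        pass
    sys.exit()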
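
To illustrate why a class rename like this can make a scraper return zero images without raising any error, here is a minimal offline sketch that uses only the standard library. The HTML fragment is made up for illustration (only the class names and data attributes come from the answer above), and `ImgCollector` and `find_by_class` are hypothetical helper names.

```python
from html.parser import HTMLParser


class ImgCollector(HTMLParser):
    """Collect the attributes of every tag whose class list contains wanted_class."""

    def __init__(self, wanted_class):
        super().__init__()
        self.wanted_class = wanted_class
        self.matches = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        # The class attribute may be missing or valueless, so default to "".
        classes = (attr_map.get("class") or "").split()
        if self.wanted_class in classes:
            self.matches.append(attr_map)


def find_by_class(html, wanted_class):
    parser = ImgCollector(wanted_class)
    parser.feed(html)
    return parser.matches


# Hypothetical page fragment shaped like the newer markup described above.
SAMPLE_HTML = """
<div>
  <img class="rg_i other" data-iid="img1" data-iurl="https://example.com/a.jpg">
  <img class="rg_i other" data-iid="img2" data-iurl="https://example.com/b.jpg">
</div>
"""

old_hits = find_by_class(SAMPLE_HTML, "rg_meta")  # class the older scrapers looked for
new_hits = find_by_class(SAMPLE_HTML, "rg_i")     # class used in the snippet above

print(len(old_hits), "matches for rg_meta")
print(len(new_hits), "matches for rg_i")
print([hit["data-iurl"] for hit in new_hits])
```

A scraper that still searches for the old class simply finds nothing to download, which would be consistent with the "Errors: 0" lines in the question's log.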

Indeed, the issue appeared not long ago, and there are already a bunch of similar GitHub issues:

Unfortunately, there is no official solution yet. For now, you could use the temporary workaround that was provided in those discussions.

This no longer works because Google changed how image search is served, so you now need an api_key included in the search string. As a result, packages such as google-images-download no longer work, even with version 2.8.0, because they have no placeholder for the api_key string. You must register with Google to get that key, which gives you 2500 free downloads per day.

If you are willing to pay $50 per month or more for the service from serpapi.com, one way to do this is to use the pip package google-search-results and provide your api_key as part of the query params.

params = {
    "engine": "google",
    # ...
    "api_key": "secret_api_key"
}

where you provide your API key yourself and then call:

client = GoogleSearchResults(params)
results = client.get_dict()

This returns the results as a dictionary parsed from the JSON response, which contains the links to all the image URLs. You then just download them directly.
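
As a rough sketch of that last step, the helpers below pull image URLs out of a results dictionary and save them using only the standard library. The key names `images_results` and `original` are assumptions about the response shape, so check them against what `client.get_dict()` actually returns for your query. `extract_image_urls`, `filename_for`, and `download_images` are hypothetical helper names.

```python
import os
import urllib.request
from urllib.parse import urlparse


def extract_image_urls(results, limit=None):
    """Return image URLs from a results dict.

    Assumes (unverified) a shape like {"images_results": [{"original": url}, ...]}.
    """
    urls = [
        item["original"]
        for item in results.get("images_results", [])
        if isinstance(item, dict) and item.get("original")
    ]
    return urls if limit is None else urls[:limit]


def filename_for(url, index):
    """Derive a local file name from the URL path, else fall back to a numbered name."""
    name = os.path.basename(urlparse(url).path)
    return name or f"image_{index}.jpg"


def download_images(urls, dest_dir="downloads"):
    """Download each URL into dest_dir, skipping the ones that fail."""
    os.makedirs(dest_dir, exist_ok=True)
    saved = []
    for index, url in enumerate(urls):
        target = os.path.join(dest_dir, filename_for(url, index))
        try:
            urllib.request.urlretrieve(url, target)
        except OSError as exc:  # URLError and HTTPError are subclasses of OSError
            print(f"skipped {url}: {exc}")
            continue
        saved.append(target)
    return saved


# Offline demonstration with a made-up response; nothing is downloaded here.
sample_results = {
    "images_results": [
        {"original": "https://example.com/pics/apple.jpg"},
        {"thumbnail": "https://example.com/thumb.jpg"},  # no "original" key: ignored
        {"original": "https://example.com/"},
    ]
}
urls = extract_image_urls(sample_results)
print(urls)
print([filename_for(u, i) for i, u in enumerate(urls)])
```

With a real response you would call something like `download_images(extract_image_urls(results, limit=4))`. Unlike the bare `except: pass` in the question, the download loop prints each failure, so an empty folder is easier to diagnose.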
