
Google image download with python cannot download images

I'm using the google_images_download library to download the top 20 images for a keyword. It worked perfectly when I used it over the past few days. The code is as follows.

from google_images_download import google_images_download

response = google_images_download.googleimagesdownload()

keyword = "cats"  # example search term
arguments = {"keywords": keyword, "limit": 10, "print_urls": True}
paths = response.download(arguments)

Now it gives the following error.

Evaluating...
Starting Download...


Unfortunately all 10 could not be downloaded because some images were not downloadable. 0 is all we got for this search filter!

Errors: 0

How can I solve this error?

There have been some changes on Google's end (in how they respond to the request), which cause this issue. Joeclinton1 on GitHub has made some modifications to the original repo that provide a temporary fix.

You can find the updated repo here: https://github.com/Joeclinton1/google-images-download.git. The solution is in the patch-1 branch, if I'm not mistaken.

  1. First uninstall the current version of google_images_download.

  2. Then manually install Joeclinton1's repo by:

git clone https://github.com/Joeclinton1/google-images-download.git
cd google-images-download && sudo python setup.py install  # no need for 'sudo' on Windows or in an Anaconda environment

or install it with pip:

pip install git+https://github.com/Joeclinton1/google-images-download.git

This should solve the problem. Note that this repo currently only supports up to 100 images.
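After installing the fork, the original snippet should work unchanged, since the fork keeps the same import path and API. A minimal sketch (the keyword and limit below are just example values):

from google_images_download import google_images_download

# Same usage as the original package; only the response parsing was patched in the fork.
response = google_images_download.googleimagesdownload()
arguments = {"keywords": "cats", "limit": 10, "print_urls": True}  # example values
paths = response.download(arguments)
print(paths)  # downloaded file paths (exact return format depends on the version)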

I faced the same issue with google-images-download, which used to work perfectly! I have an alternative to suggest that should solve the problem.

Solution: Instead of using google-images-download for Python, use bing-image-downloader, which downloads from the Bing search engine.

Steps:

Step 1: Install the library by using: pip install bing-image-downloader

Step 2:

from bing_image_downloader import downloader

query_string = "cats"  # example search term
downloader.download(query_string, limit=100, output_dir='dataset',
                    adult_filter_off=True, force_replace=False, timeout=60)

That's it! All you need to do is put your image topic in query_string.

Note:

Parameters that you can further tweak:

query_string : String to be searched.

limit : (optional, default is 100) Number of images to download.

output_dir : (optional, default is 'dataset') Name of output dir.

adult_filter_off : (optional, default is True) Enable or disable adult filtering.

force_replace : (optional, default is False) Delete folder if present and start a fresh download.

timeout : (optional, default is 60) timeout for connection in seconds.

Further Reference: https://pypi.org/project/bing-image-downloader/
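For reference, here is a short sketch that exercises a few of these parameters; the query strings and limit are example values, and the loop is only an illustration:

from bing_image_downloader import downloader

# Download a small batch for each example topic into ./dataset/<query_string>/.
for query_string in ["muscle cars", "vintage trucks"]:  # example queries
    downloader.download(query_string, limit=20, output_dir='dataset',
                        adult_filter_off=True, force_replace=False, timeout=60)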

Another easy way to download any number of images:

pip install simple_image_download

from simple_image_download import simple_image_download as simp

response = simp.simple_image_download
response().download(a, b)

where a = the string of the subject you want to download, and b = the number of images you want to download.
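A minimal sketch with example values filled in (the keyword 'cats' and the count 20 are just placeholders; as far as I can tell, the library saves results under a local simple_images/ folder):

from simple_image_download import simple_image_download as simp

response = simp.simple_image_download
# Download 20 images for the example keyword "cats".
response().download('cats', 20)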

If you want to download fewer than 100 images per query string, google-images-download will work better than bing-image-downloader. It handles errors better, and Google Images generally gives better results than the Bing equivalent.

However, if you're trying to download more than 100 images, google-images-download will give you a lot of headaches. As mentioned in this answer, Google changed things on their end, and because of this the repo is having a lot of failures (more info on the status of the situation here).

So, if you want to download thousands of images, use bing-image-downloader :

Install the package from pip:

pip install bing-image-downloader

Run the query.

NOTE: The documentation seems to be incorrect, as it returns a "No module found" error when importing the package as from bing_image_downloader import downloader (as mentioned in this answer). Import and use it like this:

from bing_image_downloader.downloader import download

query_string = 'muscle cars'

download(query_string, limit=1000,  output_dir='dataset', adult_filter_off=True, force_replace=False, timeout=60, verbose=True)
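To sanity-check the result, you could count the files afterwards. A small sketch, assuming the images land under output_dir/query_string (which is how the library appears to organize downloads):

from pathlib import Path

# Count the files in the per-query folder (assumed layout: dataset/<query_string>/).
downloaded = list(Path('dataset', query_string).glob('*'))
print(f"Downloaded {len(downloaded)} files for '{query_string}'")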
