
How do I scrape images from a list of URLs and download them to a local folder?

I am iterating over a list of image URLs.

import requests
import urllib.request

list_url = ["www.abc.com/def.jpg", " www.abc.com/def1.jpg",... "www.abc.com/def100000.jpg"]

correct_img_list = []

for img in list_url:

    request = requests.get(img)
    if request.status_code == 200:
        correct_img_list.append(img)
        continue
    else:
        continue
i = 1 

for img in correct_img_list:
    urllib.request.urlretrieve(img, "local_image file_" + str(i))
    i += 1

I want it to go through the list, fetch each URL, and download the image to a local directory.

The error I see is this:

Traceback (most recent call last):
  File "c:/Users/liuh9/Desktop/google chrome folder/image-downloader.py", line 714, in <module>
    urllib.request.urlretrieve(img, "local_image file " + str(i))
  File "C:\Users\liuh9\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 247, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "C:\Users\liuh9\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Users\liuh9\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 531, in open
    response = meth(req, response)
  File "C:\Users\liuh9\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 641, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Users\liuh9\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 569, in error
    return self._call_chain(*args)
  File "C:\Users\liuh9\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 503, in _call_chain
    result = func(*args)
  File "C:\Users\liuh9\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found

Thanks a lot!

The response code you are getting is 404, which means the server could not find anything at that URL.
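To make that status code visible in a script, here is a minimal sketch of how the code and reason can be read from the exception that `urllib` raises. The helper name `explain_failure` is hypothetical, and the `HTTPError` is constructed by hand so the snippet runs without any network access.

```python
import io
import urllib.error


def explain_failure(exc):
    """Return a short description of a download failure."""
    # HTTPError is a subclass of URLError, so it has to be checked first.
    if isinstance(exc, urllib.error.HTTPError):
        return f"HTTP {exc.code}: {exc.reason}"
    if isinstance(exc, urllib.error.URLError):
        return f"could not reach the server: {exc.reason}"
    return f"unexpected error: {exc!r}"


# Hand-built error standing in for the one shown in the traceback.
error = urllib.error.HTTPError(
    "https://www.abc.com/def.jpg", 404, "Not Found", {}, io.BytesIO()
)
print(explain_failure(error))  # prints "HTTP 404: Not Found"
```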

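As a good practice, make sure you check the response code and carry on only when it is 200. I would also suggest wrapping the operation in a try/except block so that the program does not crash on the first invalid URL.

See the example below: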

import urllib.request

url_list = [
    "https://cdn.pixabay.com/photo/2016/09/07/11/37/tropical-1651426__340.jpg",
    "https://image.shutterstock.com/image-photo/long-exposure-soft-colorful-sunset-260nw-142504771.jpg",
]
filename = 1

for url in url_list:
    try:
        # urlretrieve raises HTTPError (for example 404) when the request fails.
        urllib.request.urlretrieve(url, f"{filename}.jpg")
        filename += 1
    except Exception as exc:
        # Report the failure and continue with the next URL instead of crashing.
        print(f"Exception occurred while downloading image from url {url}: {exc}")
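Building on the loop above, the following is a rough sketch of one way to save the images into a dedicated folder while recording which URLs failed. It rests on a few assumptions that are not in the original posts: the helper names `normalize_url`, `fetch_bytes` and `download_images` are hypothetical, URLs without a scheme (such as `www.abc.com/def.jpg` in the question) are assumed to be reachable over `https`, and every file is saved with a `.jpg` extension.

```python
import os
import urllib.error
import urllib.request


def normalize_url(url, default_scheme="https"):
    """Strip surrounding whitespace and prepend a scheme if none is present."""
    url = url.strip()
    if "://" not in url:
        # Assumption: scheme-less URLs are served over https.
        url = f"{default_scheme}://{url}"
    return url


def fetch_bytes(url, timeout=10):
    """Return the response body; urlopen raises HTTPError for 4xx/5xx replies."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def download_images(urls, folder, fetch=fetch_bytes):
    """Save each reachable image into `folder` and return (saved, failed)."""
    os.makedirs(folder, exist_ok=True)
    saved, failed = [], []
    for index, raw_url in enumerate(urls, start=1):
        url = normalize_url(raw_url)
        try:
            data = fetch(url)
        except (OSError, ValueError) as exc:
            # URLError and HTTPError are OSError subclasses; ValueError covers
            # malformed URLs. Record the failure and keep going.
            failed.append((url, str(exc)))
            continue
        # The index follows the position in the input list, so a failed URL
        # leaves a gap in the numbering.
        path = os.path.join(folder, f"image_{index}.jpg")
        with open(path, "wb") as file_handle:
            file_handle.write(data)
        saved.append(path)
    return saved, failed
```

Calling `download_images(list_url, "images")` would then return the saved paths together with a list of `(url, error message)` pairs for the URLs that could not be fetched. The `fetch` parameter exists only so that the download step can be swapped out, for example with a stub when trying the function offline.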

