Modify URL parameter to download images from multiple websites

Question

I am trying to download the images for every case listed in the CaseIDs array, but it doesn't work. I want the code to run for all cases.

from bs4 import BeautifulSoup
import requests as rq
from urllib.parse import urljoin
from tqdm import tqdm

CaseIDs = [100237, 99817, 100271]

with rq.session() as s:
    for caseid in tqdm(CaseIDs):
        url = 'https://crashviewer.nhtsa.dot.gov/nass-CIREN/CaseForm.aspx?xsl=main.xsl&CaseID= {caseid}'
        r = s.get(url)
        soup = BeautifulSoup(r.text, "html.parser")

        url = urljoin(url, soup.find('a', text='Text and Images Only')['href'])
        r = s.get(url)
        soup = BeautifulSoup(r.text, "html.parser")

        links = [urljoin(url, i['src']) for i in soup.select('img[src^="GetBinary.aspx"]')]

        count = 0
        for link in links:
            content = s.get(link).content
            with open("test_image" + str(count) + ".jpg", 'wb') as f:
                f.write(content)
            count += 1

Answer 1

You need to use an f-string to pass your caseid value in, as you're trying to do:

url = f'https://crashviewer.nhtsa.dot.gov/nass-CIREN/CaseForm.aspx?xsl=main.xsl&CaseID= {caseid}'

(You probably also need to remove the space between the = and the {, since otherwise the space becomes part of the CaseID value in the URL.)
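To see why the original line never substituted the ID, here is a minimal sketch comparing a plain string literal with an f-string. The shortened `CaseID=` fragment is only for illustration and is not the full URL from the question.

```python
caseid = 100237

# A plain string literal keeps the braces as literal text.
plain = 'CaseID={caseid}'

# The f prefix makes Python evaluate the expression inside the braces.
formatted = f'CaseID={caseid}'

print(plain)      # CaseID={caseid}
print(formatted)  # CaseID=100237
```

Without the f prefix, the server receives the literal text {caseid} instead of a number, so every iteration of the loop requests the same invalid case.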

Answer 2

Try using format() like this:

url = 'https://crashviewer.nhtsa.dot.gov/nass-CIREN/CaseForm.aspx?xsl=main.xsl&CaseID={}'.format(caseid)
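As a possible alternative to formatting the string by hand, the query string can be assembled with `urllib.parse.urlencode` from the standard library. The sketch below also addresses a second issue in the question's script: the file names (`test_image0.jpg`, `test_image1.jpg`, ...) do not include the case ID, so each case would overwrite the images saved for the previous one. The helper names `build_case_url` and `image_filename`, and the file name pattern, are hypothetical and not part of the original code.

```python
from urllib.parse import urlencode

BASE_URL = 'https://crashviewer.nhtsa.dot.gov/nass-CIREN/CaseForm.aspx'


def build_case_url(caseid):
    """Return the case page URL for one case ID (hypothetical helper)."""
    # urlencode builds "xsl=main.xsl&CaseID=<id>" and escapes values if needed.
    query = urlencode({'xsl': 'main.xsl', 'CaseID': caseid})
    return f'{BASE_URL}?{query}'


def image_filename(caseid, index):
    """Return a file name that is unique per case and per image (assumed pattern)."""
    return f'case_{caseid}_image_{index}.jpg'


CaseIDs = [100237, 99817, 100271]
for caseid in CaseIDs:
    # Show the URL and the first file name that would be used for each case.
    print(build_case_url(caseid), image_filename(caseid, 0))
```

In the question's loop, `build_case_url(caseid)` would replace the hard-coded url line, and `image_filename(caseid, count)` would replace the `"test_image" + str(count) + ".jpg"` expression.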

2 answers

Answer 1
2 · 2020-01-20 08:57:53

Answer 2
2 · Accepted · 2020-01-20 09:00:51
