Python script with BeautifulSoup downloads empty zip files

Question

I am trying to bulk-download hydrological measurement data using Python. Unfortunately, every zip file I download is empty.

Here is my code, based on some YouTube tutorials:

from bs4 import BeautifulSoup
import requests

domain = "https://danepubliczne.imgw.pl/"
URL = 'https://danepubliczne.imgw.pl/data/dane_pomiarowo_obserwacyjne/dane_hydrologiczne/dobowe/2021/'
filetype = '.zip'


def get_soup(url):
    return BeautifulSoup(requests.get(url).text, 'html.parser')


for link in get_soup(URL).find_all('a'):
    zip_link = link.get('href')
    if filetype in zip_link:
        print(zip_link)
        with open(link.text, 'wb') as file:
            response = requests.get(domain + zip_link)
            file.write(response.content)

Answer 1

It looks like the issue is here:

response = requests.get(domain + zip_link)

The domain variable is the base URL of the website, but the links on the page are relative to the subdirectory. If you add

print(domain + zip_link)

you can see that you get something like https://danepubliczne.imgw.pl/codz_2021_01.zip, which is missing the directory path.
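The same check can be done offline. The sketch below assumes the href is the bare file name codz_2021_01.zip, as the printed URL suggests, and compares plain concatenation with the standard library's urllib.parse.urljoin, which resolves a relative link against the page it was found on:

```python
from urllib.parse import urljoin

domain = "https://danepubliczne.imgw.pl/"
URL = "https://danepubliczne.imgw.pl/data/dane_pomiarowo_obserwacyjne/dane_hydrologiczne/dobowe/2021/"
zip_link = "codz_2021_01.zip"  # assumed href value, taken from the printed URL above

wrong = domain + zip_link          # what the original script requests
resolved = urljoin(URL, zip_link)  # relative link resolved against the listing page

print(wrong)
print(resolved)
```

The second URL keeps the full directory path, so it points at the actual archive rather than at the site root.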

It looks like what you want is response = requests.get(URL + zip_link), since URL already ends with the directory the links are relative to.
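Putting it together, one possible version of the whole script is sketched below. The helper names find_zip_urls, download and main are made up for this example, and the timeout value is arbitrary. It uses urljoin instead of string concatenation, skips anchors that have no href, and calls raise_for_status() so that an error response is not silently written to disk as if it were the archive:

```python
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

URL = "https://danepubliczne.imgw.pl/data/dane_pomiarowo_obserwacyjne/dane_hydrologiczne/dobowe/2021/"
FILETYPE = ".zip"


def find_zip_urls(html, page_url):
    """Return absolute URLs for every link in the HTML that ends with .zip."""
    soup = BeautifulSoup(html, "html.parser")
    urls = []
    for link in soup.find_all("a", href=True):  # ignore anchors without an href
        href = link["href"]
        if href.endswith(FILETYPE):
            # Resolve the relative href against the page it came from
            urls.append(urljoin(page_url, href))
    return urls


def download(url):
    """Save one file into the current directory and return its file name."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()  # stop on HTTP errors instead of saving the error body
    filename = url.rsplit("/", 1)[-1]
    with open(filename, "wb") as file:
        file.write(response.content)
    return filename


def main():
    """Fetch the listing page and download every zip file linked from it."""
    listing = requests.get(URL, timeout=30)
    listing.raise_for_status()
    for zip_url in find_zip_urls(listing.text, URL):
        print(download(zip_url))
```

Calling main() runs the download. Keeping the link extraction in its own function means it can be tried on a saved copy of the page without making any requests.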

Solution 1
1 2023-01-15 00:29:03
