
Python script with BeautifulSoup downloading empty zip files

I am trying to bulk-download hydrological measurement data using Python. Unfortunately, every zip file I download comes out empty.

Here is my code, based on some YouTube tutorials:

from bs4 import BeautifulSoup
import requests

domain = "https://danepubliczne.imgw.pl/"
URL = 'https://danepubliczne.imgw.pl/data/dane_pomiarowo_obserwacyjne/dane_hydrologiczne/dobowe/2021/'
filetype = '.zip'


def get_soup(url):
    return BeautifulSoup(requests.get(url).text, 'html.parser')


for link in get_soup(URL).find_all('a'):
    zip_link = link.get('href')
    if filetype in zip_link:
        print(zip_link)
        with open(link.text, 'wb') as file:
            response = requests.get(domain + zip_link)
            file.write(response.content)

It looks like the issue is here:

response = requests.get(domain + zip_link)

The domain variable is the base URL of the website, but the links are relative to the subdirectory. If you add

print(domain + zip_link)

you can see that you get something like https://danepubliczne.imgw.pl/codz_2021_01.zip, which is missing the directory part of the path.

It looks like what you want is response = requests.get(URL + zip_link), since URL already includes the directory that the links are relative to.
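
As an alternative to concatenating strings, the standard library's urllib.parse.urljoin resolves an href against the page it was found on, so relative and absolute links are both handled. The following is only a minimal sketch: the helper name resolve_zip_links is made up for illustration, and the href list is a hand-written stand-in for what find_all('a') would return.

```python
from urllib.parse import urljoin

# Same listing page as in the question.
page_url = (
    "https://danepubliczne.imgw.pl/data/dane_pomiarowo_obserwacyjne/"
    "dane_hydrologiczne/dobowe/2021/"
)


def resolve_zip_links(base_url, hrefs, suffix=".zip"):
    """Return absolute URLs for every href ending in `suffix`.

    Hypothetical helper: hrefs are resolved relative to base_url,
    the way a browser would resolve them. Missing hrefs (None) are skipped.
    """
    return [
        urljoin(base_url, href)
        for href in hrefs
        if href and href.endswith(suffix)
    ]


# Stand-in for [link.get('href') for link in soup.find_all('a')].
sample_hrefs = ["../", "codz_2021_01.zip", None]

for url in resolve_zip_links(page_url, sample_hrefs):
    print(url)
```

Each resulting URL can then be passed to requests.get as in the original loop. Calling response.raise_for_status() before writing the file is one way to notice a wrong URL early instead of saving whatever the server sent back.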
