How to download .css files with bs4 in Python

Question

Hello, I am trying to make a scraper that saves all .css files from a web page in a folder, but when I run my script I get this error:

with open(shit, 'wb') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'https://url.com/cache/themes/theme1/index.min.css'

And here is my code:

from bs4 import BeautifulSoup
import requests
import os

proxies = {
    'http': 'socks5h://127.0.0.1:9050',
    'https': 'socks5h://127.0.0.1:9050'
}

url = "https://url.com"
folder = "Files"
resp = requests.get(url, proxies=proxies)
soup = BeautifulSoup(resp.text, features='lxml')

def Downloader(url, folder):
    os.mkdir(os.path.join(os.getcwd(), folder))
    os.chdir(os.path.join(os.getcwd(), folder))   
    
    css = soup.find_all('link', rel="stylesheet")
    for cum in css:
        shit = cum['href']
        if "http://" in shit:
            with open(shit, 'wb') as f:
                piss = requests.get(shit, proxies=proxies)
                f.write(piss.content)

Downloader(url=url, folder=folder)

Does anyone know what the issue might be? Thank you <3

Answer 1

You are trying to write to a file whose name contains /. The operating system treats each / as a directory separator, so open() looks for folders that do not exist. Either remove or replace those characters, or add logic that creates the matching folder structure before writing the file.

import os

import requests
from bs4 import BeautifulSoup

# Route all traffic through a local SOCKS5 proxy (needs the requests[socks] extra)
proxies = {
    'http': 'socks5h://127.0.0.1:9050',
    'https': 'socks5h://127.0.0.1:9050'
}

url = "https://url.com"
folder = "Files"
resp = requests.get(url, proxies=proxies)
soup = BeautifulSoup(resp.text, features='lxml')

def Downloader(url, folder):
    # Create the output folder; do nothing if it already exists
    os.makedirs(folder, exist_ok=True)

    css = soup.find_all('link', rel="stylesheet")
    for each in css:
        href = each['href']
        # Only absolute URLs are handled here; relative hrefs are skipped
        if href.startswith(("http://", "https://")):
            # Drop the scheme and replace '/' so the URL becomes one flat file name
            # (the dots are kept so the file still ends in .css)
            filename = href.split('//')[-1].replace('/', '_')
            response = requests.get(href, proxies=proxies)
            with open(os.path.join(folder, filename), 'wb') as f:
                f.write(response.content)

Downloader(url=url, folder=folder)
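
The answer also mentions a second option, creating the folder structure instead of flattening the name, but does not show it. A minimal sketch of that approach might look like the following. The helper names local_path_for and save_bytes, and the "index.css" fallback, are assumptions for illustration and are not part of the original answer. The sketch also resolves relative hrefs with urljoin, which the code above skips.

```python
import os
from urllib.parse import urljoin, urlparse


def local_path_for(href, page_url, folder):
    """Return (absolute_url, local_path) for a stylesheet href.

    The local path mirrors the URL path under `folder`.
    Hypothetical helper, not from the original answer.
    """
    # Resolve relative hrefs such as "/cache/a.css" against the page URL.
    absolute = urljoin(page_url, href)
    parsed = urlparse(absolute)
    # Drop empty, "." and ".." segments so the result stays inside `folder`.
    parts = [p for p in parsed.path.split("/") if p not in ("", ".", "..")]
    if not parts:
        parts = ["index.css"]  # assumed fallback name when the URL has no path
    return absolute, os.path.join(folder, parsed.netloc, *parts)


def save_bytes(path, data):
    """Create any missing parent folders, then write `data` to `path`."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        f.write(data)
```

Inside the loop, this would be used roughly as: absolute, path = local_path_for(each['href'], url, folder), followed by save_bytes(path, requests.get(absolute, proxies=proxies).content). Query strings are ignored when building the path, so two URLs that differ only in their query would overwrite each other.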

Answer 1 | 0 | accepted | 2022-04-28 12:07:37
