How do I download files from a website with Python?

Question

I need to download all English subtitles from https://www.opensubtitles.org/de. The file links have to look like this: (https://www.opensubtitles.org/de/subtitleserve/sub/8429220). This is my code:

import requests
import validators
import sys
from bs4 import BeautifulSoup as bs
from urllib.parse import urlparse
import wget
from urllib.request import urlopen
import urllib.request 

def check_validity(my_url):
    try:
        urlopen(my_url)
        print("Valid URL")
    except IOError:
        print ("Invalid URL")
        sys.exit()


def get_srts(my_url):
    links = []
    html = urlopen(my_url).read()
    html_page = bs(html, features="lxml") 
    og_url = html_page.find("meta",  property="og:url")
    base = urlparse(my_url)
    print("base", base)
    for link in html_page.find_all('a'):
        current_link = link.get('href') or ''  # <a> tags without href return None
        if current_link.endswith('srt'):
            if og_url:
                print("currentLink",current_link)
                links.append(og_url["content"] + current_link)
            else:
                links.append(base.scheme + "://" + base.netloc + current_link)

    for link in links:
        try: 
            wget.download(link)
        except Exception as exc:
            print(f" \n \n Unable to Download A File: {exc} \n")
    print('\n')


def main():
    #print("Enter Link: ")
    my_url = 'https://www.opensubtitles.org/de/search/sublanguageid-eng/searchonlymovies-on'
    check_validity(my_url)
    get_srts(my_url)

main()

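The problem is that my downloader does not find any download links, and og_url is empty as well. Because the file links do not end in "srt" or "zip", I tried leaving out the line (if current_link.endswith('srt'):). Maybe you have an idea or a hint.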

Answer 1

Here is an idea or hint:

Well, in HTML, if clicking a link should download something, you can add the download attribute to the link tag:

<a href="https://link" download>

You may be able to search the page for links that carry this attribute.
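
A rough sketch of that search, in case it helps. It uses only the standard library's html.parser so it runs without extra packages; with BeautifulSoup the same check would be a.has_attr("download") on each <a> tag. The names DownloadLinkParser, find_download_links and path_marker are made up for this example. The optional path_marker filter assumes the site's download links contain "/subtitleserve/sub/", as in the example link from the question; I have not verified that the real page is built this way, and the sample HTML below is invented.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class DownloadLinkParser(HTMLParser):
    """Collect hrefs of <a> tags that have a download attribute,
    or whose href contains an optional path marker."""

    def __init__(self, base_url, path_marker=None):
        super().__init__()
        self.base_url = base_url
        # Hypothetical filter, e.g. "/subtitleserve/sub/" (taken from the question)
        self.path_marker = path_marker
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        # A bare attribute such as `download` is reported with the value None
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if not href:
            return  # anchor without a usable href
        has_download = "download" in attr_map
        matches_path = self.path_marker is not None and self.path_marker in href
        if has_download or matches_path:
            # Turn relative links into absolute URLs
            self.links.append(urljoin(self.base_url, href))


def find_download_links(html, base_url, path_marker=None):
    """Return absolute URLs of candidate download links found in html."""
    parser = DownloadLinkParser(base_url, path_marker)
    parser.feed(html)
    return parser.links


# Invented sample page, only to show the filter at work
sample_html = """
<a href="/de/subtitleserve/sub/8429220">Download</a>
<a href="https://link" download>file</a>
<a href="/de/about">About</a>
<a name="top">no href</a>
"""

by_attribute = find_download_links(sample_html, "https://www.opensubtitles.org/de")
by_attribute_or_path = find_download_links(
    sample_html, "https://www.opensubtitles.org/de", path_marker="/subtitleserve/sub/"
)
print(by_attribute)
print(by_attribute_or_path)
```

If the page turns out not to use the download attribute at all, the path filter is the part that would do the work; the resulting URLs could then be handed to wget.download as in the question's code.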

Solution 1
0 2020-11-17 11:57:09
