How to use FTP_TLS (Explicit mode) with urllib.request.urlopen() (or equivalent of `urlopen` with FTP_TLS)

I'm using some older Python 3 code that works like this:

import csv
import logging
import urllib.request

# Stand-in for the application's own logger (current_app.log in the original app).
log = logging.getLogger(__name__)

url = "ftp://username:password@server/path-to-file.txt"


def read_rows(url):
    try:
        response = urllib.request.urlopen(url)
        lines = [line.decode('latin-1') for line in response.readlines()]
        return csv.reader(lines, delimiter=';')
    except Exception:
        # Note: the URL embeds the credentials, so this message logs them too.
        log.error('Error when trying to read URL and parse CSV: %s', url)
        raise

This has always worked fine, but recently the FTP server, which I don't have any control over, switched to explicit TLS. This results in an error trace like this:

  File ".pyenv/versions/3.10.0/lib/python3.10/urllib/request.py", line 216, in urlopen
    return opener.open(url, data, timeout)
  File ".pyenv/versions/3.10.0/lib/python3.10/urllib/request.py", line 519, in open
    response = self._open(req, data)
  File ".pyenv/versions/3.10.0/lib/python3.10/urllib/request.py", line 536, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File ".pyenv/versions/3.10.0/lib/python3.10/urllib/request.py", line 496, in _call_chain
    result = func(*args)
  File ".pyenv/versions/3.10.0/lib/python3.10/urllib/request.py", line 1583, in ftp_open
    raise exc.with_traceback(sys.exc_info()[2])
  File ".pyenv/versions/3.10.0/lib/python3.10/urllib/request.py", line 1565, in ftp_open
    fw = self.connect_ftp(user, passwd, host, port, dirs, req.timeout)
  File ".pyenv/versions/3.10.0/lib/python3.10/urllib/request.py", line 1586, in connect_ftp
    return ftpwrapper(user, passwd, host, port, dirs, timeout,
  File ".pyenv/versions/3.10.0/lib/python3.10/urllib/request.py", line 2407, in __init__
    self.init()
  File ".pyenv/versions/3.10.0/lib/python3.10/urllib/request.py", line 2417, in init
    self.ftp.login(self.user, self.passwd)
  File ".pyenv/versions/3.10.0/lib/python3.10/ftplib.py", line 412, in login
    resp = self.sendcmd('USER ' + user)
  File ".pyenv/versions/3.10.0/lib/python3.10/ftplib.py", line 281, in sendcmd
    return self.getresp()
  File ".pyenv/versions/3.10.0/lib/python3.10/ftplib.py", line 254, in getresp
    raise error_perm(resp)
urllib.error.URLError: <urlopen error ftp error: error_perm('530 Not logged in.')>

The relevant part, I think, is that ftplib, as driven by urllib, is no longer able to log in.

To test whether I can access the server at all, I tried using FTP_TLS like this:

from ftplib import FTP_TLS
ftp = FTP_TLS()
ftp.context.set_ciphers('DEFAULT@SECLEVEL=1')
ftp.connect('ftp.serverpath')
ftp.login('username','password')
ftp.close()

This works fine. The server reports that I'm logged in with the message '230 User logged in, proceed.'

So, urllib.request.urlopen() is really convenient for accessing the data I need later in the application, but the way it uses ftplib now keeps me from logging in to the server. Using FTP_TLS works fine for accessing the server, but I'm not sure how to download the CSV once I'm logged in.

Is there a way I can either tell urllib.request.urlopen() to use FTP_TLS, or do something equivalent to quickly open the file once I'm logged in with ftplib?
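One possible equivalent, sketched here under assumptions rather than taken from the original post: log in with ftplib.FTP_TLS as in the test above, stream the file into memory with retrbinary, and parse it with csv. The helper names fetch_file, parse_rows, and download_csv_rows are hypothetical, the host and path arguments are placeholders, and the set_ciphers line is carried over from the test snippet and may be unnecessary on other servers.

```python
import csv
import io
from ftplib import FTP_TLS


def fetch_file(ftp, remote_path):
    """Download remote_path over an already logged-in FTP connection."""
    buffer = io.BytesIO()
    # retrbinary calls the callback once for every block of bytes received.
    ftp.retrbinary(f"RETR {remote_path}", buffer.write)
    return buffer.getvalue()


def parse_rows(data, encoding="latin-1", delimiter=";"):
    """Decode the downloaded bytes and return the CSV rows as a list."""
    text = io.StringIO(data.decode(encoding), newline="")
    return list(csv.reader(text, delimiter=delimiter))


def download_csv_rows(host, user, password, remote_path):
    """Hypothetical end-to-end helper: connect, log in, download, parse."""
    ftp = FTP_TLS()
    # Assumption copied from the test snippet: this server only accepts
    # ciphers that need a lowered OpenSSL security level.
    ftp.context.set_ciphers("DEFAULT@SECLEVEL=1")
    ftp.connect(host)
    try:
        ftp.login(user, password)  # login() negotiates TLS before sending credentials
        ftp.prot_p()               # protect the data connection as well
        return parse_rows(fetch_file(ftp, remote_path))
    finally:
        ftp.close()
```

Calling download_csv_rows('ftp.serverpath', 'username', 'password', 'path-to-file.txt') would then return the rows directly, at the cost of giving up the URL-based interface of urlopen.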

FTPS support can be added to urllib.request.urlopen by installing a new OpenerDirector with a modified FTPHandler subclass:

import csv
import io
import urllib.request


class FTPSWrapper(urllib.request.ftpwrapper):
    """
    Like urllib.request.ftpwrapper, but enforces FTPS.
    """

    def init(self):
        # This code was copied and modified from the standard library.
        # https://github.com/python/cpython/blob/f14ced6062ecdd3c654f3c558f79e1edf4f10cc8/Lib/urllib/request.py#L2412-L2419
        import ftplib

        self.busy = 0
        # Specify FTPS here
        self.ftp = ftplib.FTP_TLS()
        self.ftp.connect(self.host, self.port, self.timeout)
        self.ftp.login(self.user, self.passwd)
        # Set up a secure data connection
        self.ftp.prot_p()
        _target = "/".join(self.dirs)
        self.ftp.cwd(_target)


class FTPSHandler(urllib.request.FTPHandler):
    """
    Like urllib.request.FTPHandler, but enforces FTPS.
    """

    def connect_ftp(self, *args):
        # Use the subclass we defined above.
        return FTPSWrapper(*args, persistent=False)


def download_ftp_file(url: str):
    """
    Given a URL to a file, download it, decode it using latin-1,
    and return a text stream of its contents.
    """
    urllib.request.install_opener(urllib.request.build_opener(FTPSHandler))
    response = urllib.request.urlopen(url)
    return io.TextIOWrapper(response, encoding="latin-1")


if __name__ == "__main__":
    url = "ftp://username:password@server/path-to-file.txt"
    # The file in the question is semicolon-delimited.
    reader = csv.reader(download_ftp_file(url), delimiter=";")
    print(list(reader))

The magic is in the calls to build_opener and install_opener, which allow us to add new URL handling logic to urlopen.
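Note that install_opener replaces the process-wide default opener, so every later urlopen call for an ftp:// URL goes through the FTPS handler. If that is undesirable, one variant is to keep the opener private and call its open method directly. The following is a minimal sketch, not part of the original answer: it repeats compact versions of the two classes so that it runs on its own, open_ftps is a hypothetical helper name, and the set_ciphers line is copied from the question's working FTP_TLS test and may not be needed for other servers.

```python
import ftplib
import urllib.request


class FTPSWrapper(urllib.request.ftpwrapper):
    """Like urllib.request.ftpwrapper, but connects with FTP_TLS."""

    def init(self):
        self.busy = 0
        self.ftp = ftplib.FTP_TLS()
        # Assumption taken from the question: the server needs a lowered
        # OpenSSL security level. Remove this line if it does not.
        self.ftp.context.set_ciphers("DEFAULT@SECLEVEL=1")
        self.ftp.connect(self.host, self.port, self.timeout)
        self.ftp.login(self.user, self.passwd)
        self.ftp.prot_p()  # protect the data connection as well
        self.ftp.cwd("/".join(self.dirs))


class FTPSHandler(urllib.request.FTPHandler):
    """Like urllib.request.FTPHandler, but uses FTPSWrapper."""

    def connect_ftp(self, *args):
        return FTPSWrapper(*args, persistent=False)


# Built once and never installed, so plain urlopen() calls are unaffected.
ftps_opener = urllib.request.build_opener(FTPSHandler)


def open_ftps(url, timeout=30):
    """Open an ftp:// URL through the private FTPS opener."""
    return ftps_opener.open(url, timeout=timeout)
```

Because FTPSHandler subclasses FTPHandler, build_opener leaves the default plain-FTP handler out of this opener, so ftp:// URLs opened through it always use FTPSWrapper.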
