Python Paramiko UTF-8 error when trying to stream file from SFTP server
I have a program in which I use Paramiko to get files from an SFTP server. Originally I was pulling the file locally with get and then processing the file by opening the local copy. However, I am trying to avoid the get and just read the file as a stream. This works fine until I encounter bytes that are not valid UTF-8, such as <96>. The program gets an exception when this happens. The problem occurs on the line:
for line in remote_file
So I am not able to get the data from the stream. I have seen mention of decoding and re-encoding, but I don't see any way to do this since Paramiko never hands me the data.
Is there a Paramiko parameter that says what to do, or some way to just get the raw data? How do I get around this issue?
Below is the code being processed. The first 3 lines establish the connection. Then I have some code (not shown) where I filter through the directory to find a list of files that I care about. The next-to-last line opens the file on the SFTP server. The last line is where the error occurs. I have a try block around the whole block of code. When the exception is hit, the error that is returned is:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x96 in position 124: invalid start byte
ftpTransport = paramiko.Transport((FTPSERVER, FTPPORT))
ftpTransport.connect(username=FTPUSERNAME, password=FTPPASSWORD)
sftp = paramiko.SFTPClient.from_transport(ftpTransport)
remote_file = sftp.open(remoteName)
for line in remote_file:
I do not get the UTF-8 error if I do an sftp.get and then open the local file. For now I have changed my code to take that step, but I would prefer not to copy the file locally if I don't have to.
Paramiko assumes that all text files are UTF-8 and uses "strict" decoding (aborting on any error).
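The same failure can be reproduced without a server, since it comes from strict UTF-8 decoding of a byte such as 0x96. The following is only a minimal sketch: the sample bytes are made up for illustration, and the guess that the file is really Windows-1252 (cp1252) is an assumption, not something the question confirms.

```python
# Hypothetical sample line containing byte 0x96, which is not valid UTF-8.
raw = b"2019 \x96 2020\n"

try:
    raw.decode("utf-8")  # "strict" is the default error handler
    strict_result = "decoded without error"
except UnicodeDecodeError as exc:
    strict_result = str(exc)  # same kind of message as in the question

ignored = raw.decode("utf-8", "ignore")    # silently drops the bad byte
replaced = raw.decode("utf-8", "replace")  # substitutes U+FFFD for the bad byte
as_cp1252 = raw.decode("cp1252")           # assumption: the file is Windows-1252

print(strict_result)
print(repr(ignored), repr(replaced), repr(as_cp1252))
```

In cp1252, byte 0x96 is an en dash, which is one plausible reason such a byte appears in a text file.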
To work around that, you can open the file in "binary" mode. Then next(), readline() and similar methods return a "binary string" (bytes), which you can decode using any encoding you like, or decode as UTF-8 while ignoring errors:
remote_file = sftp.open(remoteName, "rb")
for line in remote_file:
    print(line.decode("utf8", "ignore"))
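If dropping bytes is not acceptable, one possible variation is to try a list of encodings per line and only fall back to replacement characters when none of them fits. This is a sketch rather than anything Paramiko provides: decode_line is a hypothetical helper, the cp1252 fallback is an assumption about the file, and io.BytesIO stands in for a file opened with sftp.open(remoteName, "rb").

```python
import io


def decode_line(raw_line, encodings=("utf-8", "cp1252")):
    """Decode bytes with the first encoding that succeeds (hypothetical helper)."""
    for encoding in encodings:
        try:
            return raw_line.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Nothing fit: keep the line, marking undecodable bytes with U+FFFD.
    return raw_line.decode(encodings[0], "replace")


# Stand-in for a binary-mode remote file; iterating it yields bytes lines.
remote_file = io.BytesIO(b"plain ascii\ncaf\xc3\xa9\n2019 \x96 2020\n")

lines = [decode_line(raw) for raw in remote_file]
for text in lines:
    print(text, end="")
```

Trying UTF-8 first keeps valid UTF-8 lines intact, and only lines that fail strict decoding are reinterpreted with the fallback encoding.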