I have a program that uses Paramiko to get files from an SFTP server. Originally I was pulling the file locally with get and then processing it by opening the local copy. However, I am trying to avoid the get and just read the file as a stream. This works fine until I encounter bytes that are not valid UTF-8, such as <96> (byte 0x96). The program gets an exception when this happens. The problem occurs on the line:
for line in remote_file:
So I am not able to get the data from the stream. I have seen mention of decoding and re-encoding, but I don't see any way to do this, since Paramiko raises the error before it hands me any data.
Is there a Paramiko parameter that says what to do or provides some way to just get the raw data? How do I get around this issue?
Below is the code in question. The first three lines establish the connection. Then I have some code (not shown) that filters the directory listing to find the files I care about. The next-to-last line opens the file on the SFTP server. The last line is where the error occurs. I have a try block around the whole block of code. When the exception is hit, the error returned is:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x96 in position 124: invalid start byte
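import paramiko
# Establish the SFTP connection
ftpTransport = paramiko.Transport((FTPSERVER, FTPPORT))
ftpTransport.connect(username=FTPUSERNAME, password=FTPPASSWORD)
sftp = paramiko.SFTPClient.from_transport(ftpTransport)
# (directory filtering not shown)
# Open the remote file; the default mode is text ("r")
remote_file = sftp.open(remoteName)
for line in remote_file:  # UnicodeDecodeError is raised here
    ...  # per-line processing (not shown)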
I do not get the UTF-8 error if I do an sftp.get and then open the local file. For now I have changed my code to take that step, but I would prefer not to copy the file locally if I don't have to.
Paramiko assumes that all text files are UTF-8 and uses "strict" decoding (aborting on any error).
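To see why strict decoding aborts, the failure can be reproduced without any SFTP connection. The snippet below is only a sketch: the byte string is a made-up sample line (an assumption, not data from the question) that contains the same 0x96 byte, and it is decoded with Python's default "strict" error handler, which is the behaviour described above.

```python
# Hypothetical sample line containing byte 0x96, which is not valid UTF-8.
raw = b"price 10 \x96 20\n"

message = None
try:
    # "strict" is the default error handler for bytes.decode()
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    # Keep the message; the name `exc` is cleared after the except block.
    message = str(exc)

print(message)
```

The message has the same shape as the one in the question; only the reported position differs, because the sample line is different.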
To work around that, you can open the file in "binary" mode. Then next(), readline() and similar methods return a "binary string" (a bytes object), which you can decode using any encoding you like, or decode as UTF-8 while ignoring errors:
remote_file = sftp.open(remoteName, "rb")
for line in remote_file:
    print(line.decode("utf8", "ignore"))
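If the goal is to keep the offending characters rather than drop them, the bytes can be decoded with the encoding the file was actually written in. The sketch below assumes Windows-1252 (cp1252), where byte 0x96 is an en dash; that is only a guess about the file, not something the question confirms. The helper name iter_decoded_lines is hypothetical, and io.BytesIO stands in for the object returned by sftp.open(remoteName, "rb") so the example runs without a server.

```python
import io


def iter_decoded_lines(binary_file, encoding="cp1252", errors="replace"):
    """Yield text lines from a file-like object opened in binary mode.

    `encoding` defaults to cp1252 as an assumption about the source file;
    `errors="replace"` substitutes U+FFFD for bytes the codec cannot map.
    """
    for raw_line in binary_file:
        yield raw_line.decode(encoding, errors)


# io.BytesIO is a stand-in for sftp.open(remoteName, "rb").
fake_remote = io.BytesIO(b"range 10 \x96 20\nplain line\n")
lines = list(iter_decoded_lines(fake_remote))

# For comparison: UTF-8 with "ignore" silently drops the 0x96 byte.
dropped = b"range 10 \x96 20\n".decode("utf-8", "ignore")
```

With a real connection, the same helper would be called on the binary-mode remote file. Decoding with the correct encoding preserves the character, whereas "ignore" loses it without any indication.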