
Python pdf2image convert_from_path not working using SFTP

I want to convert a scanned PDF to an OCR PDF. My code works if the files are in local directories, but the problem occurs when I use SFTP. When I call convert_from_path on a file opened over SFTP, I get the error "expected str, bytes or os.PathLike object, not SFTPFile".

How can I read an SFTP file path in pdf2image's convert_from_path?

Here's my code:

for path in sftp.listdir_attr(file_path):
    file = file_path + path.filename
    f_name = path.filename
    with sftp.open(file) as my_file:
        print(my_file)
        pages = convert_from_path(my_file, 200)  # the TypeError is raised here

Printing my_file shows an SFTPFile object, not a path:

<paramiko.sftp_file.SFTPFile object at 0x7fa22ed55250>
<paramiko.sftp_file.SFTPFile object at 0x7fa22ed55160>

Error: TypeError: expected str, bytes or os.PathLike object, not SFTPFile

I do not know pdf2image, but convert_from_path does not seem to support file-like objects.

So either you need to use a different pdf2image API (if there is one).
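pdf2image does ship a convert_from_bytes function that takes the raw PDF bytes instead of a path. A minimal sketch of the question's loop built on it follows, assuming sftp is a connected paramiko SFTPClient; the helper names iter_remote_pdfs and convert_remote_pdfs are made up for illustration, and the ".pdf" filter is an added assumption.

```python
import posixpath


def iter_remote_pdfs(sftp, remote_dir):
    """Yield (filename, raw_bytes) for every .pdf file in remote_dir.

    `sftp` is assumed to be a connected paramiko.SFTPClient.
    """
    for entry in sftp.listdir_attr(remote_dir):
        if not entry.filename.lower().endswith(".pdf"):
            continue  # skip anything that is not a PDF
        # SFTP paths use forward slashes, so join with posixpath
        remote_path = posixpath.join(remote_dir, entry.filename)
        with sftp.open(remote_path, "rb") as remote_file:
            # Reads the whole file into memory
            yield entry.filename, remote_file.read()


def convert_remote_pdfs(sftp, remote_dir, dpi=200):
    """Return {filename: list of page images} for the PDFs in remote_dir."""
    # Imported lazily so iter_remote_pdfs works without pdf2image installed
    from pdf2image import convert_from_bytes

    return {
        name: convert_from_bytes(data, dpi)
        for name, data in iter_remote_pdfs(sftp, remote_dir)
    }
```

Using posixpath.join also avoids the missing-slash problem that file_path + path.filename has when file_path does not end with "/".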

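Or you will have to resort to either

  • Downloading the SFTP file to a temporary local file and reading that.

  • Or downloading the SFTP file to memory:

        from io import BytesIO
        from pdf2image import convert_from_bytes

        flo = BytesIO()
        sftp.getfo(file, flo)  # copy the remote file into the in-memory buffer
        flo.seek(0)
        # convert_from_path needs a path; convert_from_bytes takes the raw bytes
        pages = convert_from_bytes(flo.read(), 200)

    Note that the above is a naive, memory-inefficient solution.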
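The temporary-file option could be sketched as below. This assumes sftp is a connected paramiko SFTPClient; sftp_file_as_local_path and convert_remote_pdf are hypothetical helper names, not part of either library.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def sftp_file_as_local_path(sftp, remote_path, suffix=".pdf"):
    """Download remote_path to a temporary file and yield its local path.

    `sftp` is assumed to be a connected paramiko.SFTPClient; any object with
    a compatible getfo(remotepath, fileobj) method works as well.
    """
    tmp = tempfile.NamedTemporaryFile(suffix=suffix, delete=False)
    try:
        with tmp:
            # Stream the remote contents into the temp file, then close it
            sftp.getfo(remote_path, tmp)
        yield tmp.name
    finally:
        # Remove the temp file whether or not the caller succeeded
        os.remove(tmp.name)


def convert_remote_pdf(sftp, remote_path, dpi=200):
    """Convert one remote PDF to a list of page images via a local temp file."""
    # Imported lazily so the helper above works without pdf2image installed
    from pdf2image import convert_from_path

    with sftp_file_as_local_path(sftp, remote_path) as local_path:
        return convert_from_path(local_path, dpi)
```

This keeps only the rendered pages in memory rather than the PDF bytes as well, at the cost of a round trip through the local disk.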
