How to convert a PDF from a URL to images using pdf2image in Python?

Question

I can convert a PDF file on my drive to images using pdf2image's convert_from_path, but when I try the same for the PDF at 'https://example.com/abc.pdf', I end up with multiple errors.

Code:

from urllib.request import urlopen

import pdf2image

url = 'https://example.com/abc.pdf'
scrape = urlopen(url)  # for external files
pil_images = pdf2image.convert_from_bytes(
    scrape.read(), dpi=200,
    output_folder=None, first_page=None, last_page=None,
    thread_count=1, userpw=None, use_cropbox=False, strict=False,
    poppler_path=r"C:\poppler-0.68.0_x86\poppler-0.68.0\bin",
)

Error:

   Unable to get page count. Syntax Error: Document stream is empty

I also followed the link below, but had no luck:

Python3: Download PDF to memory and convert first page to image

Authentication screenshot:

Answer 1

First download the PDF from the URL, following the instructions in this blog post: https://dzone.com/articles/simple-examples-of-downloading-files-using-python
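
Before converting, it may help to confirm that the download actually returned a PDF. The "Document stream is empty" error is consistent with the downloaded bytes being empty or not a PDF at all (for example, an HTML login page), though that is an assumption about the cause here. Below is a minimal sketch using only the standard library; the helper names looks_like_pdf and fetch_pdf_bytes are hypothetical, and the header check is a heuristic rather than full validation.

```python
from urllib.request import urlopen


def looks_like_pdf(data: bytes) -> bool:
    """Heuristic check: PDF files carry a b"%PDF-" marker near the start.

    An empty body or an HTML login page fails this check.
    """
    return b"%PDF-" in data[:1024]


def fetch_pdf_bytes(url: str, timeout: float = 30.0) -> bytes:
    """Download `url` into memory and raise if the body is not a PDF."""
    with urlopen(url, timeout=timeout) as response:
        data = response.read()
    if not looks_like_pdf(data):
        raise ValueError(
            f"{url} did not return a PDF ({len(data)} bytes received)"
        )
    return data
```

If fetch_pdf_bytes raises, the problem is in the download step (authentication, redirects, an empty response) rather than in pdf2image or poppler.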

If the PDF has multiple pages, use the following to convert it into a series of images (or another output format).

import ghostscript

def pdf2jpeg(pdf_input_path, jpeg_output_path):
    args = ["pdf2jpeg", # actual value doesn't matter
            "-dNOPAUSE",      # don't pause between pages
            "-sDEVICE=jpeg",  # render pages as JPEG
            "-r144",          # resolution in DPI
            "-sOutputFile=" + jpeg_output_path,
            pdf_input_path]
    ghostscript.Ghostscript(*args)

Reference: Converting a PDF to a series of images with Python
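
For a multi-page PDF, Ghostscript needs a page-number pattern such as %03d in the output file name so that each page gets its own file. Here is a rough sketch of building the argument list that way; build_gs_args and the file names are hypothetical, and -dBATCH is added so Ghostscript exits after the last page.

```python
def build_gs_args(pdf_input_path, output_pattern, dpi=144):
    """Build a Ghostscript argument list that writes one JPEG per page.

    `output_pattern` should contain a page-number placeholder,
    e.g. "page-%03d.jpg" gives page-001.jpg, page-002.jpg, ...
    """
    return [
        "pdf2jpeg",       # argv[0]; actual value doesn't matter
        "-dNOPAUSE",      # don't pause between pages
        "-dBATCH",        # exit once all pages are processed
        "-sDEVICE=jpeg",  # render pages as JPEG
        f"-r{dpi}",       # resolution in DPI
        f"-sOutputFile={output_pattern}",
        pdf_input_path,
    ]


args = build_gs_args("abc.pdf", "page-%03d.jpg")

# To run it (requires the python-ghostscript package and Ghostscript itself):
#   import ghostscript
#   ghostscript.Ghostscript(*args)
```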

For authentication, try this:

import os
import requests

from urllib.parse import urlparse  # Python 3; in Python 2 this was `from urlparse import urlparse`

username = 'foo'
password = 'sekret'

url = 'http://example.com/blueberry/download/somefile.jpg'
filename = os.path.basename(urlparse(url).path)

r = requests.get(url, auth=(username, password))

if r.status_code == 200:
    with open(filename, 'wb') as out:
        # Write the response body to disk in chunks
        for bits in r.iter_content(chunk_size=8192):
            out.write(bits)

Reference: Downloading a file with Python, providing a username and password
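
The authenticated download and the pdf2image conversion can also be combined in memory, without writing the PDF to disk first. This is only a sketch: it assumes the server uses HTTP basic auth (as in the snippet above), and pdf_url_to_images and save_pages are hypothetical helper names. On Windows you would likely pass the same poppler_path as in the question.

```python
def pdf_url_to_images(url, username, password, dpi=200, poppler_path=None):
    """Download a password-protected PDF into memory and convert its pages.

    Returns a list of PIL images, one per page.
    """
    # Third-party packages, imported here so the rest of the module can be
    # used even where they are not installed.
    import requests
    from pdf2image import convert_from_bytes

    response = requests.get(url, auth=(username, password), timeout=30)
    # Fail loudly on 401/403/404 instead of handing an error page to poppler.
    response.raise_for_status()

    return convert_from_bytes(response.content, dpi=dpi, poppler_path=poppler_path)


def save_pages(images, stem):
    """Save each page as <stem>-001.png, <stem>-002.png, ... and return the paths."""
    paths = []
    for number, image in enumerate(images, start=1):
        path = f"{stem}-{number:03d}.png"
        image.save(path, "PNG")
        paths.append(path)
    return paths
```

Usage would look something like save_pages(pdf_url_to_images('https://example.com/abc.pdf', 'foo', 'sekret'), 'abc').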

Solution 1
1 accepted 2019-10-29 08:59:30
