PDFminer in Python
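I downloaded pdfminer and the command-line tool works fine, but I want to convert several PDF documents at once, so I tried to use pdfminer as a library. I found the following code on Stack Overflow, but I can't get it to work: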
from pdfminer.pdfinterp import PDFResourceManager, process_pdf
from pdfminer.converter import TextConverter
from pdfminer.layout import LAParams
from cStringIO import StringIO
def convert_pdf(path):
    rsrcmgr = PDFResourceManager()
    retstr = StringIO()
    codec = 'utf-8'
    laparams = LAParams()
    device = TextConverter(rsrcmgr, retstr, codec=codec, laparams=laparams)
    fp = file(path, 'rb')
    process_pdf(rsrcmgr, device, fp)
    fp.close()
    device.close()
    str = retstr.getvalue()
    retstr.close()
    print str

convert_pdf("/Users/gorkemyurtseven/Desktop/casino.pdf")
When I run it, I get:
Traceback (most recent call last):
  File "pdfminer.py", line 1, in <module>
    from pdfminer.pdfinterp import PDFResourceManager, process_pdf
  File "/Users/gorkemyurtseven/Desktop/pdfminer.py", line 1, in <module>
    from pdfminer.pdfinterp import PDFResourceManager, process_pdf
ImportError: No module named pdfinterp
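It looks like your script is itself named pdfminer, the same as the module. When the script runs `from pdfminer.pdfinterp import ...`, Python finds your own pdfminer.py first (the traceback shows it being imported from /Users/gorkemyurtseven/Desktop/pdfminer.py), and that file has no pdfinterp submodule, hence the ImportError.
Another possible cause is that the pdfminer module is not installed correctly, or is not the right version for your Python distribution.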
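To tell the two causes apart, it can help to check whether a local file shadows the installed package. The following is only a minimal sketch in Python 3, using nothing but the standard library; the helper names `find_shadowing_files` and `resolved_location` are made up for illustration and are not part of pdfminer.

```python
import importlib.util
import os


def find_shadowing_files(module_name, search_dir):
    """Return files in search_dir that would shadow `module_name` on import.

    Hypothetical helper: it only looks for <name>.py and <name>.pyc, the two
    files involved in the traceback above.
    """
    hits = []
    for candidate in (module_name + ".py", module_name + ".pyc"):
        path = os.path.join(search_dir, candidate)
        if os.path.isfile(path):
            hits.append(path)
    return hits


def resolved_location(module_name):
    """Return the file a top-level module would be loaded from, or None.

    Uses importlib.util.find_spec, which locates a top-level module
    without executing it.
    """
    spec = importlib.util.find_spec(module_name)
    return None if spec is None else spec.origin
```

Calling `find_shadowing_files("pdfminer", os.getcwd())` from the script's directory should list the offending pdfminer.py if the first explanation applies. If `resolved_location("pdfminer")` returns None after the script has been renamed, the package is probably not installed for that interpreter, which points to the second explanation.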
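As noted in this post, the problem is that your file is named pdfminer.py.
Rename it, and delete the __pycache__/ directory and the pdfminer.pyc file so that no stale compiled copy keeps shadowing the real module: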
$ rm -r __pycache__/ pdfminer.pyc
$ mv pdfminer.py mypdfminer.py
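Once the script no longer shadows the package, the original goal of converting several PDFs in one run could look roughly like this. This is a sketch under assumptions, not the code from the question: it targets Python 3 and the pdfminer.six fork, whose `pdfminer.high_level.extract_text` function replaces the older `process_pdf` approach. `convert_many` and `list_pdfs` are hypothetical helper names.

```python
import os


def list_pdfs(directory):
    """Return the sorted paths of all .pdf files directly inside `directory`."""
    return sorted(
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if name.lower().endswith(".pdf")
    )


def convert_many(pdf_paths, extract=None):
    """Return a dict mapping each PDF path to its extracted text.

    `extract` is any callable that takes a path and returns a string.
    When omitted, pdfminer.six's `extract_text` is used (assumption:
    pdfminer.six is installed for the running interpreter).
    """
    if extract is None:
        # Imported lazily so the helper can be tried out without pdfminer.six.
        from pdfminer.high_level import extract_text as extract
    return {path: extract(path) for path in pdf_paths}
```

For example, `convert_many(list_pdfs("/Users/gorkemyurtseven/Desktop"))` would attempt to extract text from every PDF on the desktop. Save this in a file that is not called pdfminer.py, or the same ImportError will come back.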