How do I use PyPDF2 in a script?

Question

import PyPDF2
from PyDF2 import PdfFileReader, PdfFileWriter


file_path="sample.pdf"

pdf = PdfFileReader(file_path)


with open("sample.pdf", "w") as f:'

for page_num in range(pdf.numPages):
   
   pageObj = pdf.getPage(page_num)



   try:
       txt = pageObj.extractText()
       txt = DocumentInformation.author

   except:
       pass

   else:

       f.write(txt)
f.close()

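I get the error: ModuleNotFoundError: No module named 'PyPDF2'

I'm writing my first script. I want to scan through a PDF, extract the text, and write it to a txt file. I tried to use PyPDF2, but I'm not sure how to use it in a script like this.

Edit: I was able to import os and sys successfully.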

import os
import sys

Answer 1

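There are multiple issues:

from PyDF2 import ... : a typo. You meant PyPDF2, not PyDF2.
PdfFileWriter is imported but never used (side note: in the latest version of PyPDF2 the classes are called PdfReader and PdfWriter).
with open("sample.pdf", "w") as f:' : a syntax error (note the stray quote at the end of the line).
The lines that follow it are not indented under the with block.
Side note: did you know you can simply write for page in pdf.pages?
DocumentInformation.author is wrong. I guess you meant pdf.metadata.author.
You overwrite the txt variable. I don't understand why you reassign it before ever using it.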

Maybe this is what you want:

from PyPDF2 import PdfReader

def get_text(pdf_file_path: str) -> str:
    text = ""
    reader = PdfReader(pdf_file_path)
    for page in reader.pages:
        text += page.extract_text()
    return text


text = get_text("example.pdf")

with open("example.txt", "w") as f:
    f.write(text)
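
If the PDF is large, or you also want the author that the question tried to read, a variant along the following lines might help. This is only a sketch: the helper names write_pages_text and pdf_to_txt are made up for illustration, and it assumes a recent PyPDF2 where PdfReader, reader.pages and reader.metadata are available. It writes each page as it goes instead of building one big string, and it treats an empty result from extract_text() as an empty page.

```python
from typing import Iterable, Optional


def write_pages_text(pages: Iterable, out_path: str) -> int:
    """Write the text of each page-like object to out_path; return the page count."""
    written = 0
    with open(out_path, "w", encoding="utf-8") as f:
        for page in pages:
            # Image-only pages may produce no text; fall back to an empty string.
            text = page.extract_text() or ""
            f.write(text)
            f.write("\n")  # keep a line break between pages
            written += 1
    return written


def pdf_to_txt(pdf_path: str, txt_path: str) -> Optional[str]:
    """Convert a PDF to a text file and return the author from the metadata, if any."""
    # Imported here so the helper above can be used without PyPDF2 installed.
    from PyPDF2 import PdfReader

    reader = PdfReader(pdf_path)
    write_pages_text(reader.pages, txt_path)
    meta = reader.metadata  # can be None when the PDF carries no metadata
    return meta.author if meta is not None else None
```

You would call it as author = pdf_to_txt("example.pdf", "example.txt"). Note that the output file is a different file from the input PDF; opening the PDF itself with mode "w", as in the question, would truncate it.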

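Installation issues

If you have installation problems, maybe the PyPDF2 installation documentation can help?

If you run your script from the console as python your_script_name.py, you may want to check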

python -c "import PyPDF2; print(PyPDF2.__version__)"

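That should print your PyPDF2 version. If it doesn't, the Python environment you are using does not have PyPDF2 installed. Note that your system can have arbitrarily many Python environments.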
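
To see which environment a script actually runs in, something like the following sketch can be dropped at the top of the script. The function name describe_module is made up for illustration; sys.executable and importlib.util.find_spec are standard library.

```python
import importlib.util
import sys


def describe_module(name: str) -> str:
    """Report whether a top-level module is importable by the running interpreter."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return f"{name} is NOT installed for {sys.executable}"
    return f"{name} found at {spec.origin} for {sys.executable}"


# Shows the interpreter path, so a mismatch between environments is visible.
print(describe_module("PyPDF2"))
```

If it reports that PyPDF2 is not installed, installing it with that same interpreter (python -m pip install PyPDF2, using the python that runs your script) usually puts the package in the right environment.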

Solution 1
1 2022-06-01 20:26:42
