
Maintained alternatives to PyPDF2

Original post: 2020-07-31 22:15:56 | Tags: python, pdf, pypdf2

I'm using the PyPDF2 library to extract text, images, page widths and heights, annotations, and other attributes from PDF documents. However, the library has many bugs and open issues and seems not to have been maintained for a long time. (Edit: PyPDF2 is maintained again.)

Is there a more active fork that is being maintained and developed?
Is there a good alternative?

From what I know, reportlab is better suited to creating brand-new PDFs (or maybe I'm just not experienced enough with reportlab).

2 answers

Update: PyPDF2 is maintained again, and I am the maintainer :-) I've just released a new version with several bugfixes.

Three potential alternatives that are maintained (just like PyPDF2):

pymupdf: uses MuPDF (only for open-source projects, due to the MuPDF license)
pikepdf: uses qpdf
pdfminer.six: a pure Python project

I would not use:

PyPDF3 (PyPI): has less activity and probably fewer features than PyPDF2.
PyPDF4 (PyPI): last release on PyPI was in 2018.

PyMuPDF is a Python binding for MuPDF, a lightweight PDF and XPS viewer. Because MuPDF supports not only PDF but also XPS, OpenXPS, CBZ, CBR, FB2, and EPUB, PyMuPDF supports these formats too. PyMuPDF is hosted on GitHub. We are also registered on PyPI.
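To make this concrete for the tasks named in the question (text, page width and height, annotations, images), here is a minimal sketch of how that might look with PyMuPDF. It is an illustration, not the one recommended way to use the library: the function name `summarize_pdf` and the shape of the returned dictionaries are made up for this example, and the snake_case method names (`get_text`, `get_images`) assume a reasonably recent PyMuPDF release, since older releases used different spellings.

```python
def summarize_pdf(path):
    """Collect text, size, annotation count, and image count for each page.

    `summarize_pdf` is a hypothetical helper written for this example;
    it is not part of PyMuPDF itself.
    """
    # Imported lazily so the sketch can be read (and imported) without
    # PyMuPDF installed. Install it with: pip install pymupdf
    import fitz  # PyMuPDF's import name

    pages = []
    with fitz.open(path) as doc:
        for page in doc:
            pages.append({
                "number": page.number,        # zero-based page index
                "width": page.rect.width,     # in points (1/72 inch)
                "height": page.rect.height,   # in points (1/72 inch)
                "text": page.get_text(),      # plain-text extraction
                "annotations": len(list(page.annots())),
                "images": len(page.get_images()),
            })
    return pages
```

Calling `summarize_pdf("example.pdf")` on a file of your own should return one dictionary per page. Because MuPDF handles the other formats listed above as well, the same call may also work on an XPS or EPUB file, though that is not something this sketch verifies.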

PyMuPDF's performance stats are also very promising. The following three sections deal with different aspects of performance: