
EOF marker not found - How to fix in PyPDF and PyPDF2?

I'm attempting to combine a few PDF files into a single PDF file using Python. I've tried both PyPDF and PyPDF2. On some files, they both throw the same error:

PdfReadError: EOF marker not found

Here's my code (page_files is a list of PDF file paths to combine):

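from PyPDF2 import PdfReader, PdfWriter

# Paths of the PDF files to combine, in output order
page_files = ["example1.pdf", "example2.pdf"]

writer = PdfWriter()
for path in page_files:
    reader = PdfReader(path)  # PdfReadError is raised here for the affected files
    for page in reader.pages:
        writer.add_page(page)

with open("out.pdf", "wb") as fp:
    writer.write(fp)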

I've read a few StackOverflow threads on the topic, but none contain a solution that works. If you've successfully combined PDF files using Python, I'd love to hear how.

You were running into an issue in PyPDF2 that was fixed by PR #321. The fix was released in PyPDF2==1.27.8 (released on 2022-04-21).
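If upgrading does not help, it may be worth checking whether the affected files actually contain the `%%EOF` marker near their end, since a truncated or partially downloaded file can trigger the same error. The following is only a diagnostic sketch using the standard library: `has_eof_marker`, `split_by_eof_marker`, and the 1024-byte window are illustrative choices, not part of PyPDF2, and they do not reproduce how PyPDF2 itself searches for the marker.

```python
import os


def has_eof_marker(path, tail_bytes=1024):
    """Return True if b'%%EOF' appears in the last `tail_bytes` bytes of the file.

    The window size is an assumption for illustration; adjust it as needed.
    """
    with open(path, "rb") as fp:
        fp.seek(0, os.SEEK_END)
        size = fp.tell()
        # Jump to the start of the tail window (or to 0 for small files).
        fp.seek(max(0, size - tail_bytes))
        return b"%%EOF" in fp.read()


def split_by_eof_marker(paths):
    """Partition paths into (with_marker, without_marker), keeping input order."""
    with_marker, without_marker = [], []
    for path in paths:
        if has_eof_marker(path):
            with_marker.append(path)
        else:
            without_marker.append(path)
    return with_marker, without_marker
```

Calling `split_by_eof_marker(page_files)` before merging shows which inputs are likely to fail, so they can be re-downloaded or repaired instead of aborting the whole merge.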

If anyone is still looking for a way to merge a "list" of PDFs:

Note: use glob to get the correct file list. This will really save your day ^^

Check this out: glob module reference

from PyPDF2 import PdfFileMerger  # PyPDF2 1.x name; later releases renamed it to PdfMerger
import glob
import os


class MergeAllPDF:
    def __init__(self):
        self.mergelist = []

    def create(self, filepath, outpath, outfilename):
        self.outfilename = outfilename
        self.filepath = filepath  # glob pattern matching the input PDFs
        self.outpath = outpath    # directory for the merged PDF
        # glob returns matches in arbitrary order, so sort them for a stable page order
        self.mergelist = sorted(glob.glob(self.filepath))
        self.merge()

    def merge(self):
        if self.mergelist:
            merger = PdfFileMerger()
            for pdf in self.mergelist:
                # Pass the path itself; the merger opens the file and close() releases it
                merger.append(pdf)
            merger.write(os.path.join(self.outpath, self.outfilename + ".pdf"))
            merger.close()
            self.mergelist = []
        else:
            print("mergelist is empty, please check your input path")


# Example usage; update your paths here:
inpath = r"C:\Users\Fabian\Desktop\mergeallpdfs\scan\*.pdf"  # your single-page PDFs are stored here
outpath = r"C:\Users\Fabian\Desktop\mergeallpdfs\output"  # your merged PDF will be stored here

b = MergeAllPDF()
b.create(inpath, outpath, "mergedpdf")
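On the glob note above: `glob.glob` returns matches in arbitrary order, and plain alphabetical sorting puts `page10.pdf` before `page2.pdf`. If the scanned pages are numbered without leading zeros, a natural sort may give the order you expect. This is a small sketch with hypothetical helper names (`natural_key`, `list_pdfs`); it assumes the page number is part of the file name.

```python
import glob
import re


def natural_key(path):
    """Build a sort key in which digit runs compare as integers.

    re.split with a capturing group alternates text and digit chunks,
    so odd positions always hold the digit runs.
    """
    parts = re.split(r"(\d+)", path)
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]


def list_pdfs(pattern):
    """Return the paths matching `pattern`, in natural order."""
    return sorted(glob.glob(pattern), key=natural_key)
```

The list returned by `list_pdfs(inpath)` can then be fed to the merger in place of the raw glob result.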
