
EOF marker not found - How to fix in PyPDF and PyPDF2?

I'm attempting to combine a few PDF files into a single PDF file using Python. I've tried both PyPDF and PyPDF2 - on some files, they both throw this same error:

PdfReadError: EOF marker not found

Here's my code (the list in the loop holds the PDF file paths to combine):

from PyPDF2 import PdfReader, PdfWriter

writer = PdfWriter()
for path in ["example1.pdf", "example2.pdf"]:
    reader = PdfReader(path)            
    for page in reader.pages:
        writer.add_page(page)            

with open("out.pdf", "wb") as fp:
    writer.write(fp)

I've read a few StackOverflow threads on the topic, but none contain a solution that works. If you've successfully combined PDF files using Python, I'd love to hear how.

You were running into an issue in PyPDF2 that was fixed by PR #321. The fix shipped in PyPDF2==1.27.8 (released on 2022-04-21), so upgrading to that version or later should resolve it.
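If you want to find out which input files trigger the error before upgrading, a minimal sketch like the one below may help. It assumes the error comes from files whose tail lacks the `%%EOF` marker that a PDF is expected to end with. `has_eof_marker`, `split_by_eof_marker` and the 1024-byte window are my own choices for illustration, not part of PyPDF2:

```python
import os


def has_eof_marker(path, tail_bytes=1024):
    """Return True if b"%%EOF" appears in the last `tail_bytes` bytes of the file.

    Hypothetical helper for illustration; not a PyPDF2 function.
    """
    size = os.path.getsize(path)
    with open(path, "rb") as fp:
        # Jump to the tail of the file (or to the start if the file is small).
        fp.seek(max(0, size - tail_bytes))
        return b"%%EOF" in fp.read()


def split_by_eof_marker(paths):
    """Split paths into (files with a marker, files without one)."""
    good, bad = [], []
    for path in paths:
        (good if has_eof_marker(path) else bad).append(path)
    return good, bad
```

Files that end up in the second list are candidates for re-downloading or repairing. Whether PyPDF2 accepts the others still depends on the installed version.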

If anyone is still looking to merge a "list" of PDFs:

Note: use glob to build the file list. This will really save your day ^^

See the glob module reference in the Python documentation.

# On PyPDF2 >= 3.0 this class is named PdfMerger; PdfFileMerger is the older name.
from PyPDF2 import PdfFileMerger
import glob
import os


class MergeAllPDF:
    def __init__(self):
        self.mergelist = []

    def create(self, filepath, outpath, outfilename):
        self.outfilename = outfilename
        self.filepath = filepath
        self.outpath = outpath
        # glob returns matches in arbitrary order, so sort for a stable page order
        self.mergelist = sorted(glob.glob(self.filepath))
        self.merge()

    def merge(self):
        if self.mergelist:
            merger = PdfFileMerger()
            for pdf in self.mergelist:
                # pass the path itself so the merger manages the file handle
                # instead of leaving open() handles unclosed
                merger.append(pdf)
            merger.write(os.path.join(self.outpath, "%s.pdf" % self.outfilename))
            merger.close()
            self.mergelist = []
        else:
            print("mergelist is empty, please check your input path")


# example of how to use it
# update your paths here:
inpath = r"C:\Users\Fabian\Desktop\mergeallpdfs\scan\*.pdf"  # your single-page PDFs are stored here
outpath = r"C:\Users\Fabian\Desktop\mergeallpdfs\output"  # your merged PDF will be stored here

b = MergeAllPDF()
b.create(inpath, outpath, "mergedpdf")
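One thing to keep in mind with glob: the order of the returned paths is not guaranteed, and a plain alphabetical sort puts scan10.pdf before scan2.pdf. Here is a small sketch of a natural-order sort, assuming your scans carry page numbers in their file names. `natural_key` and `collect_pdfs` are hypothetical names of mine; only `glob.glob` and `re.split` come from the standard library:

```python
import glob
import os
import re


def natural_key(path):
    """Sort key that compares digit runs in the file name as integers."""
    name = os.path.basename(path).lower()
    # re.split with a capturing group keeps the digit runs in the result,
    # always at the odd positions of the list.
    parts = re.split(r"(\d+)", name)
    return [int(part) if index % 2 else part
            for index, part in enumerate(parts)]


def collect_pdfs(pattern):
    """Return the paths matching `pattern` in natural (human) order."""
    return sorted(glob.glob(pattern), key=natural_key)
```

You could then feed the result of `collect_pdfs(inpath)` to the merger instead of the raw glob output, if page order matters for your scans.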

