Issue horizontal flipping PDF with PyPDF2 only on some PDFs

Question

I need to write a program for quickly and easily flipping a PDF horizontally. It is for construction blueprints that sometimes need to be built as a mirror image of the architect's design, depending on how the house sits on the lot. I have yet to find reliable software that does this quickly, other than big GUI design packages like Adobe Acrobat (which my boss is unwilling to learn).

I have written a simple Python program that works on some PDFs but flips others incorrectly. It seems to flip PDFs made up only of pictures correctly, but when a PDF contains text, it flips both horizontally and vertically, so the result is mirrored and also upside down.

This is my first time using PyPDF2, but I've learned that the addTransformation method takes a transformation matrix given as a six-element sequence shaped like this:

[scale x, skew x, skew y, scale y, translate x, translate y]
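As a quick sanity check of what those six numbers do, here is a minimal sketch in plain Python (no PyPDF2 needed). It assumes the standard PDF convention that a matrix [a, b, c, d, e, f] maps a point (x, y) to (a*x + c*y + e, b*x + d*y + f). The helper name apply_matrix and the 612 x 792 page size are made up for illustration.

```python
def apply_matrix(matrix, point):
    """Apply a PDF-style matrix [a, b, c, d, e, f] to an (x, y) point."""
    a, b, c, d, e, f = matrix
    x, y = point
    return (a * x + c * y + e, b * x + d * y + f)


width, height = 612, 792  # hypothetical page size in points (US Letter)

# Horizontal flip: negate x, then shift right by the page width so the
# mirrored content lands back on the page.
flip_h = [-1, 0, 0, 1, width, 0]

print(apply_matrix(flip_h, (0, 0)))      # (612, 0): lower-left moves to lower-right
print(apply_matrix(flip_h, (width, 0)))  # (0, 0): lower-right moves to lower-left
print(apply_matrix(flip_h, (100, 200)))  # (512, 200): y is unchanged
```

Only x changes under this matrix, so if a page comes out upside down as well, something other than the matrix itself is probably involved.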

The overall code is bigger so let me know if you need to see the function that lets the user pick a file, but here is the function that flips the PDF that the problem seems to be in:

import PyPDF2
from tkinter import filedialog

def horizontalPDF(filename):
    # filename[0] is the path of the PDF to flip
    with open(filename[0], 'rb') as pdfFileObj:
        pdfReader = PyPDF2.PdfFileReader(pdfFileObj)
        pdfWriter = PyPDF2.PdfFileWriter()

        for pageNum in range(pdfReader.numPages):
            page = pdfReader.getPage(pageNum)
            box = page.mediaBox
            # Mirror in x, then shift right by the page width (box[2])
            page.addTransformation([-1, 0, 0, 1, box[2], 0])
            pdfWriter.addPage(page)

        f = filedialog.asksaveasfilename(title="Save As", defaultextension=".pdf", filetypes=(("pdf", "*.pdf"), ("all files", "*.*")))

        # Write while the input file is still open, since the writer reads page data from it
        with open(f, 'wb') as pdfOutput:
            pdfWriter.write(pdfOutput)

The main issue must be with the addTransformation method, but I can't figure out why it would transform some PDFs correctly and others incorrectly. The box[2] term translates the page by its full width, because the flip mirrors it across the origin and would otherwise leave it off the "canvas".

Interestingly, on the PDFs that it doesn't flip correctly, if I use the vertical flip matrix, [1,0,0,-1,0,box[3]], it actually flips the way I want (horizontally). On the PDFs that it does flip correctly, the same vertical matrix flips vertically, as expected.
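One possible explanation, which nothing in this thread confirms, is that the misbehaving PDFs carry a /Rotate entry of 90 or 270 on their pages. The transformation is applied to the page content before the viewer rotates the page, so on such pages the visible left-right axis corresponds to the content's y axis. The sketch below picks the matrix under that assumption, and also allows for a media box whose lower-left corner is not at (0, 0). The function name and sample values are hypothetical, and reading the rotation from the page (including a value inherited from a parent node) is left out.

```python
def visual_hflip_matrix(media_box, rotation=0):
    """Return a [a, b, c, d, e, f] matrix that mirrors a page left-to-right
    as it appears on screen.

    media_box: (llx, lly, urx, ury) in points.
    rotation:  the page's /Rotate value, assumed to be a multiple of 90.
    """
    llx, lly, urx, ury = (float(v) for v in media_box)
    if rotation % 180 == 0:
        # Unrotated (or upside-down) page: mirror the content's x axis.
        return [-1, 0, 0, 1, llx + urx, 0]
    # Page displayed turned by 90 or 270 degrees: the visible horizontal
    # axis is the content's y axis, so mirror y instead.
    return [1, 0, 0, -1, 0, lly + ury]


print(visual_hflip_matrix((0, 0, 612, 792)))               # mirrors x
print(visual_hflip_matrix((0, 0, 612, 792), rotation=90))  # mirrors y
```

If this is the cause, it would match the observation that the "vertical" matrix produces the desired horizontal flip on exactly the PDFs where the "horizontal" one misbehaves.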

Even more confusing, on some PDFs, I get an error:

PyPDF2.utils.PdfReadError: Multiple definitions in dictionary at byte 0x2a22cb for key /PageMode

but if I take the PDFs that I get that error for and print the PDF to PDF, the new PDF works just fine in my program.

If worst comes to worst, I may just convert the PDF to a JPG first, flip that (which I know is much easier), then convert back to PDF. But keeping the PDF a true vector file would be ideal, because the files end up much smaller than a large JPG.

Any insight is much appreciated!

Answer 1

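Based on this issue, https://github.com/mfenniak/pyPdf/issues/13, and after digging into generic.py, I tried this for the PdfReadError and it worked:
PyPDF2.PdfFileReader(pdfFileObj, strict=False)
With strict=False, the duplicate /PageMode key is reported as a warning instead of raising PdfReadError.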
