Split a multi-page PDF into individual PDFs based on a string, and save each file using that string (Python)

I have a single PDF that contains multiple invoices. It is organized so that the page where an invoice starts carries the invoice number, and that invoice's details may continue onto a second or third page. I want to split the PDF into individual PDF files based on the invoice number. For example, with a total of 10 pages:

page 1: invoice 1, continued on page 2
page 3: invoice 2, continued on page 4
page 5: invoice 3, continued on page 6
page 7: invoice 4, continued on page 8
page 9: invoice 5, continued on page 10

If a page contains the word "invoice", I want to start a new file at that page and include every page after it, up to but not including the next page that contains "invoice". The output I am looking for is:

invoice 1.pdf (2 pages, pages 1 to 2)
invoice 2.pdf (2 pages, pages 3 to 4)
invoice 3.pdf (2 pages, pages 5 to 6)
invoice 4.pdf (2 pages, pages 7 to 8)
invoice 5.pdf (2 pages, pages 9 to 10)

I found the following code online for splitting a PDF into individual single-page files. Can anyone help extend it to include the split logic described above?

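```python
from PyPDF2 import PdfFileWriter, PdfFileReader

# Note: PdfFileReader/PdfFileWriter are the legacy PyPDF2 (< 3.0) names.
with open("invoices.pdf", "rb") as inputStream:
    inputpdf = PdfFileReader(inputStream)

    # Write every page of the input to its own one-page PDF.
    for i in range(inputpdf.numPages):
        output = PdfFileWriter()
        output.addPage(inputpdf.getPage(i))
        with open("document-page%s.pdf" % i, "wb") as outputStream:
            output.write(outputStream)
```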

I had to get an application to do this. It's called PDF Content Split SA.
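
For a scripted route, the split logic from the question could be sketched roughly as below. This is only a sketch under several assumptions: the PDF has an extractable text layer (not scanned images), the invoice number is a run of digits following the word "invoice", and the `pypdf` package is installed (it is the successor to PyPDF2 and uses the newer `PdfReader`/`PdfWriter` names). The regular expression `INVOICE_RE` and the helper names `group_pages` and `split_pdf` are hypothetical and would need adapting to the real invoice layout.

```python
import re
from pathlib import Path

# Assumption: the invoice number appears as digits after the word "invoice",
# e.g. "Invoice 12", "Invoice No: 12" or "Invoice #12". Adjust to the real layout.
INVOICE_RE = re.compile(
    r"invoice\s*(?:no\.?|number|#)?\s*[:#]?\s*(\d+)", re.IGNORECASE
)


def group_pages(page_texts):
    """Map each invoice number to the 0-based indices of its pages."""
    groups = {}
    current = None
    for index, text in enumerate(page_texts):
        match = INVOICE_RE.search(text or "")
        if match:
            # A page that names an invoice number starts (or continues) that invoice.
            current = match.group(1)
        if current is None:
            # Pages before the first invoice marker are collected separately.
            current = "unmatched"
        groups.setdefault(current, []).append(index)
    return groups


def split_pdf(input_path, output_dir="."):
    """Write one PDF per invoice found in input_path, named 'invoice N.pdf'."""
    # Imported here so group_pages() stays usable without the dependency.
    from pypdf import PdfReader, PdfWriter

    reader = PdfReader(input_path)
    texts = [page.extract_text() for page in reader.pages]
    for number, indices in group_pages(texts).items():
        writer = PdfWriter()
        for index in indices:
            writer.add_page(reader.pages[index])
        with open(Path(output_dir) / f"invoice {number}.pdf", "wb") as stream:
            writer.write(stream)
```

Calling `split_pdf("invoices.pdf")` on the 10-page example would then be expected to produce `invoice 1.pdf` through `invoice 5.pdf`, provided the text extraction finds the invoice number on each starting page. A continuation page that repeats the same invoice number stays in the same file, because pages are grouped under the matched number.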
