PDFBox: split a PDF into multiple files with different page ranges and file names

I have a question about Apache PDFBox. Is it possible to split a PDF file into several files with different page ranges and file names?

Example:

  • pages 1 - 5, file name: part1.pdf
  • page 6, file name: part2.pdf
  • pages 7 - 10, file name: part3.pdf
  • ...

This might be too late, but here's a solution for future readers.

Using PDFBox 2.0+:

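import java.util.List;
import org.apache.pdfbox.multipdf.Splitter;
import org.apache.pdfbox.pdmodel.PDDocument;

// Assumes `document` is a static PDDocument field that was loaded elsewhere.
private static void splitPdf(int startIndex, int endIndex) {
    // splitPdf(1, 20) gives 20 pages per output file
    int pagesPerFile = endIndex - startIndex + 1;
    int totalPages = document.getNumberOfPages();
    if (totalPages > pagesPerFile) {
        String title = document.getDocumentInformation().getTitle();
        System.out.println(title);
        try {
            Splitter splitter = new Splitter();
            splitter.setSplitAtPage(pagesPerFile);
            List<PDDocument> splitList = splitter.split(document);
            // start and end are only used to build the file names
            int start = 1;
            int end = pagesPerFile;
            for (PDDocument doc : splitList) {
                doc.save("/home/Downloads/pdfs/" + title + "_" + start + "_" + end + ".pdf");
                doc.close();
                start = end + 1;
                // The last file may hold fewer pages than the others
                end = Math.min(end + pagesPerFile, totalPages);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}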

I call this method as:

splitPdf(1, 20);

Explanation:

Here, I am splitting the PDF into files of 20 pages each. Feel free to change the number as needed.

Here is the documentation for .setSplitAtPage():

https://pdfbox.apache.org/docs/2.0.3/javadocs/org/apache/pdfbox/multipdf/Splitter.html

From the docs: This will tell the splitting algorithm where to split the pages. The default is 1, so every page will become a new document. If it was two then each document would contain 2 pages. If the source document had 5 pages it would split into 3 new documents, 2 documents containing 2 pages and 1 document containing one page.

The start and end variables are used only for the file names.
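
To make the arithmetic from the quoted documentation concrete, here is a minimal sketch in plain Java (no PDFBox calls) that lists the page range each output file would cover for a given split length, together with a file name in the pattern used above. The class name ChunkRanges, the helper names and the base name "report" are hypothetical and only serve the illustration.

```java
import java.util.ArrayList;
import java.util.List;

class ChunkRanges {

    /** Returns the 1-based, inclusive {first, last} page range of each chunk. */
    public static List<int[]> chunkRanges(int totalPages, int pagesPerFile) {
        if (totalPages < 0 || pagesPerFile < 1) {
            throw new IllegalArgumentException("totalPages must be >= 0 and pagesPerFile >= 1");
        }
        List<int[]> ranges = new ArrayList<>();
        for (int first = 1; first <= totalPages; first += pagesPerFile) {
            // The last chunk is cut short when the page count is not a multiple
            int last = Math.min(first + pagesPerFile - 1, totalPages);
            ranges.add(new int[] {first, last});
        }
        return ranges;
    }

    /** Builds a name such as "report_3_4.pdf" (hypothetical naming pattern). */
    public static String fileName(String baseName, int[] range) {
        return baseName + "_" + range[0] + "_" + range[1] + ".pdf";
    }

    public static void main(String[] args) {
        // The 5-page, split-at-2 case from the documentation quote
        for (int[] range : chunkRanges(5, 2)) {
            System.out.println(fileName("report", range));
        }
    }
}
```

For the 5-page example this prints report_1_2.pdf, report_3_4.pdf and report_5_5.pdf, which matches the three documents described in the quote.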

Hope this helps!

This is my answer. It works well for me.

// PATH_SAVE_FILE is a String constant (the output directory) defined elsewhere.
private static void splitPdf(PDDocument document, String fileName, int fromPage, int toPage) {
    // Documents with 20 pages or fewer are left untouched
    if (document.getNumberOfPages() > 20) {
        System.out.println(document.getDocumentInformation().getTitle());
        try {
            Splitter splitter = new Splitter();
            splitter.setStartPage(fromPage);   // first page to extract (1-based)
            splitter.setEndPage(toPage);       // last page to extract (inclusive)
            splitter.setSplitAtPage(toPage);   // large enough to keep the range in one document
            List<PDDocument> splittedList = splitter.split(document);
            for (PDDocument doc : splittedList) {
                doc.save(PATH_SAVE_FILE + fileName + ".pdf");
                doc.close();
            }
            System.out.println("Save successful file : " + fileName);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

The main statement you may be missing is:

splitter.setEndPage(toPage);

Hope this helps:

int fromPage = 1;
int toPage = 5;

File pdfFile = new File("<filePath-to-main-pdf>");
PDDocument pdfDocument = PDDocument.load(pdfFile);

Splitter splitter = new Splitter();

// Restrict the split to the requested range (page numbers are 1-based)
splitter.setStartPage(fromPage);
splitter.setEndPage(toPage);
// One output document that holds the whole range
splitter.setSplitAtPage(toPage - fromPage + 1);

List<PDDocument> lst = splitter.split(pdfDocument);

PDDocument pdfDocPartial = lst.get(0);
File f = new File("<filePath-WithName>");
pdfDocPartial.save(f);
pdfDocPartial.close();
pdfDocument.close();
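
For the ranges in the question (pages 1 - 5, 6 and 7 - 10), the snippet above would presumably be run once per range. The sketch below, in plain Java without PDFBox calls, only plans those runs: it checks each range against the page count and derives the three values passed to the Splitter. The SplitPlan and Part names are hypothetical, and the 10-page total is an assumption taken from the example in the question.

```java
import java.util.ArrayList;
import java.util.List;

class SplitPlan {

    /** One output file: a 1-based, inclusive page range and its target name. */
    static final class Part {
        final int fromPage;
        final int toPage;
        final String fileName;

        Part(int fromPage, int toPage, String fileName) {
            this.fromPage = fromPage;
            this.toPage = toPage;
            this.fileName = fileName;
        }

        /** Number of pages in the range: toPage - fromPage + 1. */
        int length() {
            return toPage - fromPage + 1;
        }
    }

    /** Rejects ranges that are reversed or fall outside the document. */
    public static void validate(List<Part> parts, int totalPages) {
        for (Part p : parts) {
            if (p.fromPage < 1 || p.toPage < p.fromPage || p.toPage > totalPages) {
                throw new IllegalArgumentException(
                        "Invalid range " + p.fromPage + "-" + p.toPage + " for " + p.fileName);
            }
        }
    }

    /** The three ranges listed in the question. */
    public static List<Part> exampleParts() {
        List<Part> parts = new ArrayList<>();
        parts.add(new Part(1, 5, "part1.pdf"));
        parts.add(new Part(6, 6, "part2.pdf"));
        parts.add(new Part(7, 10, "part3.pdf"));
        return parts;
    }

    public static void main(String[] args) {
        List<Part> parts = exampleParts();
        validate(parts, 10); // assumed page count of the source PDF
        for (Part p : parts) {
            // With PDFBox, each iteration would use a fresh Splitter:
            // setStartPage(p.fromPage), setEndPage(p.toPage),
            // setSplitAtPage(p.length()), then save the first document
            // returned by split(...) under p.fileName.
            System.out.println(p.fileName + ": pages " + p.fromPage + "-" + p.toPage
                    + " (" + p.length() + " pages)");
        }
    }
}
```

Validating before splitting means a mistyped range fails up front instead of after some of the files have already been written.
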

Another variant splits the file into parts of 50 pages each:

int numberOfPagesFileShouldHave = 50;
// inputFile is a File defined elsewhere
PDDocument document = PDDocument.load(inputFile);
int totalPages = document.getNumberOfPages();
if (totalPages > numberOfPagesFileShouldHave) {
    try {
        // Round up so that a final, shorter part is written as well
        int divideIntoFiles = (totalPages + numberOfPagesFileShouldHave - 1) / numberOfPagesFileShouldHave;
        System.out.println("Divide Into Files = " + divideIntoFiles);
        int startIndex = 1;
        for (int i = 1; i <= divideIntoFiles; i++) {
            // Never ask for pages beyond the end of the document
            int endIndex = Math.min(startIndex + numberOfPagesFileShouldHave - 1, totalPages);
            Splitter splitter = new Splitter();
            splitter.setStartPage(startIndex);
            splitter.setEndPage(endIndex);
            splitter.setSplitAtPage(endIndex);
            List<PDDocument> splittedList = splitter.split(document);
            for (PDDocument doc : splittedList) {
                doc.save("C:\\Work2022\\_darksiderg" + " " + startIndex + "-" + endIndex + ".pdf");
                doc.close();
            }
            // Advance once per part, not once per saved document
            startIndex = endIndex + 1;
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}
