How to get pptx slide notes text using Apache POI?

So far I only have working code for retrieving text from the notes of ppt slides:

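import java.io.FileInputStream;
import java.io.IOException;
import org.apache.poi.hslf.model.Notes;
import org.apache.poi.hslf.model.Slide;
import org.apache.poi.hslf.model.TextRun;
import org.apache.poi.hslf.usermodel.SlideShow;

// try-with-resources closes the input stream even if reading fails
try (FileInputStream is = new FileInputStream("C:\\sample\\test.ppt")) {
    SlideShow ppt = new SlideShow(is);

    Slide[] slide = ppt.getSlides();
    for (int i = 0; i < slide.length; i++) {
        System.out.println(i);
        // getNotesSheet() returns null when the slide has no notes
        Notes notes = slide[i].getNotesSheet();
        TextRun[] runs = (notes == null) ? new TextRun[0] : notes.getTextRuns();
        if (runs.length < 1) {
            System.out.println("null");
        } else {
            for (TextRun run : runs) {
                System.out.println(" > " + run.getText());
            }
        }
    }
} catch (IOException ioe) {
    // report the failure instead of silently swallowing it
    ioe.printStackTrace();
}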

But how do I retrieve text from the notes of pptx slides?

After a lot of trial and error, I found a solution.

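import java.io.FileInputStream;
import java.io.IOException;
import org.apache.poi.xslf.usermodel.XMLSlideShow;
import org.apache.poi.xslf.usermodel.XSLFNotes;
import org.apache.poi.xslf.usermodel.XSLFShape;
import org.apache.poi.xslf.usermodel.XSLFSlide;
import org.apache.poi.xslf.usermodel.XSLFTextParagraph;
import org.apache.poi.xslf.usermodel.XSLFTextShape;

try (FileInputStream fis = new FileInputStream("C:\\sample\\sample.pptx")) {
    XMLSlideShow pptxshow = new XMLSlideShow(fis);

    // Older POI releases return an array here; newer ones return List<XSLFSlide>
    XSLFSlide[] slide2 = pptxshow.getSlides();
    for (int i = 0; i < slide2.length; i++) {
        System.out.println(i);
        XSLFNotes mynotes = slide2[i].getNotes();
        if (mynotes == null) {
            continue; // this slide has no notes
        }
        // Print every paragraph of every text shape on the notes page
        for (XSLFShape shape : mynotes) {
            if (shape instanceof XSLFTextShape) {
                XSLFTextShape txShape = (XSLFTextShape) shape;
                for (XSLFTextParagraph xslfParagraph : txShape.getTextParagraphs()) {
                    System.out.println(xslfParagraph.getText());
                }
            }
        }
    }
} catch (IOException e) {
    e.printStackTrace();
}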

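An update to the accepted answer. It works, but if other parts of the notes master are enabled, such as the header or the page number, you will get extra notes paragraphs that you may not have expected. You can restrict the output to the actual notes with the following code: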

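import java.io.FileInputStream;
import java.io.IOException;
import org.apache.poi.xslf.usermodel.XMLSlideShow;
import org.apache.poi.xslf.usermodel.XSLFNotes;
import org.apache.poi.xslf.usermodel.XSLFShape;
import org.apache.poi.xslf.usermodel.XSLFSlide;
import org.apache.poi.xslf.usermodel.XSLFTextParagraph;
import org.apache.poi.xslf.usermodel.XSLFTextShape;

try (FileInputStream fis = new FileInputStream("C:\\sample\\sample.pptx")) {
    XMLSlideShow pptxshow = new XMLSlideShow(fis);

    // Older POI releases return an array here; newer ones return List<XSLFSlide>
    XSLFSlide[] slide2 = pptxshow.getSlides();
    for (int i = 0; i < slide2.length; i++) {
        System.out.println(i);
        XSLFNotes mynotes = slide2[i].getNotes();
        if (mynotes == null) {
            continue; // this slide has no notes
        }
        for (XSLFShape shape : mynotes) {
            if (shape instanceof XSLFTextShape) {
                XSLFTextShape txShape = (XSLFTextShape) shape;

                // Look for the actual notes only; this relies on the shape's
                // name, so a renamed or localized placeholder will be skipped
                if (!txShape.getShapeName().contains("Notes Placeholder")) {
                    continue;
                }

                for (XSLFTextParagraph xslfParagraph : txShape.getTextParagraphs()) {
                    System.out.println(xslfParagraph.getText());
                }
            }
        }
    }
} catch (IOException e) {
    e.printStackTrace();
}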

Here is a better solution.

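import java.io.FileInputStream;
import java.io.IOException;
import java.util.List;
// Placeholder lives in org.apache.poi.sl.usermodel in current POI releases
import org.apache.poi.sl.usermodel.Placeholder;
import org.apache.poi.xslf.usermodel.XMLSlideShow;
import org.apache.poi.xslf.usermodel.XSLFNotes;
import org.apache.poi.xslf.usermodel.XSLFShape;
import org.apache.poi.xslf.usermodel.XSLFSlide;
import org.apache.poi.xslf.usermodel.XSLFTextShape;

try (FileInputStream fis = new FileInputStream("C:\\sample\\sample.pptx")) {
    XMLSlideShow ppt = new XMLSlideShow(fis);
    List<XSLFSlide> slides = ppt.getSlides();
    for (XSLFSlide slide : slides) {
        XSLFNotes mynotes = slide.getNotes();
        if (mynotes == null) {
            continue; // this slide has no notes
        }
        for (XSLFShape shape : mynotes) {
            // Only the body placeholder holds the notes text
            if (shape instanceof XSLFTextShape && Placeholder.BODY == ((XSLFTextShape) shape).getTextType()) {
                XSLFTextShape txShape = (XSLFTextShape) shape;
                System.out.println(txShape.getText());
                break; // stop after the first body placeholder
            }
        }
    }
} catch (IOException e) {
    e.printStackTrace();
}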

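Unlike the other answers, this code checks Placeholder.BODY == ((XSLFTextShape) shape).getTextType(), so you get only the notes text and not the header, page number, or other placeholders.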
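To see what that check corresponds to in the file itself, here is a minimal sketch that uses only the JDK and no POI at all. It is not how POI is implemented; it just applies the same idea to the raw XML. It assumes you have already read a notes part out of the .pptx zip (PowerPoint conventionally stores them as ppt/notesSlides/notesSlideN.xml) and that the notes text shape is the one carrying <p:ph type="body">. The class name, the method name and the SAMPLE fragment are made up for illustration, and the fragment is trimmed down to the elements that matter.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class NotesBodyXmlSketch {
    static final String P_NS = "http://schemas.openxmlformats.org/presentationml/2006/main";
    static final String A_NS = "http://schemas.openxmlformats.org/drawingml/2006/main";

    // Hand-written, trimmed notes-slide fragment: one body placeholder, one page number
    public static final String SAMPLE =
        "<p:notes xmlns:p=\"" + P_NS + "\" xmlns:a=\"" + A_NS + "\"><p:cSld><p:spTree>"
        + "<p:sp><p:nvSpPr><p:cNvPr id=\"2\" name=\"Notes Placeholder 1\"/><p:cNvSpPr/>"
        + "<p:nvPr><p:ph type=\"body\" idx=\"1\"/></p:nvPr></p:nvSpPr>"
        + "<p:txBody><a:p><a:r><a:t>Speaker </a:t></a:r><a:r><a:t>notes</a:t></a:r></a:p></p:txBody></p:sp>"
        + "<p:sp><p:nvSpPr><p:cNvPr id=\"3\" name=\"Slide Number Placeholder 2\"/><p:cNvSpPr/>"
        + "<p:nvPr><p:ph type=\"sldNum\" idx=\"10\"/></p:nvPr></p:nvSpPr>"
        + "<p:txBody><a:p><a:fld id=\"x\" type=\"slidenum\"><a:t>1</a:t></a:fld></a:p></p:txBody></p:sp>"
        + "</p:spTree></p:cSld></p:notes>";

    /** Returns one string per paragraph of every shape marked as a body placeholder. */
    public static List<String> bodyParagraphs(String notesSlideXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for the *NS lookups below
            Document doc = factory.newDocumentBuilder().parse(
                new ByteArrayInputStream(notesSlideXml.getBytes(StandardCharsets.UTF_8)));

            List<String> paragraphs = new ArrayList<>();
            NodeList shapes = doc.getElementsByTagNameNS(P_NS, "sp");
            for (int i = 0; i < shapes.getLength(); i++) {
                Element shape = (Element) shapes.item(i);
                NodeList ph = shape.getElementsByTagNameNS(P_NS, "ph");
                boolean isBody = ph.getLength() > 0
                    && "body".equals(((Element) ph.item(0)).getAttribute("type"));
                if (!isBody) {
                    continue; // header, footer, page number, slide image, ...
                }
                NodeList paras = shape.getElementsByTagNameNS(A_NS, "p");
                for (int j = 0; j < paras.getLength(); j++) {
                    // A paragraph may be split into several runs; join their text
                    StringBuilder text = new StringBuilder();
                    NodeList runs = ((Element) paras.item(j)).getElementsByTagNameNS(A_NS, "t");
                    for (int k = 0; k < runs.getLength(); k++) {
                        text.append(runs.item(k).getTextContent());
                    }
                    paragraphs.add(text.toString());
                }
            }
            return paragraphs;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse notes XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(bodyParagraphs(SAMPLE)); // prints [Speaker notes]
    }
}
```

The page number shape is skipped because its placeholder type is sldNum rather than body, which is the same distinction the Placeholder.BODY comparison makes. For real files the POI version above is still the easier route, since it also handles locating the notes part for each slide.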

