How do I get pptx slide notes text using Apache POI?

Question

So far I only have working code for retrieving the text from the notes of ppt slides:

try {
    FileInputStream is = new FileInputStream("C:\\sample\\test.ppt");
    SlideShow ppt = new SlideShow(is);

    Slide[] slide = ppt.getSlides();
    for (int i = 0; i < slide.length; i++) {

        System.out.println(i);
        TextRun[] runs = slide[i].getNotesSheet().getTextRuns();
        if (runs.length < 1) {
            System.out.println("null");
        } else {
            for (TextRun run : runs) {
                System.out.println(" > " + run.getText());
            }
        }
    }

} catch (IOException ioe) {
    // Report the failure instead of silently swallowing it.
    ioe.printStackTrace();
}

But how do I retrieve the text from the notes of pptx slides?

Answer 1

After a lot of trial and error, I found a solution.

try {

    FileInputStream fis = new FileInputStream("C:\\sample\\sample.pptx");
    XMLSlideShow pptxshow = new XMLSlideShow(fis);

    XSLFSlide[] slide2 = pptxshow.getSlides();
    for (int i = 0; i < slide2.length; i++) {
        System.out.println(i);
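        try {
            XSLFNotes mynotes = slide2[i].getNotes();
            if (mynotes == null) {
                // getNotes() returns null when the slide has no notes page.
                continue;
            }
            for (XSLFShape shape : mynotes) {
                if (shape instanceof XSLFTextShape) {
                    XSLFTextShape txShape = (XSLFTextShape) shape;
                    for (XSLFTextParagraph xslfParagraph : txShape.getTextParagraphs()) {
                        System.out.println(xslfParagraph.getText());
                    }
                }
            }
        } catch (Exception e) {
            // Report the problem instead of silently swallowing it.
            e.printStackTrace();
        }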

    }
} catch (IOException e) {
    e.printStackTrace();
}

Answer 2

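An update to the accepted answer. It works well, but if other sections of the notes master are enabled, such as the header or the page number, you will get extra note paragraphs that you may not have expected. The following code restricts the output to the actual notes: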

try {

    FileInputStream fis = new FileInputStream("C:\\sample\\sample.pptx");
    XMLSlideShow pptxshow = new XMLSlideShow(fis);

    XSLFSlide[] slide2 = pptxshow.getSlides();
    for (int i = 0; i < slide2.length; i++) {
        System.out.println(i);
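        try {
            XSLFNotes mynotes = slide2[i].getNotes();
            if (mynotes == null) {
                // getNotes() returns null when the slide has no notes page.
                continue;
            }
            for (XSLFShape shape : mynotes) {
                if (shape instanceof XSLFTextShape) {
                    XSLFTextShape txShape = (XSLFTextShape) shape;

                    // Look for the actual notes only. This matches on the shape name,
                    // which may differ in decks authored with a non-English PowerPoint.
                    if (!txShape.getShapeName().contains("Notes Placeholder")) {
                        continue;
                    }

                    for (XSLFTextParagraph xslfParagraph : txShape.getTextParagraphs()) {
                        System.out.println(xslfParagraph.getText());
                    }
                }
            }
        } catch (Exception e) {
            // Report the problem instead of silently swallowing it.
            e.printStackTrace();
        }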

    }
} catch (IOException e) {
    e.printStackTrace();
}

Answer 3

Here is a better solution.

try (FileInputStream fis = new FileInputStream("C:\\sample\\sample.pptx")) {
    XMLSlideShow ppt = new XMLSlideShow(fis);
    List<XSLFSlide> slides = ppt.getSlides();
    for (XSLFSlide slide : slides) {
        try {
            XSLFNotes mynotes = slide.getNotes();
            if (mynotes == null) {
                // getNotes() returns null when the slide has no notes page.
                continue;
            }
            for (XSLFShape shape : mynotes) {
                if (shape instanceof XSLFTextShape && Placeholder.BODY == ((XSLFTextShape) shape).getTextType()) {
                    XSLFTextShape txShape = (XSLFTextShape) shape;
                    System.out.println(txShape.getText());
                    break;
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
} catch (IOException e) {
    e.printStackTrace();
}

Unlike the other answers, this code checks Placeholder.BODY == ((XSLFTextShape) shape).getTextType(), so you get only the notes text.
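To make it clearer why the placeholder check is more robust than matching on the shape name, here is a rough sketch of what the check appears to correspond to in the underlying file. As far as I understand, each shape on a notes page carries a `<p:ph type="...">` element, and the notes text lives in the shape whose type is `body`, while the slide image, header, footer and slide number use other types. The sketch below uses only the JDK's DOM parser, not Apache POI. The class name `NotesPlaceholderSketch`, the method `bodyText` and the `SAMPLE` fragment are all made up for illustration, and the XML is a hand-written, heavily simplified stand-in for a real `ppt/notesSlides/notesSlide1.xml`.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical illustration class; not part of Apache POI.
public class NotesPlaceholderSketch {
    public static final String P_NS = "http://schemas.openxmlformats.org/presentationml/2006/main";
    public static final String A_NS = "http://schemas.openxmlformats.org/drawingml/2006/main";

    // Hand-written, simplified notes page: slide image, notes body, slide number.
    public static final String SAMPLE =
        "<p:notes xmlns:a='" + A_NS + "' xmlns:p='" + P_NS + "'>"
      + "<p:cSld><p:spTree>"
      + "<p:sp><p:nvSpPr><p:cNvPr id='2' name='Slide Image Placeholder 1'/>"
      + "<p:nvPr><p:ph type='sldImg'/></p:nvPr></p:nvSpPr></p:sp>"
      + "<p:sp><p:nvSpPr><p:cNvPr id='3' name='Notes Placeholder 2'/>"
      + "<p:nvPr><p:ph type='body' idx='1'/></p:nvPr></p:nvSpPr>"
      + "<p:txBody><a:p><a:r><a:t>Remember to greet the audience.</a:t></a:r></a:p></p:txBody></p:sp>"
      + "<p:sp><p:nvSpPr><p:cNvPr id='4' name='Slide Number Placeholder 3'/>"
      + "<p:nvPr><p:ph type='sldNum' idx='10'/></p:nvPr></p:nvSpPr>"
      + "<p:txBody><a:p><a:fld type='slidenum'><a:t>1</a:t></a:fld></a:p></p:txBody></p:sp>"
      + "</p:spTree></p:cSld></p:notes>";

    /** Returns the text of the first shape whose placeholder type is "body", or "" if there is none. */
    public static String bodyText(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            NodeList shapes = doc.getElementsByTagNameNS(P_NS, "sp");
            for (int i = 0; i < shapes.getLength(); i++) {
                Element shape = (Element) shapes.item(i);
                NodeList ph = shape.getElementsByTagNameNS(P_NS, "ph");
                // Skip shapes that are not the body placeholder, whatever they are named.
                if (ph.getLength() == 0
                        || !"body".equals(((Element) ph.item(0)).getAttribute("type"))) {
                    continue;
                }
                // Concatenate the text runs of this shape (paragraph breaks are ignored here).
                StringBuilder text = new StringBuilder();
                NodeList runs = shape.getElementsByTagNameNS(A_NS, "t");
                for (int j = 0; j < runs.getLength(); j++) {
                    text.append(runs.item(j).getTextContent());
                }
                return text.toString();
            }
            return "";
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse notes XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(bodyText(SAMPLE));
    }
}
```

The filter never looks at the `name` attribute, so it should not matter what the authoring tool called the shape. In real code you would still go through POI's `getTextType()` as in the answer above; this is only meant to show what that check is keyed on.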

3 answers

Answer 1: 7 votes, accepted, 2014-07-21 20:44:46

Answer 2: 0 votes, 2022-08-13 12:37:42

Answer 3: 0 votes, 2023-01-28 02:17:59
