How to get pptx slide notes text using Apache POI?
So far I only have working code for retrieving text from the slide notes of a ppt file:
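// Legacy HSLF API from older POI 3.x releases (.ppt files only):
// org.apache.poi.hslf.usermodel.SlideShow, org.apache.poi.hslf.model.{Slide, Notes, TextRun}
try (FileInputStream is = new FileInputStream("C:\\sample\\test.ppt")) {
    SlideShow ppt = new SlideShow(is);
    Slide[] slide = ppt.getSlides();
    for (int i = 0; i < slide.length; i++) {
        System.out.println(i);
        // getNotesSheet() returns null for a slide without notes
        Notes notes = slide[i].getNotesSheet();
        TextRun[] runs = (notes == null) ? new TextRun[0] : notes.getTextRuns();
        if (runs.length < 1) {
            System.out.println("null");
        } else {
            for (TextRun run : runs) {
                System.out.println(" > " + run.getText());
            }
        }
    }
} catch (IOException ioe) {
    // Report I/O failures instead of silently ignoring them
    ioe.printStackTrace();
}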
But how do I retrieve text from the slide notes of a pptx file?
After a lot of trial and error, I found a solution.
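// XSLF API for .pptx files. In the POI release used here getSlides() returns an array;
// newer releases return List<XSLFSlide> instead.
try (FileInputStream fis = new FileInputStream("C:\\sample\\sample.pptx")) {
    XMLSlideShow pptxshow = new XMLSlideShow(fis);
    XSLFSlide[] slide2 = pptxshow.getSlides();
    for (int i = 0; i < slide2.length; i++) {
        System.out.println(i);
        XSLFNotes mynotes = slide2[i].getNotes();
        if (mynotes == null) {
            continue; // this slide has no notes
        }
        for (XSLFShape shape : mynotes) {
            if (shape instanceof XSLFTextShape) {
                XSLFTextShape txShape = (XSLFTextShape) shape;
                // Print every paragraph of every text shape on the notes page
                for (XSLFTextParagraph xslfParagraph : txShape.getTextParagraphs()) {
                    System.out.println(xslfParagraph.getText());
                }
            }
        }
    }
} catch (IOException e) {
    // Report I/O failures instead of silently ignoring them
    e.printStackTrace();
}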
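An update to the accepted answer. It works well, but if other parts of the notes master are enabled, such as the header or the page number, you will get extra notes paragraphs that you may not have expected. The following code restricts the output to the actual notes: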
try (FileInputStream fis = new FileInputStream("C:\\sample\\sample.pptx")) {
    XMLSlideShow pptxshow = new XMLSlideShow(fis);
    XSLFSlide[] slide2 = pptxshow.getSlides();
    for (int i = 0; i < slide2.length; i++) {
        System.out.println(i);
        XSLFNotes mynotes = slide2[i].getNotes();
        if (mynotes == null) {
            continue; // this slide has no notes
        }
        for (XSLFShape shape : mynotes) {
            if (shape instanceof XSLFTextShape) {
                XSLFTextShape txShape = (XSLFTextShape) shape;
                // Look for the actual notes only: skip header, page number, etc.
                // This relies on the default (English) shape name.
                if (!txShape.getShapeName().contains("Notes Placeholder")) {
                    continue;
                }
                for (XSLFTextParagraph xslfParagraph : txShape.getTextParagraphs()) {
                    System.out.println(xslfParagraph.getText());
                }
            }
        }
    }
} catch (IOException e) {
    // Report I/O failures instead of silently ignoring them
    e.printStackTrace();
}
Here is a better solution.
// Placeholder is org.apache.poi.sl.usermodel.Placeholder in current POI releases
try (FileInputStream fis = new FileInputStream("C:\\sample\\sample.pptx");
     XMLSlideShow ppt = new XMLSlideShow(fis)) {
    List<XSLFSlide> slides = ppt.getSlides();
    for (XSLFSlide slide : slides) {
        XSLFNotes mynotes = slide.getNotes();
        if (mynotes == null) {
            continue; // this slide has no notes
        }
        for (XSLFShape shape : mynotes) {
            // Only the body placeholder holds the notes text itself
            if (shape instanceof XSLFTextShape && Placeholder.BODY == ((XSLFTextShape) shape).getTextType()) {
                XSLFTextShape txShape = (XSLFTextShape) shape;
                System.out.println(txShape.getText());
                break;
            }
        }
    }
} catch (IOException e) {
    e.printStackTrace();
}
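Unlike the other answers, this code checks Placeholder.BODY == ((XSLFTextShape) shape).getTextType(), so you get only the notes text and not the header, footer, or page number placeholders.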
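To see what the Placeholder.BODY check selects, it can help to look at the underlying file. A .pptx is a zip package, and each notes page is normally stored as a part named ppt/notesSlides/notesSlideN.xml in which the notes text sits in the shape whose placeholder is declared as <p:ph type="body">. The sketch below is not how POI does it and is not a replacement for the answers above. It uses only the JDK, and the class name NotesBodyXml and the helpers bodyNotes and bodyText are made up for illustration. It assumes Java 9+ (for readAllBytes) and the conventional part naming.

```java
import java.io.ByteArrayInputStream;
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper class, for illustration only (not part of Apache POI).
class NotesBodyXml {
    static final String P_NS = "http://schemas.openxmlformats.org/presentationml/2006/main";
    static final String A_NS = "http://schemas.openxmlformats.org/drawingml/2006/main";

    /** Returns the body-placeholder text of every notes slide part, in zip entry order. */
    static List<String> bodyNotes(InputStream pptx) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        List<String> result = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(pptx)) {
            for (ZipEntry e; (e = zip.getNextEntry()) != null; ) {
                // Assumption: notes parts follow the conventional naming scheme.
                if (!e.getName().matches("ppt/notesSlides/notesSlide\\d+\\.xml")) {
                    continue;
                }
                // Copy the entry first so the XML parser cannot close the zip stream.
                byte[] xml = zip.readAllBytes();
                Document doc = dbf.newDocumentBuilder().parse(new ByteArrayInputStream(xml));
                result.add(bodyText(doc));
            }
        }
        return result;
    }

    /** Joins the text runs of the first shape marked <p:ph type="body">, one line per paragraph. */
    static String bodyText(Document doc) {
        StringBuilder sb = new StringBuilder();
        NodeList shapes = doc.getElementsByTagNameNS(P_NS, "sp");
        for (int i = 0; i < shapes.getLength(); i++) {
            Element sp = (Element) shapes.item(i);
            NodeList ph = sp.getElementsByTagNameNS(P_NS, "ph");
            // Skip shapes that are not the body placeholder (header, footer, slide number, ...).
            if (ph.getLength() == 0 || !"body".equals(((Element) ph.item(0)).getAttribute("type"))) {
                continue;
            }
            NodeList paragraphs = sp.getElementsByTagNameNS(A_NS, "p");
            for (int j = 0; j < paragraphs.getLength(); j++) {
                if (j > 0) {
                    sb.append('\n');
                }
                NodeList runs = ((Element) paragraphs.item(j)).getElementsByTagNameNS(A_NS, "t");
                for (int k = 0; k < runs.getLength(); k++) {
                    sb.append(runs.item(k).getTextContent());
                }
            }
            break; // like the answer above, stop at the first body placeholder
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java NotesBodyXml <file.pptx>");
            return;
        }
        try (InputStream in = new FileInputStream(args[0])) {
            bodyNotes(in).forEach(System.out::println);
        }
    }
}
```

Note that the results come back in zip entry order, which does not have to match slide order. POI resolves the slide-to-notes relationship for you, so for real use the XSLFSlide.getNotes() approach above remains the simpler choice.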