I know how to extract text from a .ppt file using Apache POI, like this:
// Open the .ppt file and wrap it in a SlideShow
InputStream fis = new FileInputStream("abcd.ppt");
HSLFSlideShow show = new HSLFSlideShow(fis);
SlideShow ss = new SlideShow(show);
Slide[] slides = ss.getSlides();
StringBuilder builder = new StringBuilder();
// Collect the text of every text run on every slide
for (int x = 0; x < slides.length; x++) {
    TextRun[] runs = slides[x].getTextRuns();
    for (int j = 0; j < runs.length; j++) {
        TextRun run = runs[j];
        if (run != null) {
            String text = run.getText();
            builder.append(text);
        }
    }
}
fis.close();
However, this also extracts the footer text and the slide numbers, which I don't want.
How can I extract the text while excluding the footer and the slide number?
Thanks in advance.
I would recommend that you look at JPresentation. One of their examples shows how to extract all images and text from all slides: http://www.independentsoft.de/jpresentation/tutorial/exportallslides.html
The API seems to be very easy to use.
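Whichever library you end up using, the filtering step itself could look roughly like the sketch below. It is only an illustration of the idea and uses no presentation library at all: `SlideTextFilter`, `Kind`, and `TextBlock` are hypothetical names made up for this example, and it assumes you can find out from your library which placeholder (footer, slide number, date, header) each piece of text belongs to. How to get that information depends on the library and its version, so that part is not shown.

```java
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

class SlideTextFilter {
    /** Hypothetical placeholder kinds, standing in for whatever the parsing library reports. */
    enum Kind { TITLE, BODY, DATE, SLIDE_NUMBER, FOOTER, HEADER, OTHER }

    /** Hypothetical holder for one block of slide text and the kind of placeholder it came from. */
    static final class TextBlock {
        final Kind kind;
        final String text;

        TextBlock(Kind kind, String text) {
            this.kind = kind;
            this.text = text;
        }
    }

    // Kinds treated as slide decoration rather than content (an assumption of this sketch).
    private static final Set<Kind> SKIPPED =
            EnumSet.of(Kind.DATE, Kind.SLIDE_NUMBER, Kind.FOOTER, Kind.HEADER);

    /** Concatenates the text of all blocks except the skipped kinds, one block per line. */
    static String extractContent(List<TextBlock> blocks) {
        StringBuilder builder = new StringBuilder();
        for (TextBlock block : blocks) {
            if (block == null || block.text == null || SKIPPED.contains(block.kind)) {
                continue; // ignore missing entries, footers, slide numbers, dates, headers
            }
            builder.append(block.text).append('\n');
        }
        return builder.toString();
    }
}
```

In the loop from the question, you would build one `TextBlock` per text run (or per shape) and pass the list to `extractContent` instead of appending every run directly to the `StringBuilder`.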