Java: get plain text from RTF
I have a database column that holds text in RTF format. How can I get only its plain text, using Java?
import java.io.ByteArrayInputStream;
import javax.swing.text.Document;
import javax.swing.text.rtf.RTFEditorKit;
RTFEditorKit rtfParser = new RTFEditorKit();
Document document = rtfParser.createDefaultDocument();
rtfParser.read(new ByteArrayInputStream(rtfBytes), document, 0);
String text = document.getText(0, document.getLength());
This should work.
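For a value read from a database column as a String, the snippet above can be wrapped roughly as follows. This is a minimal sketch, not a definitive implementation: the class name `RtfToText`, the method name `rtfToPlainText`, and the sample RTF string are made up for illustration, and it assumes the stored RTF is plain 7-bit/Latin-1 text so that ISO-8859-1 is a safe way to turn the String into bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.swing.text.BadLocationException;
import javax.swing.text.Document;
import javax.swing.text.rtf.RTFEditorKit;

public class RtfToText {

    // Hypothetical helper: parses an RTF string and returns only its text content.
    static String rtfToPlainText(String rtf) throws IOException, BadLocationException {
        RTFEditorKit rtfParser = new RTFEditorKit();
        Document document = rtfParser.createDefaultDocument();
        // Assumption: the RTF markup is 7-bit/Latin-1, as RTF files usually are.
        byte[] rtfBytes = rtf.getBytes(StandardCharsets.ISO_8859_1);
        rtfParser.read(new ByteArrayInputStream(rtfBytes), document, 0);
        return document.getText(0, document.getLength());
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample value standing in for the database column.
        String rtf = "{\\rtf1\\ansi Hello, {\\b world}!}";
        // trim() drops any trailing line break the parser may leave behind.
        System.out.println(rtfToPlainText(rtf).trim());
    }
}
```

Both checked exceptions are left to the caller here, since whether to log, rethrow, or fall back to the raw value depends on the application.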
If you can, try AdvancedRTFEditorKit; it might suit you well: http://java-sl.com/advanced_rtf_editor_kit.html
I have used it to create a complete RTF editor with all the features MS Word supports.
Apache POI reads Microsoft Word formats as well. Note that HWPFDocument targets the binary Word (.doc) format, so check that it accepts your input if the column holds raw RTF.
import java.io.FileInputStream;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.extractor.WordExtractor;

public String getRtfText(String fileName) {
    // try-with-resources closes the input stream even if parsing fails.
    try (FileInputStream inStream = new FileInputStream(fileName)) {
        // HWPFDocument reads the whole document from the stream.
        HWPFDocument doc = new HWPFDocument(inStream);
        WordExtractor rtfExtractor = new WordExtractor(doc);
        // Each array element holds the text of one paragraph.
        String[] rtfArray = rtfExtractor.getParagraphText();
        StringBuilder rtfString = new StringBuilder();
        for (String paragraph : rtfArray) {
            rtfString.append(paragraph);
        }
        System.out.println(rtfString);
        return rtfString.toString();
    } catch (Exception ex) {
        // Return null on failure instead of dereferencing a null extractor.
        System.out.println(ex.getMessage());
        return null;
    }
}
This works if the RTF text is in a JEditorPane:
String s = getPlainText(aJEditorPane.getDocument());

String getPlainText(Document doc) {
    try {
        return doc.getText(0, doc.getLength());
    }
    catch (BadLocationException ex) {
        System.err.println(ex);
        return null;
    }
}