
Read Text from RTF file

I tried to read an RTF file using Apache POI, but it fails with an Invalid Header exception. It seems that POI doesn't support RTF files. Is there any way to read .rtf files using an open-source Java API? (I have heard of the Aspose API, but it isn't free.)

Any solutions?

You can try the RTFEditorKit. It supports images and text as well.

Or look at this answer: Java API to convert RTF file to Word document (97-2003 format)

There is no free library that supports this. But it may not be that hard to create a basic compare function yourself. You can read in an RTF file and then extract the text like this:

// needs: javax.swing.JEditorPane, javax.swing.text.EditorKit, java.io.*
// read rtf from file; try-with-resources closes the reader afterwards
JEditorPane p = new JEditorPane();
p.setContentType("text/rtf");
EditorKit rtfKit = p.getEditorKitForContentType("text/rtf");
try (Reader reader = new FileReader(fileName)) {
    rtfKit.read(reader, p.getDocument(), 0);
}

// convert to text: write the parsed document back out with the plain-text kit
EditorKit txtKit = p.getEditorKitForContentType("text/plain");
Writer writer = new StringWriter();
txtKit.write(writer, p.getDocument(), 0, p.getDocument().getLength());
String documentText = writer.toString();
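
For reference, the RTFEditorKit mentioned above can also be used on its own, without going through a JEditorPane. The following is only a minimal sketch: the class name RtfTextExtractor, the method extractText, and the sample RTF string are made up for illustration, and Swing's RTF support covers only part of the format, so complex documents may lose content.

```java
import javax.swing.text.BadLocationException;
import javax.swing.text.Document;
import javax.swing.text.rtf.RTFEditorKit;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of any library.
public class RtfTextExtractor {

    // Parses RTF from the stream and returns the plain text of the resulting document.
    public static String extractText(InputStream rtf) throws IOException, BadLocationException {
        RTFEditorKit kit = new RTFEditorKit();
        Document doc = kit.createDefaultDocument();
        kit.read(rtf, doc, 0);                      // parse the RTF markup into the document
        return doc.getText(0, doc.getLength());     // text only, control words removed
    }

    public static void main(String[] args) throws Exception {
        // Illustrative in-memory sample; replace with a FileInputStream for a real file.
        String sample = "{\\rtf1\\ansi\\deff0{\\fonttbl{\\f0 Arial;}}\\f0 Hello, {\\b RTF} world!\\par}";
        try (InputStream in = new ByteArrayInputStream(sample.getBytes(StandardCharsets.US_ASCII))) {
            System.out.println(extractText(in).trim());
        }
    }
}
```

For a file on disk, passing new FileInputStream(fileName) to extractText should work the same way.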

The easiest way is to use the Scanner class in Java and the FileReader object. Simple example:

Scanner in = new Scanner(new FileReader("filename.rtf"));

Scanner has several methods for reading in strings, numbers, etc. You can find more information on this on the Java documentation page.
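
One caveat with the Scanner approach: it reads the file exactly as stored, so the result is the raw RTF source with control words such as \rtf1 and \par still in it, not the displayed text. A small sketch of what that looks like, assuming a hypothetical class ScannerRtfDemo and a temporary sample file created only for the demonstration:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Scanner;

// Hypothetical demo class, not part of any library.
public class ScannerRtfDemo {

    // Reads the file line by line and returns its raw contents, markup included.
    static String readRaw(Path file) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Scanner in = new Scanner(file, "US-ASCII")) {
            while (in.hasNextLine()) {
                sb.append(in.nextLine()).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Write a tiny illustrative RTF file so the example is self-contained.
        Path file = Files.createTempFile("sample", ".rtf");
        Files.write(file, "{\\rtf1\\ansi Hello, {\\b RTF} world!\\par}".getBytes(StandardCharsets.US_ASCII));
        try {
            // The printed text still contains the braces and backslash control words.
            System.out.print(readRaw(file));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

If only the readable text is wanted, the markup has to be parsed first, for example with the RTFEditorKit route from the other answers.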
