
Is there a function that converts HTML to plaintext?

Is there a "hocus-pocus" function, suitable for Android, that converts HTML to plaintext?

I am referring to something like the clipboard conversion in browsers such as Internet Explorer and Firefox: if you select all the rendered HTML in the browser and copy and paste it into a text editor, you get (most of) the text, without any HTML tags or headers.

In a similar thread, I saw a reference to html2text, but it is written in Python. I am looking for an Android/Java function.

Is something like this available, or must I write it myself using jsoup or JTidy?

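I'd try something like:

// Requires android.text.Html. On API 24+ the one-argument overload is deprecated;
// use Html.fromHtml(html, Html.FROM_HTML_MODE_LEGACY) there instead.
String html = "<b>hola</b>";
String plain = Html.fromHtml(html).toString();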

Using jsoup:

// HtmlToPlainText is the example class in the org.jsoup.examples package.
String plain = new HtmlToPlainText().getPlainText(Jsoup.parse(html));

Without jsoup:

String html = "htmltext";
// Replaces each run of adjacent tags with a single space; entities such as &amp; are not decoded.
String plain = html.replaceAll("(?s)<[^>]*>(\\s*<[^>]*>)*", " ").trim();
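
The regex approach above can be pushed a little further using only the standard library. The following is a minimal sketch, not a real HTML parser: the class name HtmlStripper and the method toPlainText are made up for illustration, only a handful of entities are decoded, and a '>' inside an attribute value will confuse it. It runs on plain Java as well as on Android.

```java
import java.util.regex.Pattern;

public class HtmlStripper {
    // <script> and <style> elements are dropped together with their contents.
    private static final Pattern SCRIPT_OR_STYLE =
            Pattern.compile("(?is)<(script|style)\\b[^>]*>.*?</\\1\\s*>");
    // HTML comments, which may themselves contain '>' characters.
    private static final Pattern COMMENT = Pattern.compile("(?s)<!--.*?-->");
    // Any remaining tag. Simplification: assumes no '>' inside attribute values.
    private static final Pattern TAG = Pattern.compile("(?s)<[^>]*>");
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    public static String toPlainText(String html) {
        String text = SCRIPT_OR_STYLE.matcher(html).replaceAll(" ");
        text = COMMENT.matcher(text).replaceAll(" ");
        // Each tag becomes a space so that "<p>a</p><p>b</p>" yields "a b", not "ab".
        text = TAG.matcher(text).replaceAll(" ");
        // Decode a few common entities. &nbsp; is mapped to an ordinary space here.
        // &amp; is handled last so that "&amp;lt;" ends up as "&lt;" rather than "<".
        text = text.replace("&nbsp;", " ")
                   .replace("&lt;", "<")
                   .replace("&gt;", ">")
                   .replace("&quot;", "\"")
                   .replace("&#39;", "'")
                   .replace("&amp;", "&");
        return WHITESPACE.matcher(text).replaceAll(" ").trim();
    }

    public static void main(String[] args) {
        // prints: Hello, world & friends
        System.out.println(toPlainText("<p>Hello, <b>world</b> &amp; friends</p>"));
    }
}
```

Because every tag is turned into a space, inline markup in the middle of a word (for example "wor<b>ld</b>") produces "wor ld". For anything beyond simple fragments, a parser such as jsoup is likely the safer choice.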
