HTML regex using java.util.regex

Question

I need a regex for the following HTML:

<div xmlns="http://www.w3.org/1999/xhtml">    <p/>
  <p/><p/>    <p/>
</div>

This comes from a rich-text field and obviously carries no meaningful content; in other words, it is empty. I cannot simply write if (richTextContent == null || richTextContent.length() == 0) in Java, because the rich-text field does contain something. Semantically the content above is empty, so I thought of using a regex. I need to match this snippet with java.util.regex.

If there is something meaningful in the snippet, like:

<div xmlns="http://www.w3.org/1999/xhtml"> text<p/>
  <p/><p/>text    <p/>
</div>

then the regex should not match.

Answer 1

Use an HTML parser like Jsoup.

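import org.jsoup.Jsoup;

// html1 contains only empty tags and whitespace; html2 also contains text.
String html1 = "<div xmlns=\"http://www.w3.org/1999/xhtml\">    <p/>  <p/><p/>    <p/></div>";
String html2 = "<div xmlns=\"http://www.w3.org/1999/xhtml\"> text<p/>        <p/><p/>text    <p/>        </div>";

// text() returns the combined text of all elements, without the markup.
System.out.println(Jsoup.parse(html1).text().isEmpty()); // true
System.out.println(Jsoup.parse(html2).text().isEmpty()); // false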
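
If adding a parser dependency is not an option, the same check can be roughly approximated with java.util.regex alone. The following is only a sketch, not a replacement for a real parser: the class RichTextCheck and the helper isSemanticallyEmpty are hypothetical names, and it assumes the field holds simple tags with no comments, no CDATA sections, no '>' inside attribute values, and no entities such as &nbsp; that should count as whitespace.

```java
import java.util.regex.Pattern;

class RichTextCheck {
    // Matches one tag such as <div ...>, </div>, or <p/>.
    // Assumption: attribute values never contain '>'.
    private static final Pattern TAG = Pattern.compile("<[^>]*>");

    // Hypothetical helper: returns true when only tags and whitespace are present.
    static boolean isSemanticallyEmpty(String richText) {
        if (richText == null) {
            return true;
        }
        // Remove every tag, then check whether anything but whitespace is left.
        return TAG.matcher(richText).replaceAll("").trim().isEmpty();
    }

    public static void main(String[] args) {
        String html1 = "<div xmlns=\"http://www.w3.org/1999/xhtml\">    <p/>  <p/><p/>    <p/></div>";
        String html2 = "<div xmlns=\"http://www.w3.org/1999/xhtml\"> text<p/>        <p/><p/>text    <p/>        </div>";

        System.out.println(isSemanticallyEmpty(html1)); // true
        System.out.println(isSemanticallyEmpty(html2)); // false
    }
}
```

Stripping tags and testing the remainder keeps the pattern trivial; input that breaks the assumptions above is where a parser like Jsoup is the safer choice.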

3 · accepted · 2010-07-16 17:43:06
