Remove HTML tags from a String with content

I have a string = "195121<span class="up">+432</span>". I need a regex that removes the tags together with their content (result string = "195121").

You can try the following regex, which is based on a capturing group.

string.replaceAll("(?s)<(\\w+)\\b[^<>]*>.*?</\\1>", "");
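
As a quick check, the Java sketch below applies that pattern to the string from the question. The class name `StripTagsDemo` and the method `stripTags` are only illustrative names, not part of the answer. Note that `replaceAll` returns a new string rather than modifying the original.

```java
public class StripTagsDemo {
    // (?s) lets '.' match line breaks; (\w+) captures the tag name and
    // \1 requires the closing tag to use the same name.
    static final String TAG_WITH_CONTENT = "(?s)<(\\w+)\\b[^<>]*>.*?</\\1>";

    static String stripTags(String input) {
        // Strings are immutable, so the cleaned text is the return value.
        return input.replaceAll(TAG_WITH_CONTENT, "");
    }

    public static void main(String[] args) {
        String input = "195121<span class=\"up\">+432</span>";
        System.out.println(stripTags(input)); // prints 195121
    }
}
```

Because `.*?` is lazy, the match ends at the first closing tag with the same name, so elements nested inside an element of the same name are probably better handled with an HTML parser than with this regex.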

The main regex that works for me is below. It removes every element with a given tag name, together with its content.

"(?is)<your_tag_name[^>]+>.*?<\\/your_tag_name>"

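One caveat with the pattern above: `[^>]+` requires at least one character after the tag name, so an opening tag without attributes, such as `<your_tag_name>`, is not matched. The Java sketch below is a variant, not the answer's exact regex, that uses `\\b[^>]*` instead so both forms are removed. `TagRemover` and `removeTag` are hypothetical names chosen for this example.

```java
import java.util.regex.Pattern;

public class TagRemover {
    // Removes every <tagName ...>...</tagName> element, including its content.
    // \b keeps "node" from matching "<nodes>", and [^>]* allows zero attributes.
    static String removeTag(String html, String tagName) {
        String name = Pattern.quote(tagName); // treat the tag name literally
        Pattern p = Pattern.compile("(?is)<" + name + "\\b[^>]*>.*?</" + name + ">");
        return p.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(removeTag("a<node>x</node>b", "node"));          // prints ab
        System.out.println(removeTag("a<node id=\"1\">x</node>b", "node")); // prints ab
    }
}
```
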
This is how I handle it. Hope it helps others.

var data = "<p>Dhaka is the capital city of Bangladesh " +
    "and many palaces and mosques remain. This is" +
    " fast-growing modern metropolis.</p>\\r\\n<p>&lt;flightnode to=\"CXB\"&gt;&lt;/flightnode&gt;</p>"

First, replace &lt; and &gt; with < and >.

// Not needed if the string already contains literal < and >
data = data.replace("&lt;", "<").replace("&gt;", ">")

Then print it to check the result.

println("\n\n $data")

> // output // -> <p>Dhaka is the capital city of Bangladesh and many
> palaces and mosques remain. This is fast-growing modern
> metropolis.</p>\r\n<p><flightnode to="CXB"></flightnode></p>

Define an array of the tags you want to remove along with their content.

val tag = arrayOf("flightnode", "hotelnode", "packagenode")

Then loop over the tags and remove each one from your string.

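// Needs: import java.util.regex.Pattern.compile
for (value in tag) {
    // (?i) ignores case, (?s) lets '.' span line breaks
    val patternString = "(?is)<$value[^>]+>.*?<\\/$value>"
    val pattern = compile(patternString)
    val matcher = pattern.matcher(data)
    // Prints true if this tag occurs in data
    println("\n\n" + matcher.find())
    // replaceAll resets the matcher first, so the find() above skips nothing
    data = matcher.replaceAll("")
}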
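
For readers working in Java rather than Kotlin, the same loop might look roughly like the sketch below. It keeps the answer's pattern unchanged and uses a shortened version of the sample data; the class name `RemoveNodes` and the method `removeTags` are illustrative assumptions.

```java
import java.util.regex.Pattern;

public class RemoveNodes {
    // Applies the per-tag pattern from the Kotlin loop to each tag name in turn.
    static String removeTags(String data, String... tags) {
        for (String tag : tags) {
            Pattern pattern = Pattern.compile("(?is)<" + tag + "[^>]+>.*?</" + tag + ">");
            data = pattern.matcher(data).replaceAll("");
        }
        return data;
    }

    public static void main(String[] args) {
        String data = "<p>Dhaka</p><p>&lt;flightnode to=\"CXB\"&gt;&lt;/flightnode&gt;</p>";
        // Unescape the entities first so the tags become visible to the regex.
        data = data.replace("&lt;", "<").replace("&gt;", ">");
        System.out.println(removeTags(data, "flightnode", "hotelnode", "packagenode"));
        // prints <p>Dhaka</p><p></p>
    }
}
```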

Print it again to check the result.

println("\n\n" + data)

> // output // -> <p>Dhaka is the capital city of Bangladesh and many
> palaces and mosques remain. This is fast-growing modern
> metropolis.</p>\r\n<p></p>

Thanks to my ex-colleague @masud-bappy for creating the regex.
