Remove special characters from a string with conditions

Question

I have a string like the following:

"This is <h2>a place/h2>
<p>You know its a good place!</p>
<ul>
    <li>Booked your ticket #20130114074912_AN3P703C on Monday, January 14</li>
</ul>"

So I want my string to look like this:

"This is a place
You know its a good place.
Booked your ticket #20130114074912_AN3P703C on Monday, January 14"

Answer 1

I believe this link answers your question.

Try this:

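using System.Text.RegularExpressions;

/// <summary>
/// Remove HTML from string with Regex.
/// </summary>
public static string StripTagsRegex(string source)
{
   // Lazily match anything between '<' and the next '>' and drop it.
   return Regex.Replace(source, "<.*?>", string.Empty);
}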

Output:

Input:    <p>The <b>dog</b> is <i>cute</i>.</p>
Output:   The dog is cute.
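Note that the regex only removes the tags; indentation and blank lines left over from the markup stay in the result. Here is a minimal sketch, assuming you also want each line trimmed and empty lines dropped, as in the desired output in the question. The names HtmlTextHelper and StripTagsAndTidy are hypothetical, not part of any library.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class HtmlTextHelper
{
    // Hypothetical helper: strips tags, then trims each line and drops blank lines.
    public static string StripTagsAndTidy(string source)
    {
        // Same lazy tag pattern as in the answer above.
        string noTags = Regex.Replace(source, "<.*?>", string.Empty);

        var lines = noTags
            .Split(new[] { "\r\n", "\n" }, StringSplitOptions.None)
            .Select(line => line.Trim())
            .Where(line => line.Length > 0);

        return string.Join("\n", lines);
    }

    public static void Main()
    {
        string html = "<p>You know its a good place!</p>\n<ul>\n    <li>Booked your ticket</li>\n</ul>";
        Console.WriteLine(StripTagsAndTidy(html));
    }
}
```

Like the plain regex, this will not remove a malformed fragment such as /h2>, because it has no opening angle bracket.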

Answer 2

You can use the following method to remove HTML tags from any string:

static string StripHTML (string inputString)
{
  return Regex.Replace(inputString, "<.*?>", string.Empty);
}

Answer 3

This will do the job:

String neededString = Regex.Replace(source, "<.*?>", string.Empty);

For more complicated strings that contain CSS or JavaScript nodes, you can use the following:

String neededString = Regex.Replace(subjectString, @"<(style|script)[^<>]*>.*?</\1>|</?[a-z][a-z0-9]*[^<>]*>|<!--.*?-->", "");
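One caveat with this pattern: by default, `.` does not match newlines and `[a-z]` does not match uppercase tag names in .NET. Here is a sketch, assuming your script or style blocks may span several lines or use uppercase tags, that passes RegexOptions.Singleline and RegexOptions.IgnoreCase. The class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ScriptStripDemo
{
    // Same pattern as in the answer: script/style blocks, ordinary tags, comments.
    private const string Pattern =
        @"<(style|script)[^<>]*>.*?</\1>|</?[a-z][a-z0-9]*[^<>]*>|<!--.*?-->";

    public static string Strip(string html)
    {
        // Singleline lets '.' cross line breaks; IgnoreCase covers <SCRIPT>, <P>, etc.
        return Regex.Replace(html, Pattern, string.Empty,
            RegexOptions.Singleline | RegexOptions.IgnoreCase);
    }

    public static void Main()
    {
        string html = "<SCRIPT>\nvar x = 1;\n</SCRIPT><p>Hello</p><!-- note -->";
        Console.WriteLine(Strip(html)); // prints "Hello"
    }
}
```

Without those two options, the script body in this sample would be left in the output.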

Answer 4

Download and reference the HTML Agility Pack, then call the following:

var htmlDoc = new HtmlDocument();
htmlDoc.LoadHtml(input);
string output = htmlDoc.DocumentNode.InnerText;

This still will not remove your malformed /h2> tag, but it should handle more HTML than a regular expression can.

Answer 5

This should do the trick:

string input = "This is <h2> a place</h2><p>You know its a good place!</p><ul>    <li>Booked your ticket #20130114074912_AN3P703C on Monday, January 14</li></ul>";
input = Regex.Replace(input, "<.*?>", string.Empty);

This finds every substring enclosed in "<>" and replaces it with "", i.e. an empty string.
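The `?` after `.*` matters here: it makes the match lazy, so each match stops at the first closing bracket. A small sketch comparing the lazy and greedy forms (the class and method names are made up for illustration):

```csharp
using System;
using System.Text.RegularExpressions;

public static class LazyVsGreedyDemo
{
    // Lazy: each match ends at the nearest '>' so only the tags are removed.
    public static string StripLazy(string input)
    {
        return Regex.Replace(input, "<.*?>", string.Empty);
    }

    // Greedy: a single match runs from the first '<' to the last '>' on the line.
    public static string StripGreedy(string input)
    {
        return Regex.Replace(input, "<.*>", string.Empty);
    }

    public static void Main()
    {
        string input = "<p>The <b>dog</b> is <i>cute</i>.</p>";
        Console.WriteLine(StripLazy(input));          // The dog is cute.
        Console.WriteLine(StripGreedy(input).Length); // 0
    }
}
```

With the greedy form the whole sample line is swallowed, text included, which is why all the answers here use `<.*?>`.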

5 solutions

Solution 1
2 Accepted 2013-01-30 09:49:18

Solution 2
0 2013-01-30 09:52:29

Solution 3
0 2013-01-30 09:54:45

Solution 4
0 2013-01-30 10:00:02

Solution 5
0 2013-01-30 10:09:36
