
How to remove selected special characters from a list

I have a C# list in which many values look like this:

<b>Moon</b>

I want to remove <b> and </b>.

The result should look like this: Moon

How can I remove this type of character from the list?

You can use XDocument to remove the XML tags:

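using System.Xml.Linq;

string StripXmlTags(string xml)
{
    // Root.Value concatenates the text of all descendant nodes, dropping the tags.
    XDocument doc = XDocument.Parse(xml);
    return doc.Root.Value;
}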

Example:

[Test]
public void Test()
{
    string xml = "<root><b>nice </b><c>node</c><d><e> is here</e></d></root>";
    string result = StripXmlTags(xml);

    Assert.AreEqual("nice node is here", result);
}
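One caveat: XDocument.Parse throws an XmlException when the input is not well-formed XML, for example a list entry that is plain text such as "Moon" or has an unclosed tag. A minimal sketch of a guarded variant follows; the class and method names are hypothetical, and returning the input unchanged on failure is an assumption about the desired behavior.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

public static class XmlTagStripper
{
    // Hypothetical helper: returns the text content of the XML, or the
    // input unchanged when it cannot be parsed (assumed fallback).
    public static string StripXmlTagsOrKeep(string input)
    {
        try
        {
            XDocument doc = XDocument.Parse(input);
            return doc.Root.Value;
        }
        catch (XmlException)
        {
            // Not well-formed XML (e.g. plain text): leave it as it is.
            return input;
        }
    }

    public static void Main()
    {
        Console.WriteLine(StripXmlTagsOrKeep("<b>Moon</b>")); // Moon
        Console.WriteLine(StripXmlTagsOrKeep("Moon"));        // Moon
    }
}
```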

Try this:

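// Requires: using System.Text.RegularExpressions;
var moonHtml = "<b>Moon</b>";
// Lazy ".*?" stops at the first ">"; a greedy ".*" would match the whole string.
var regex = new Regex("</?.*?>", RegexOptions.IgnoreCase | RegexOptions.Multiline);
var moon = regex.Replace(moonHtml, string.Empty);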

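Try this:

// Lazy match: each tag is removed separately.
string moon = Regex.Replace("<b>Moon</b>", @"<.+?>", "");
// Same idea with a negated character class; inputWithHtmlTags is the string to clean.
string noHtml = Regex.Replace(inputWithHtmlTags, "<[^>]+>", "");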
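Since the question is about a list rather than a single string, the same replacement can be applied to every element. The following is a sketch only: the class name, method name, and sample values are made up for illustration, and it assumes a new list is acceptable instead of modifying the original in place.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class ListTagStripper
{
    // Matches "<", then one or more characters that are not ">", then ">".
    private static readonly Regex TagPattern = new Regex("<[^>]+>");

    // Hypothetical helper: returns a new list with the tags removed from each value.
    public static List<string> StripTags(IEnumerable<string> values)
    {
        return values.Select(v => TagPattern.Replace(v, string.Empty)).ToList();
    }

    public static void Main()
    {
        var values = new List<string> { "<b>Moon</b>", "<b>Sun</b>", "Earth" };
        List<string> cleaned = StripTags(values);
        Console.WriteLine(string.Join(", ", cleaned)); // Moon, Sun, Earth
    }
}
```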

This program is a crude illustration of a regex that removes all tags; it is flexible enough to strip italic and underline tags as well as bold. The pattern contains no literal letters, so it matches <b> and <B> alike; the IgnoreCase and Multiline options are passed but do not change what this particular pattern matches (Multiline only affects the ^ and $ anchors). Running it prints "The Man on the Moon". I use .*?, which lazily matches zero or more characters, so that empty brackets such as <> are caught too.

using System;
using System.Text.RegularExpressions;

namespace ConsoleApplication3
{
    class Program
    {
        static void Main(string[] args)
        {
            var input = "<b>The</b> <i>Man</i> on the <U><B>Moon</B></U>";

            var regex = new Regex("<.*?>", RegexOptions.IgnoreCase | RegexOptions.Multiline);

            var output = regex.Replace(input, string.Empty);

            Console.WriteLine(output);
            Console.ReadLine();
        }
    }
}
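If only selected tags should go, as the question title suggests, the pattern can name the tag explicitly, and this is where IgnoreCase does matter. A minimal sketch, assuming only bare <b> and </b> tags (no attributes) need removing; the class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BoldTagStripper
{
    // Matches <b> or </b> in either case; tags with attributes are not matched.
    private static readonly Regex BoldTag =
        new Regex(@"</?b\s*>", RegexOptions.IgnoreCase);

    // Hypothetical helper: removes bold tags and leaves every other tag alone.
    public static string StripBold(string input)
    {
        return BoldTag.Replace(input, string.Empty);
    }

    public static void Main()
    {
        var input = "<b>The</b> <i>Man</i> on the <U><B>Moon</B></U>";
        Console.WriteLine(StripBold(input)); // The <i>Man</i> on the <U>Moon</U>
    }
}
```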

