
Extracting a particular value from a string using regex in C#

I need to extract the $Value part from the following string.

string text = "<h2 class=\"knownclass unknownclass1 unknownclass2\" title=\"Example title>$Value </h2>";

I am using this code:

Match m2 = Regex.Match(text, @"<h2 class=""knownclass(.*)</h2>", RegexOptions.IgnoreCase);

The group captures everything after knownclass: unknownclass1 unknownclass2" title="Example title>$Value . But I only need the $Value part. How can I get just that? Thanks in advance.

Assuming the string always follows this format, consider the following code:

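// Substring takes (startIndex, length), so the length is computed from the two positions.
var start = text.IndexOf(">") + 1;
var end = text.IndexOf("<", start);
var value = text.Substring(start, end - start);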

As has been said many times, using a regex to parse HTML or XML is a bad idea. Ignoring that, you are capturing too much. Here is an alternative regex that should work; it skips the rest of the opening tag and captures only the text before the closing tag.

@"<h2 class=""knownclass[^>]*>(.*?)</h2>"
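To show how the captured group would be read back, here is a minimal sketch. It assumes a pattern that skips to the end of the opening tag with [^>]* and captures lazily with (.*?); the class name Extractor and the method name ExtractValue are made up for illustration, and the trailing space is trimmed because the sample text has one after $Value.

```csharp
using System;
using System.Text.RegularExpressions;

public static class Extractor
{
    // Hypothetical helper: returns the inner text of the first <h2> whose
    // class attribute starts with "knownclass", or null if nothing matches.
    public static string ExtractValue(string text)
    {
        // [^>]* consumes the rest of the opening tag up to its '>'.
        // (.*?) lazily captures everything up to the first </h2>.
        Match m = Regex.Match(
            text,
            @"<h2 class=""knownclass[^>]*>(.*?)</h2>",
            RegexOptions.IgnoreCase);

        return m.Success ? m.Groups[1].Value.Trim() : null;
    }

    public static void Main()
    {
        string text = "<h2 class=\"knownclass unknownclass1 unknownclass2\" title=\"Example title>$Value </h2>";
        Console.WriteLine(ExtractValue(text)); // prints "$Value"
    }
}
```

Group 0 is the whole match and group 1 is the first parenthesised part, which is why the value is read from m.Groups[1].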

If your string always follows the same pattern, you can consider this:

string text = "<h2 class=\"knownclass unknownclass1 unknownclass2\" title=\"Example title>$Value </h2>";
string result = "";

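// Lazily skip the opening tag, then capture everything up to the closing </h2>.
Regex test = new Regex(@"<.*?>(.*?)</h2>");
MatchCollection matchlist = test.Matches(text);

// If there are several matches, result ends up holding the last one.
for (int i = 0; i < matchlist.Count; i++)
{
    result = matchlist[i].Groups[1].Value;
}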

But if you are working with XML or HTML files, I recommend using XmlTextReader for XML and HtmlAgilityPack for HTML:

http://msdn.microsoft.com/en-us/library/system.xml.xmltextreader.aspx

http://htmlagilitypack.codeplex.com/

Hope it helps!

