
Extract string between html tags

I am trying to extract the string between a pair of HTML tags:

<title>what i want</title>

I know there are similar or even identical questions already answered, but the answers on those don't seem to work for me. My current code is

String html = wc.DownloadString("URL");
Match m = Regex.Match(html, "<title>(.*)</title>", RegexOptions.Singleline);
MessageBox.Show(m.Value);

This outputs

<title>what i want</title>

Not

what i want

Note that I have tried other regular expressions from different answers and got the same result. I am also not familiar with regular expressions, so this may be a noob question.

Try m.Groups[1].Value (documentation for Groups), or m.Result("$1") (documentation for Result); either should work.
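As a minimal sketch of the difference, the console program below runs the pattern from the question against a hard-coded HTML string. The sample string, the class name TitleExtractor and the helper ExtractTitle are assumptions made for illustration, and Console.WriteLine stands in for the message box so the example runs without a UI.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TitleExtractor
{
    // Hypothetical helper: returns the text captured by the first group,
    // or null when the input contains no <title>...</title> pair.
    public static string ExtractTitle(string html)
    {
        Match m = Regex.Match(html, "<title>(.*)</title>", RegexOptions.Singleline);
        return m.Success ? m.Groups[1].Value : null;
    }

    public static void Main()
    {
        // Sample input (an assumption); the question downloads this from a URL.
        string html = "<html><head><title>what i want</title></head></html>";
        Match m = Regex.Match(html, "<title>(.*)</title>", RegexOptions.Singleline);

        Console.WriteLine(m.Value);           // whole match, tags included
        Console.WriteLine(m.Groups[1].Value); // text captured by the first group
        Console.WriteLine(m.Result("$1"));    // same text, via a replacement pattern
    }
}
```

Checking m.Success before reading the group, as ExtractTitle does, avoids treating an empty string as a title when the page has no title tag.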

The object m returned by Regex.Match contains various pieces of information about what was matched. This includes both the entire string that was matched (in this case including the title tags themselves) and the parts of the string matched by each group of parentheses. m.Value gives the entire string; m.Groups[1].Value gives the part matched by the first group, m.Groups[2].Value gives the part matched by the second group, and so on. This has to be done outside the regular expression because a program might want more than one group; for instance, if you're matching a time of day with a pattern like (\\d+):(\\d+), then you might want to assign the hours (m.Groups[1].Value) to one variable and the minutes (m.Groups[2].Value) to a different variable.
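The time-of-day case might look roughly like the sketch below. The class TimeOfDayExample, the helper TryParseTime and the sample sentence are hypothetical names and data chosen for illustration; only Regex.Match, Match.Success and Match.Groups come from the answer itself.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TimeOfDayExample
{
    // Hypothetical helper: finds the first "digits:digits" run in the text and
    // stores the two captured groups in separate variables.
    public static bool TryParseTime(string text, out int hours, out int minutes)
    {
        hours = 0;
        minutes = 0;

        Match m = Regex.Match(text, "(\\d+):(\\d+)");
        if (!m.Success)
        {
            return false;
        }

        // Group 1 is the first pair of parentheses, group 2 the second.
        return int.TryParse(m.Groups[1].Value, out hours)
            && int.TryParse(m.Groups[2].Value, out minutes);
    }

    public static void Main()
    {
        if (TryParseTime("The meeting starts at 14:30", out int h, out int min))
        {
            Console.WriteLine($"hours={h}, minutes={min}"); // prints hours=14, minutes=30
        }
    }
}
```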

var value = m.Groups[1].Value;


 