
Regular expression to not match a string in C#

I have some HTML that I need to parse (in a large document) as text, and the portion I'm interested in looks like this:

...
<div id="whatever" class="whatever whatever">some title with <em>html</em> and other such tags in it, but never a div tag</div>
...

Now I want to extract the text inside the DIV, including its inner HTML. Here's the regular expression I have so far (using a named group):

<div id=\"whatever\" class=\"whatever whatever\">(?<title>[^</div>]*?)</div>

So the idea is to match the whole element and capture, in a group, all the text up to the point where the closing </div> occurs (there is no other identifying marker for the end of the string).

The ^ inside [] doesn't work because a character class excludes any one of the listed characters (<, /, d, i, v, >), not the string "</div>" as a whole. Any ideas how I can make this work?

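// The lazy .*? captures everything up to the first closing </div>; < / > need no escaping.
Match m = Regex.Match(s, "<div id=\"whatever\" class=\"whatever whatever\">(.*?)</div>");
if (m.Success) Console.WriteLine(m.Groups[1].Value);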
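
A slightly fuller sketch of the same idea, under a few assumptions: the class name TitleExtractor and the method TryExtractTitle are hypothetical, the opening tag is assumed to appear exactly as written in the question, and RegexOptions.Singleline is added in case the title spans several lines (by default . does not match a newline). The second pattern is the "anything that does not start </div>" form the question was reaching for, written with a negative lookahead instead of a character class.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; the names are illustrative and not from the original post.
public static class TitleExtractor
{
    // Lazy version: .*? grows one character at a time until </div> can follow.
    private static readonly Regex Lazy = new Regex(
        "<div id=\"whatever\" class=\"whatever whatever\">(?<title>.*?)</div>",
        RegexOptions.Singleline);

    // Lookahead version: consume a character only if "</div>" does not start there.
    private static readonly Regex Tempered = new Regex(
        "<div id=\"whatever\" class=\"whatever whatever\">(?<title>(?:(?!</div>).)*)</div>",
        RegexOptions.Singleline);

    // Returns true and sets title to the div's inner HTML when a match is found.
    public static bool TryExtractTitle(string html, out string title, bool useLookahead = false)
    {
        Match m = (useLookahead ? Tempered : Lazy).Match(html);
        title = m.Success ? m.Groups["title"].Value : string.Empty;
        return m.Success;
    }
}
```

Calling TitleExtractor.TryExtractTitle(s, out string title) on the sample above should yield the inner HTML with the <em> tags intact. Both patterns rely on the question's guarantee that the title never contains a nested div; without that guarantee, an HTML parser would be the safer tool.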
