Extract Content from Div Tag C# RegEx
I need to extract the content inside the divtestimonial1 div. I am using the following regex, but it's only returning the first line:
Regex r = new Regex("<div([^<]*<(?!/div>))");
<div class="testimonial_content" id="divtestimonial1"> <a name="T1"></a> <div class="testimonial_headline">%testimonial1headline</div> <p align="left"><img src="" alt="" width="193" height="204" align="left" hspace="10" id="img_T1"/><span class="testimonial_text">%testimonial1text</span><br /> </p> </div>
Regular expressions are generally not a good choice for parsing HTML. You will likely be better off with a dedicated parser such as HTML Agility Pack, and that is what I would suggest you use.
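As a rough illustration of the parser route, here is a minimal sketch that assumes the HtmlAgilityPack NuGet package is referenced. The class and method names (DivParserSketch, ExtractInnerHtml) are made up for this example, and returning an empty string when the id is not found is my own choice.

```csharp
using HtmlAgilityPack; // third-party NuGet package, assumed to be installed

public static class DivParserSketch
{
    // Hypothetical helper: returns the inner HTML of the element with the
    // given id, or an empty string when no such element exists.
    public static string ExtractInnerHtml(string html, string id)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // builds a DOM, so nested divs are handled by the parser

        HtmlNode node = doc.GetElementbyId(id);
        return node == null ? string.Empty : node.InnerHtml;
    }
}
```

Calling DivParserSketch.ExtractInnerHtml(html, "divtestimonial1") should give you everything between the outer div's opening and closing tags, including the nested testimonial_headline div. Use node.OuterHtml instead if you also want the wrapping div itself.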
That being said, you can match your particular sample input using this regex:
<div.*?id="divtestimonial1".*?>.*</div>
But it might break in your real-world scenario. One of the troubles with using regex on HTML is correctly detecting nested tags: the greedy .* in this pattern simply runs to the last </div> in the input, so the match is only right when the target div is the last one to close.
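If you do go with the regex, here is a sketch of how the pattern above might be wired up in C#. The RegexOptions.Singleline flag is my addition: by default . does not match a newline, which is a likely reason for getting only the first line back when the markup spans several lines. The names DivRegexSketch and ExtractTestimonial are hypothetical.

```csharp
using System.Text.RegularExpressions;

public static class DivRegexSketch
{
    // Hypothetical helper: returns the matched <div ...>...</div> text,
    // or an empty string when the pattern does not match.
    public static string ExtractTestimonial(string html)
    {
        // Singleline makes '.' match newlines so the match can span lines.
        // The greedy ".*" still runs to the LAST "</div>" in the input, and the
        // lazy ".*?" can start at an earlier "<div", so this is only safe when
        // the target div is the first and outermost div in the input.
        var pattern = new Regex(
            "<div.*?id=\"divtestimonial1\".*?>.*</div>",
            RegexOptions.Singleline | RegexOptions.IgnoreCase);

        Match m = pattern.Match(html);
        return m.Success ? m.Value : string.Empty;
    }
}
```

For the sample markup in the question this returns the whole outer div. As soon as the page contains other divs before or after it, the result will include them too, which is the nesting problem described above.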