Extract Content from Div Tag C# RegEx

I need to extract the content inside the divtestimonial1 div. I am using the following regex, but it only returns the first line:

Regex r = new Regex("<div([^<]*<(?!/div>))");
<div class="testimonial_content" id="divtestimonial1">
          <a name="T1"></a>
          <div class="testimonial_headline">%testimonial1headline</div>
          <p align="left"><img src="" alt="" width="193" height="204" align="left" hspace="10" id="img_T1"/><span class="testimonial_text">%testimonial1text</span><br />
          </p>
  </div>

Regular expressions are generally not a good choice for parsing HTML. You would be better off with a tool such as HTML Agility Pack, which is what I would suggest.
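As a rough sketch of the HTML Agility Pack route, assuming the HtmlAgilityPack NuGet package is referenced: the class name `AgilityPackSketch` and the helper `ExtractInnerHtml` are made up for illustration, and the sample markup is a shortened copy of the one in the question.

```csharp
using System;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack NuGet package is installed

public static class AgilityPackSketch
{
    // Hypothetical helper: returns the inner HTML of the element with the
    // given id, or null when no such element exists.
    public static string ExtractInnerHtml(string html, string id)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // The parser builds a tree, so nested <div> elements are handled for us.
        HtmlNode node = doc.GetElementbyId(id);
        return node == null ? null : node.InnerHtml;
    }

    public static void Main()
    {
        // Shortened version of the markup from the question.
        string html =
            "<div class=\"testimonial_content\" id=\"divtestimonial1\">\n" +
            "  <a name=\"T1\"></a>\n" +
            "  <div class=\"testimonial_headline\">%testimonial1headline</div>\n" +
            "  <p align=\"left\"><span class=\"testimonial_text\">%testimonial1text</span><br /></p>\n" +
            "</div>";

        Console.WriteLine(ExtractInnerHtml(html, "divtestimonial1"));
    }
}
```

If you prefer XPath, `doc.DocumentNode.SelectSingleNode("//div[@id='divtestimonial1']")` should find the same node.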

That being said, you can match your particular sample input using this Regex (pass RegexOptions.Singleline so that . also matches the line breaks in the markup):

<div.*?id="divtestimonial1".*?>.*</div>
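A minimal sketch of how that pattern could be applied in C#. Two things here are additions rather than part of the pattern above: the capture group around `.*`, which returns only the inner content, and `RegexOptions.Singleline`, without which `.` stops at each newline and the multi-line sample does not match. The names `RegexDivSketch` and `ExtractInner` are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexDivSketch
{
    // Pattern from the answer, plus a capture group around the inner content.
    public const string Pattern = "<div.*?id=\"divtestimonial1\".*?>(.*)</div>";

    // Hypothetical helper: returns the captured inner content, or null on no match.
    public static string ExtractInner(string html)
    {
        // Singleline makes '.' match '\n', so the match can span several lines.
        Match m = Regex.Match(html, Pattern, RegexOptions.Singleline);
        return m.Success ? m.Groups[1].Value : null;
    }

    public static void Main()
    {
        // Shortened version of the markup from the question.
        string html =
            "<div class=\"testimonial_content\" id=\"divtestimonial1\">\n" +
            "  <div class=\"testimonial_headline\">%testimonial1headline</div>\n" +
            "  <p align=\"left\"><span class=\"testimonial_text\">%testimonial1text</span><br /></p>\n" +
            "</div>";

        Console.WriteLine(ExtractInner(html));

        // Without Singleline the same pattern finds nothing in this input.
        Console.WriteLine(Regex.IsMatch(html, Pattern));
    }
}
```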

But it might break in your real-world scenario. One of the troubles with regex and HTML is correctly detecting the nesting of tags: the greedy .* runs to the last </div> it can reach, which is not necessarily the one that closes divtestimonial1.
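To illustrate the nesting problem, here is a small sketch with a made-up input in which an unrelated div follows the target one. The class name `NestingPitfallSketch`, the helper `Capture`, and the sample string are all hypothetical. The greedy pattern captures too much, and switching to a lazy quantifier captures too little.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NestingPitfallSketch
{
    // Made-up input: the target div contains a nested div and is followed by a sibling div.
    public const string Html =
        "<div id=\"divtestimonial1\"><div>inner</div></div><div id=\"other\">unrelated</div>";

    // Hypothetical helper: returns capture group 1, or null when there is no match.
    public static string Capture(string pattern)
    {
        Match m = Regex.Match(Html, pattern, RegexOptions.Singleline);
        return m.Success ? m.Groups[1].Value : null;
    }

    public static void Main()
    {
        // Greedy: runs to the LAST </div>, swallowing the sibling div as well.
        Console.WriteLine(Capture("<div.*?id=\"divtestimonial1\".*?>(.*)</div>"));

        // Lazy: stops at the FIRST </div>, which closes the nested div, not the target.
        Console.WriteLine(Capture("<div.*?id=\"divtestimonial1\".*?>(.*?)</div>"));
    }
}
```

Neither variant returns exactly the content of divtestimonial1 here, which is the kind of case where an HTML parser is the safer choice.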
