Regex: multiline problem with HTML

Question

I'm playing around with websites and regular expressions in C#. I have this situation:

             <a href="path/to/image">
    <img src="thumbnail"></a>

That layout is how my application receives the content of a given website. The tabs and line breaks are not the same for each row.

I use gskinner's RegExr (http://gskinner.com/RegExr/) to check the regex, and I have created this regular expression:

            (?i)<a([^>]+)>\W.*</a>

Flags: Multiline

RegExr shows that the pattern matches. But when I use it in C# (regEx.Matches(...)), it no longer finds any matches.

Does anyone have any clue how to do this?

Thanks

Answer 1

Using HtmlAgilityPack and your sample string:

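// Requires the HtmlAgilityPack package and `using System.Linq;`.
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(html); // html is the string holding the markup from the question

// href of the first <a> element (null if there is no <a> or no href)
var href = doc.DocumentNode
    .Descendants("a")
    .Select(n => n.GetAttributeValue("href", null))
    .FirstOrDefault();

// src of the first <img> element (null if there is no <img> or no src)
var src = doc.DocumentNode
    .Descendants("img")
    .Select(n => n.GetAttributeValue("src", null))
    .FirstOrDefault();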
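
As a side note on why the original pattern can fail in C#: in .NET, RegexOptions.Multiline only changes how ^ and $ behave. It does not let . match a newline; that requires RegexOptions.Singleline. The following is a minimal sketch, not a replacement for the parser approach above. It assumes the downloaded HTML uses \r\n line endings (the question does not say), and the names AnchorRegexSketch and FindAnchorAttributes are hypothetical, chosen only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class AnchorRegexSketch
{
    // Singleline lets '.' match '\n'; the lazy '.*?' stops at the first </a>.
    private static readonly Regex Anchor = new Regex(
        @"<a\s([^>]+)>(.*?)</a>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Hypothetical helper: returns the attribute text of every <a ...>...</a> pair.
    public static List<string> FindAnchorAttributes(string html)
    {
        var result = new List<string>();
        foreach (Match m in Anchor.Matches(html))
        {
            result.Add(m.Groups[1].Value);
        }
        return result;
    }

    public static void Main()
    {
        // Sample from the question; the \r\n line ending is an assumption.
        string html = "         <a href=\"path/to/image\">\r\n    <img src=\"thumbnail\"></a>";

        // Pattern from the question: \W consumes '\r', then '.' cannot cross '\n'.
        var original = new Regex(@"(?i)<a([^>]+)>\W.*</a>", RegexOptions.Multiline);
        Console.WriteLine(original.Matches(html).Count); // prints 0

        foreach (string attrs in FindAnchorAttributes(html))
        {
            Console.WriteLine(attrs); // prints href="path/to/image"
        }
    }
}
```

If the input only ever used a single \n between the tags, the pattern from the question would match as written, so checking the actual line endings of the downloaded string is worth doing first. For anything beyond a quick experiment, the HtmlAgilityPack approach is less fragile than either regex.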

Answer 1: accepted, 2012-05-16 21:15:38
