C# Regex find multiple HTML tags in one pattern
I have an arbitrary message (I don't know its content in advance), but I know that it may contain HTML tags such as <b> and <a href=>, and that no other HTML tags will appear. So I am looking for a pattern that can recognize bold markup and hyperlinks and capture the content inside them. I have already written this code:
string pattern = "(<b>(.*)</b>)|(<a href=.*?>(.*?)<\\/a>)";
Match match = Regex.Match(content, pattern);
while (match.Success)
{
    if (match.Groups[0].Value.Contains("<b>"))
    {
        messageBlock.Dispatcher.Invoke(delegate
        {
            messageBlock.Inlines.Add(new Run(content.Substring(0, match.Index)));
            messageBlock.Inlines.Add(new Bold(new Run(match.Groups[1].Value)));
        });
    }
    else if (match.Groups[0].Value.Contains("<a href="))
    {
        // hyperlink branch (body not included in the post)
    }
    // advance to the next match; without this the loop never ends
    match = match.NextMatch();
}
However, with this pattern I can't retrieve the matched content for, e.g., <a href=?> ... It only works for the bold tag. Thank you.
For parsing HTML it is better to use Html Agility Pack.
Try @"(?s)<(?:(a)(?=\s)(?=(?:[^>""']|""[^""]*""|'[^']*')*?\shref\s*=(?:(['""])(.*?)\2))\s+(?:"".*?""|'.*?'|[^>]*?)+|b\s*)>(.*?)</(?(1)a|b)\s*>"
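Here, if group 1 matched, an a tag was found; otherwise a b tag was found. For an a tag, group 3 holds the quoted href value; in both cases group 4 holds the content between the opening and closing tags.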
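To make the group layout concrete, here is a minimal sketch of how the pattern from the answer could be used to walk a message and split it into plain, bold, and link segments. The class name TagScanner, the method Scan, and the sample message are hypothetical and not part of the original posts; the sketch also assumes the href value is quoted, as the pattern requires.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TagScanner
{
    // Pattern copied from the answer (verbatim string, so "" is a literal quote).
    private const string Pattern =
        @"(?s)<(?:(a)(?=\s)(?=(?:[^>""']|""[^""]*""|'[^']*')*?\shref\s*=(?:(['""])(.*?)\2))\s+(?:"".*?""|'.*?'|[^>]*?)+|b\s*)>(.*?)</(?(1)a|b)\s*>";

    // Hypothetical helper: returns (Kind, Text, Href) segments in document order.
    // Kind is "text", "bold" or "link"; Href is empty unless Kind is "link".
    public static List<(string Kind, string Text, string Href)> Scan(string content)
    {
        var segments = new List<(string Kind, string Text, string Href)>();
        int last = 0; // index just past the previous match

        foreach (Match m in Regex.Matches(content, Pattern))
        {
            // Plain text between the previous match and this one.
            if (m.Index > last)
                segments.Add(("text", content.Substring(last, m.Index - last), ""));

            if (m.Groups[1].Success)
                // Group 1 matched: an <a> tag. Group 3 = href, group 4 = inner content.
                segments.Add(("link", m.Groups[4].Value, m.Groups[3].Value));
            else
                // Group 1 did not match: a <b> tag. Group 4 = inner content.
                segments.Add(("bold", m.Groups[4].Value, ""));

            last = m.Index + m.Length;
        }

        // Trailing plain text after the last match.
        if (last < content.Length)
            segments.Add(("text", content.Substring(last), ""));

        return segments;
    }

    public static void Main()
    {
        string content = "Hello <b>world</b>, see <a href=\"http://example.com\">this link</a> now";
        foreach (var s in Scan(content))
            Console.WriteLine($"{s.Kind}: [{s.Text}] {s.Href}");
    }
}
```

In the WPF code from the question, each segment could then be turned into a Run, a Bold, or a Hyperlink inline. Tracking the end of the previous match (last) avoids re-adding the text from the start of the message on every iteration, which content.Substring(0, match.Index) would do.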