Matching last occurrence of character using Regex
I need to match:
<p><span style="font-size: 18px;"><strong>Hello</strong></span></p>
I need to match the text Hello between the last > and the first </
Using (?=>)(.*?)(?=</) returns <span style="font-size: 18px;"><strong>Hello
Thanks!
I know this is not the answer you were looking for, but parsing HTML with regex is like eating soup with a fork. You'll get the job done eventually, but it's very frustrating.
Try this instead and keep your sanity:
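// Requires: using System.Linq; (for LastOrDefault)
string html = "<p><span style=\"font-size: 18px;\"><strong>Hello</strong></span></p>";
System.Xml.Linq.XDocument doc = System.Xml.Linq.XDocument.Parse(html);
// The last descendant in document order is the innermost <strong> element.
string hello = doc.Descendants().LastOrDefault().Value;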
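One caveat with the snippet above: XDocument.Parse only accepts well-formed XML, and it throws an XmlException on typical loose HTML such as an unclosed <br>. A minimal sketch of a guarded version follows; the class name XmlTextExtractor and the helper LastElementText are hypothetical names chosen for illustration, and returning null on failure is just one possible design.

```csharp
using System;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

public static class XmlTextExtractor
{
    // Hypothetical helper: returns the text of the last element in document
    // order, or null if the markup is not well-formed XML or has no elements.
    public static string LastElementText(string markup)
    {
        try
        {
            XDocument doc = XDocument.Parse(markup);
            XElement last = doc.Descendants().LastOrDefault();
            return last?.Value;
        }
        catch (XmlException)
        {
            // Input such as "<p><br></p>" is valid HTML but not valid XML.
            return null;
        }
    }

    public static void Main()
    {
        string html = "<p><span style=\"font-size: 18px;\"><strong>Hello</strong></span></p>";
        Console.WriteLine(LastElementText(html)); // prints "Hello"
    }
}
```

For real-world HTML that is not well-formed XML, a dedicated HTML parser is probably the safer route than either this helper or a regex.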
You could go with
/>([^<>]+)</
That should give you the desired match.
Do you only need to match this specific string? If yes, then you could simply use:
/<strong>([^<]*)</strong>/
which will match any text between the strong tags.
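As a rough sketch of how the strong-tag pattern could be used from C#: the surrounding / delimiters are dropped, since .NET takes the pattern as a plain string, and the text is read from capture group 1. The class name StrongTagDemo and the helper FirstStrongText are hypothetical names for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StrongTagDemo
{
    // Returns the text inside the first <strong>...</strong> pair,
    // or null when the input contains no such pair.
    public static string FirstStrongText(string input)
    {
        Match m = Regex.Match(input, @"<strong>([^<]*)</strong>");
        return m.Success ? m.Groups[1].Value : null;
    }

    public static void Main()
    {
        string html = "<p><span style=\"font-size: 18px;\"><strong>Hello</strong></span></p>";
        Console.WriteLine(FirstStrongText(html)); // prints "Hello"
    }
}
```

Because [^<]* cannot cross a <, this only works when the strong element contains plain text and no nested tags.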
Try this.
The regex pattern constant is
const string HTML_TAG_PATTERN = "<.*?>";
The function:
static string StripHTML(string inputString)
{
    // Requires: using System.Text.RegularExpressions;
    return Regex.Replace(inputString, HTML_TAG_PATTERN, string.Empty);
}
and call the function like this:
string str = "<p><span style='font-size: 18px;'><strong>Hello</strong></span></p>";
str = StripHTML(str);
I think your first lookahead should instead be a lookbehind for >, like this: (?<=>)
And replace .*? with [^<>]* (anything but < or >).
If you need to keep your lookarounds, you can do: (?<=>)([^<>]*)(?=</)
If not, you can simply do: >([^<>]*)</
The difference is that with lookarounds, neither > nor </ is included in the overall match.
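To illustrate that difference, here is a small sketch comparing the two patterns on the string from the question. The class name LookaroundDemo and its method names are made up for this example; only Regex.Match, Match.Value and Match.Groups come from System.Text.RegularExpressions.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaroundDemo
{
    public const string Html =
        "<p><span style=\"font-size: 18px;\"><strong>Hello</strong></span></p>";

    // With lookarounds the delimiters are asserted but not consumed,
    // so the overall match (Match.Value) is only the inner text.
    public static string WithLookarounds(string input)
    {
        return Regex.Match(input, @"(?<=>)([^<>]*)(?=</)").Value;
    }

    // Without lookarounds the overall match includes ">" and "</";
    // the inner text is still available through capture group 1.
    public static Match WithoutLookarounds(string input)
    {
        return Regex.Match(input, @">([^<>]*)</");
    }

    public static void Main()
    {
        Console.WriteLine(WithLookarounds(Html));        // prints "Hello"
        Match m = WithoutLookarounds(Html);
        Console.WriteLine(m.Value);                      // prints ">Hello</"
        Console.WriteLine(m.Groups[1].Value);            // prints "Hello"
    }
}
```

Note that with the * quantifier, Regex.Matches would also report empty matches between consecutive closing tags (for example between </strong> and </span>); switching to [^<>]+ avoids that if only non-empty text is wanted.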