Regex to match expression followed by lower case character
I want to match a closing tag, followed by zero or more spaces/newlines, followed by an opening tag, but only when that opening tag is followed by a lowercase letter.
Examples:
text</p> <p>blah
matches </p> <p>
text</i><i>and more text <b>but not this</b>
matches </i><i>
text</i> <i>And more text
does not match
I tried this: </.*?>\s*\n*\s*<.*>(?=[a-z]), but it doesn't work for the second example, as it will match </i><i> and more text </b> even though the question mark should make it "lazy".
Making a quantifier lazy only makes the regex try the shortest possible match first, but if that doesn't work, it will gladly expand the match until the entire regex succeeds.
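To see that expansion in action, here is a minimal sketch using Python's re module. The input string is hypothetical, and the pattern is a variant of the attempt in which both tag bodies are lazy (in the attempt as posted, only the first `.*?` is lazy and the second `.*` is greedy). Even with everything lazy, the match grows past the first tag pair once the lookahead fails there.

```python
import re

# Variant of the attempted pattern with BOTH tag bodies lazy (an assumption
# for illustration; the question's second `.*` was greedy).
lazy = re.compile(r"</.*?>\s*<.*?>(?=[a-z])")

# Hypothetical input: the first opening tag is followed by an uppercase "A".
sample = "text</i> <i>And <b>bold</b>"

m = lazy.search(sample)
lazy_match = m.group(0) if m else None

# The lazy `.*?` first tries "<i>", fails the lookahead on "A", then keeps
# expanding until it reaches a ">" that is followed by a lowercase letter.
print(lazy_match)  # </i> <i>And <b>
```

The shortest candidate is tried first, but nothing stops `.` from consuming `>` and `<` when that candidate fails.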
You need to be more specific in what you allow to match - for example by not allowing angle brackets inside a tag:
</[^<>]*>\s*<[^/][^<>]*>(?=[a-z])
(Also, \s already contains \n, so \s*\n*\s* can be shortened to \s*.)
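As a quick check, the pattern above can be run against the three examples from the question. This is a sketch in Python's re module; the helper name first_match and the extra newline input are assumptions added for illustration.

```python
import re

# Negated character classes keep each tag body from crossing "<" or ">".
pattern = re.compile(r"</[^<>]*>\s*<[^/][^<>]*>(?=[a-z])")

def first_match(text):
    """Return the first matched substring, or None if nothing matches."""
    m = pattern.search(text)
    return m.group(0) if m else None

# The three examples from the question.
r1 = first_match("text</p> <p>blah")
r2 = first_match("text</i><i>and more text <b>but not this</b>")
r3 = first_match("text</i> <i>And more text")

# Hypothetical extra input: \s* alone also covers newlines between the tags.
r4 = first_match("text</p>\n  <p>blah")

print(r1)        # </p> <p>
print(r2)        # </i><i>
print(r3)        # None
print(repr(r4))  # '</p>\n  <p>'
```

Because a tag body can no longer contain angle brackets, a failed lookahead leaves the engine nothing to expand into, so the third example is rejected outright.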
Try:
</[^>]+>\s*<[^/>]+>(?=[a-z])
Change the '+' to '*' if you want to be able to match empty tags.
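The difference between the two quantifiers can be sketched in Python as follows. This assumes "empty tags" means tags with no name at all, such as </> and <>; the input strings and variable names are made up for the illustration.

```python
import re

# "+" requires at least one character inside each tag.
plus = re.compile(r"</[^>]+>\s*<[^/>]+>(?=[a-z])")
# "*" also accepts tags with nothing between the angle brackets.
star = re.compile(r"</[^>]*>\s*<[^/>]*>(?=[a-z])")

def first_match(pattern, text):
    """Return the first matched substring, or None if nothing matches."""
    m = pattern.search(text)
    return m.group(0) if m else None

normal = "text</i><i>and more text <b>but not this</b>"
empty = "a</><>b"  # hypothetical input with empty tags

plus_normal = first_match(plus, normal)
plus_empty = first_match(plus, empty)
star_empty = first_match(star, empty)

print(plus_normal)  # </i><i>
print(plus_empty)   # None
print(star_empty)   # </><>
```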