
Regex to match expression followed by lower case character

I want to match a closing tag, followed by zero or more spaces/newlines, followed by an opening tag, but only when that opening tag is followed by a lowercase letter. Examples:

  • "text</p> <p>blah" matches "</p> <p>"
  • "text</i><i>and more text <b>but not this</b>" matches "</i><i>"
  • "text</i> <i>And more text" does not match

I tried this: </.*?>\s*\n*\s*<.*>(?=[a-z]), but it doesn't work for the second example, as it will match </i><i> and more text </b> even though the question mark should make it "lazy".

Making a quantifier lazy only makes the regex try the shortest possible match first, but if that doesn't work, it will gladly expand the match until the entire regex succeeds.
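To see that expansion happen, here is a minimal Python sketch. The sample string and the all-lazy variant of the pattern are assumptions made up for illustration (they are not from the question); the text after the first opening tag starts with an uppercase letter, so ideally nothing should match.

```python
import re

# Hypothetical input: "<i>" is followed by an uppercase "A",
# so no match is wanted here.
sample = "text</i> <i>And more <b>text</b>"

# Variant of the attempted pattern with both tag bodies made lazy.
lazy = re.compile(r"</.*?>\s*<.*?>(?=[a-z])")

m = lazy.search(sample)
# The shortest candidate "</i> <i>" is rejected by the lookahead,
# so the second .*? keeps growing until it reaches a ">" that is
# followed by a lowercase letter.
print(m.group(0))  # </i> <i>And more <b>
```

So the lazy quantifier changes the order in which candidates are tried, not which characters are allowed inside the tag.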

You need to be more specific about what you allow to match, for example by not allowing angle brackets inside a tag:

</[^<>]*>\s*<[^/][^<>]*>(?=[a-z])

(Also, \s already contains \n, so \s*\n*\s* can be shortened to \s*.)
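As a quick check, this pattern can be run against the three examples from the question. The sketch below uses Python's re module; the helper name first_match is made up for this example.

```python
import re

# Tag bodies may not contain angle brackets, and the second tag
# may not start with "/" (so it has to be an opening tag).
pattern = re.compile(r"</[^<>]*>\s*<[^/][^<>]*>(?=[a-z])")

def first_match(text):
    """Return the first matched substring, or None if nothing matches."""
    m = pattern.search(text)
    return m.group(0) if m else None

examples = [
    "text</p> <p>blah",
    "text</i><i>and more text <b>but not this</b>",
    "text</i> <i>And more text",
]

for s in examples:
    print(repr(s), "->", repr(first_match(s)))
```

The first two inputs give "</p> <p>" and "</i><i>", and the third gives None, because the negated character class cannot run past the ">" that closes the tag.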

Try:

</[^>]+>\s*<[^/>]+>(?=[a-z])

Change the '+' to '*' if you want to be able to match empty tags.
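A small Python sketch of the difference between the two quantifiers. The input with empty tags is a made-up example, and the names plus and star are only for illustration.

```python
import re

plus = re.compile(r"</[^>]+>\s*<[^/>]+>(?=[a-z])")  # at least one character per tag
star = re.compile(r"</[^>]*>\s*<[^/>]*>(?=[a-z])")  # empty tags allowed

normal = "text</p> <p>blah"
empty = "text</><>blah"  # hypothetical input containing empty tags

normal_hit = plus.search(normal)
empty_plus = plus.search(empty)
empty_star = star.search(empty)

print(normal_hit.group(0))  # </p> <p>
print(empty_plus)           # None
print(empty_star.group(0))  # </><>
```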
