
Better way to write this Regex? Negative Lookahead

I think I've got this working for the most part, but was wondering if there is a better way to write it:

/\b(Word)(?!.*?<\/a>)(?!.*?>)\b/

I'm trying to match Word when it is NOT linked and NOT part of an HTML tag (for example, the Word in <a href="" title="Word"> should not match).

From what I understand, it's better to use negated character classes where possible rather than lazy quantifiers. I tried doing that but couldn't figure it out. I don't even know if it's possible here, but I thought I'd throw it out there.

The negated character class you are looking for is [^<>]*. It matches up to the next tag boundary and never crosses one.

 /\b(Word) (?! [^<>]*<\/a> | [^<]*>) \b/x
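To see how the pattern behaves, here is a minimal sketch that ports it to Python, whose re.VERBOSE flag plays the role of /x. The names UNLINKED_WORD and find_unlinked are made up for this illustration, and "Word" is just the placeholder term from the question.

```python
import re

# Port of the /x pattern above. "Word" is the placeholder search term;
# UNLINKED_WORD and find_unlinked are hypothetical names for this sketch.
UNLINKED_WORD = re.compile(
    r"""
    \b(Word)
    (?!
        [^<>]*</a>   # the next tag is </a>, so we are inside link text
      | [^<]*>       # a '>' appears before any '<', so we are inside a tag
    )
    \b
    """,
    re.VERBOSE,
)


def find_unlinked(html):
    """Return the start offsets of every match of the pattern in html."""
    return [m.start() for m in UNLINKED_WORD.finditer(html)]


if __name__ == "__main__":
    print(find_unlinked("A Word here"))                       # plain text
    print(find_unlinked('<a href="x">Word</a>'))              # link text
    print(find_unlinked('<a href="" title="Word">link</a>'))  # attribute value
```

Plain text is matched, while the link text and the attribute value are both rejected by the lookahead.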

Note that looking only for </a> will still let the regex match when the link has further markup in it; for example, the Word in a bolded <a>..<b>Word</b>..</a> would not be skipped. (Checking for such things requires far more effort than a lookahead.)
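For a sense of what that extra effort might look like, one option (not part of the regex above, and only a sketch) is to walk the markup with a parser and track whether the current text sits inside a link. The version below assumes Python's standard html.parser module; UnlinkedWordFinder and count_unlinked are hypothetical names.

```python
import re
from html.parser import HTMLParser


class UnlinkedWordFinder(HTMLParser):
    """Collect occurrences of a word found in text that is outside any <a>."""

    def __init__(self, word):
        super().__init__()
        self.pattern = re.compile(r"\b" + re.escape(word) + r"\b")
        self.link_depth = 0  # how many <a> elements are currently open
        self.hits = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.link_depth += 1

    def handle_endtag(self, tag):
        if tag == "a" and self.link_depth > 0:
            self.link_depth -= 1

    def handle_data(self, data):
        # Attribute values never reach handle_data, so title="Word" is ignored.
        if self.link_depth == 0:
            self.hits.extend(self.pattern.findall(data))


def count_unlinked(html, word="Word"):
    finder = UnlinkedWordFinder(word)
    finder.feed(html)
    finder.close()  # flush any text still buffered by the parser
    return len(finder.hits)
```

With this approach, count_unlinked('<a href="x"><b>Word</b></a>') reports no unlinked occurrence, because the depth counter stays above zero until the closing </a> is seen, however much markup is nested inside the link.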


 