A better way to write this regular expression? Negative lookahead

Question

I think I've got this working for the most part, but was wondering if there is a better way to write it:

/\b(Word)(?!.*?<\/a>)(?!.*?>)\b/

I'm trying to match Word when it's NOT linked and NOT part of an HTML tag (for example, <a href="" title="Word"> should not match).

From what I understand, it's better to use negated character classes where possible rather than lazy quantifiers. I tried doing that but couldn't figure it out. I don't even know if it's possible here, but I thought I'd throw it out there.

Answer 1

The negated character class you are looking for is [^<>]* . It cannot cross a tag boundary, so the lookahead only inspects the text up to the next tag.

 /\b(Word) (?! [^<>]*<\/a> | [^<]*>) \b/x
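To see what the negated class changes in practice, here is a minimal sketch in Python (an assumption; the question does not name a language, and re.VERBOSE stands in for the /x flag). The names LAZY, NEGATED, mark and the sample string are made up for illustration. It brackets every match so the two patterns can be compared side by side.

```python
import re

# The question's pattern: the lazy `.*?` is free to run across tag boundaries,
# so ANY later </a> or > on the same line blocks the match.
LAZY = re.compile(r"\b(Word)(?!.*?</a>)(?!.*?>)\b")

# The answer's pattern, spelled out with re.VERBOSE (Python's analogue of /x).
NEGATED = re.compile(
    r"""
    \b(Word)
    (?!
        [^<>]*</a>   # plain text and then </a>: we are inside a link
      | [^<]*>       # a '>' before any '<': we are inside a tag
    )
    \b
    """,
    re.VERBOSE,
)


def mark(pattern, text):
    """Wrap every match in square brackets so the hits are easy to spot."""
    return pattern.sub(r"[\1]", text)


sample = 'Word one, <a href="#" title="Word">Word</a>, and Word again.'

print(mark(LAZY, sample))
print(mark(NEGATED, sample))
```

With this sample, the lazy version only brackets the last Word, because the first one is followed (much later) by a </a>. The negated-class version brackets the first and last Word and leaves both the title attribute and the linked Word alone.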

Note that looking only for </a> means the regex will still match when the link has further markup in it; for example, the Word in a bolded <a>..<b>Word</b>..</a> would not be skipped. (Checking for such things requires far more effort than a lookahead.)
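The nested-markup caveat is easy to reproduce, and one possible way around it is to stop using a lookahead and walk the markup instead. The sketch below is not part of the answer; it assumes Python and its standard html.parser module, and the names UnlinkedWordFinder and count_unlinked are hypothetical. It only counts occurrences; rewriting the document would take more work.

```python
import re
from html.parser import HTMLParser

# The answer's pattern on one line, without the /x whitespace.
NEGATED = re.compile(r"\b(Word)(?![^<>]*</a>|[^<]*>)\b")

nested = '<a href="#">a <b>Word</b> here</a>'
# The lookahead runs into "</b>" rather than "</a>", so the linked Word still matches.
print(NEGATED.findall(nested))  # -> ['Word']


class UnlinkedWordFinder(HTMLParser):
    """Collects occurrences of a word in text nodes that are outside any <a> element."""

    def __init__(self, word):
        super().__init__()
        self.word_re = re.compile(r"\b%s\b" % re.escape(word))
        self.link_depth = 0  # number of <a> elements currently open
        self.hits = []       # matches found outside links

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.link_depth += 1

    def handle_endtag(self, tag):
        if tag == "a" and self.link_depth > 0:
            self.link_depth -= 1

    def handle_data(self, data):
        # Attribute values never reach handle_data, so title="Word" is ignored.
        if self.link_depth == 0:
            self.hits.extend(self.word_re.findall(data))


def count_unlinked(html, word="Word"):
    """Return how many times `word` appears outside links and outside tags."""
    finder = UnlinkedWordFinder(word)
    finder.feed(html)
    finder.close()
    return len(finder.hits)


print(count_unlinked(nested))
print(count_unlinked('Word and <a title="Word" href="#"><b>Word</b></a> Word'))
```

Tracking the open-link depth is what lets this handle any amount of markup inside the link, at the cost of parsing the whole document rather than running a single regex.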

Answer 1: accepted, 2011-10-01 15:26:34
