Better way to write this Regex? Negative Lookahead

Question

I think I've got this working for the most part, but was wondering if there is a better way to write it:

/\b(Word)(?!.*?<\/a>)(?!.*?>)\b/

I'm trying to match Word when it's NOT linked, and it's NOT part of HTML tags (like <a href="" title="Word"> should not match).

From what I understand, it's better to use negated character classes if possible rather than making it lazy. I tried doing that but couldn't figure it out. I don't even know if it's possible with this, but I thought I'd throw it out there.

Answer 1

The negated character class you are looking for is [^<>]*. It matches any run of characters up to the next tag boundary, so the lookahead never scans past a < or >.

 /\b(Word) (?! [^<>]*<\/a> | [^<]*>) \b/x

Note that checking only for </a> means the regex will still match when the link has further markup in it; for example, in <a>..<b>Word</b>..</a> the bolded Word would not be skipped. (Checking for such things requires far more effort than a lookahead.)
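
To see how the pattern behaves, here is a minimal sketch in Python, assuming the re module's verbose mode as a stand-in for the /x flag. The names PATTERN and is_unlinked are hypothetical, made up for this illustration, and the sample strings are invented test inputs.

```python
import re

# Same pattern as above; re.VERBOSE plays the role of the /x flag.
# PATTERN and is_unlinked are hypothetical names for this sketch.
PATTERN = re.compile(
    r"""
    \b(Word)          # the literal word, starting on a word boundary
    (?!               # reject the match if what follows is either:
        [^<>]*</a>    #   text with no tag boundary, then a closing </a>
      | [^<]*>        #   text with no '<', then '>' (we are inside a tag)
    )
    \b                # the word must also end on a word boundary
    """,
    re.VERBOSE,
)


def is_unlinked(html: str) -> bool:
    """Return True if the pattern finds a Word it considers unlinked."""
    return PATTERN.search(html) is not None


samples = [
    "A Word here",                       # plain text
    '<a href="#">Word</a>',              # directly linked
    '<a href="" title="Word">text</a>',  # inside a tag attribute
    '<a href="#"><b>Word</b></a>',       # linked, but wrapped in <b>
]

for sample in samples:
    print(is_unlinked(sample), sample)
```

The last sample is the caveat mentioned above: the lookahead only sees </b> before the next tag boundary, so the linked Word is still reported as a match.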

solution1
1 ACCEPTED 2011-10-01 15:26:34