I think I've got this working for the most part, but was wondering if there is a better way to write it:
/\b(Word)(?!.*?<\/a>)(?!.*?>)\b/
I'm trying to match Word when it's NOT linked and it's NOT part of an HTML tag (for example, <a href="" title="Word"> should not match).
From what I understand, it's better to use negated character classes if possible rather than making it lazy. I tried doing that but couldn't figure it out. I don't even know if it's possible with this, but I thought I'd throw it out there.
The negated character class you are looking for is [^<>]*. It matches only characters that are not tag delimiters, so it never crosses a tag boundary.
/\b(Word) (?! [^<>]*<\/a> | [^<]*>) \b/x
Note that looking for </a> this way will still allow the regex to match if the link has further markup in it; for example, the bolded word in <a>..<b>Word</b>..</a> would not be skipped. (Checking for such things requires far more effort than a lookahead.)
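For anyone who wants to try this out, here is a minimal sketch of the same pattern in Python, assuming the `re` module's verbose mode as a stand-in for the `/x` flag above. The helper name `count_unlinked` and the sample strings are made up for illustration; the last case reproduces the nested-markup limitation just described.

```python
import re

# Same pattern as above, written with re.VERBOSE so it can be commented.
# "Word" is a placeholder for whatever term is being searched for.
PATTERN = re.compile(
    r"""
    \b(Word)
    (?!
        [^<>]*</a>   # plain text only, then the closing anchor tag: directly inside a link
      | [^<]*>       # a closing angle bracket comes before any opening one: inside a tag
    )
    \b
    """,
    re.VERBOSE,
)


def count_unlinked(html):
    """Hypothetical helper: count occurrences of Word the pattern accepts."""
    return sum(1 for _ in PATTERN.finditer(html))


if __name__ == "__main__":
    samples = [
        'A Word in plain text',                # accepted
        '<a href="#">Word</a>',                # rejected: directly inside a link
        '<a href="" title="Word">x</a>',       # rejected: inside a tag attribute
        '<a href="#"><b>Word</b></a>',         # accepted: the known limitation
    ]
    for s in samples:
        print(count_unlinked(s), s)
```

If nested markup inside links matters, a real HTML parser is probably a safer tool than a lookahead.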