Better way to write this Regex? Negative Lookahead

I think I've got this working for the most part, but was wondering if there is a better way to write it:

/\b(Word)(?!.*?<\/a>)(?!.*?>)\b/

I'm trying to match Word when it's NOT linked, and it's NOT part of HTML tags (like <a href="" title="Word"> should not match).

From what I understand, it's better to use negated character classes if possible rather than making it lazy. I tried doing that but couldn't figure it out. I don't even know if it's possible with this, but I thought I'd throw it out there.

The negated character class you are looking for is [^<>]*. It matches only text that contains no angle brackets, so it can never run across a tag boundary.

 /\b(Word) (?! [^<>]*<\/a> | [^<]*>) \b/x

Note that checking only for </a> means the regex still matches when the link has further markup inside it: for example, the bolded word in <a>..<b>Word</b>..</a> would not be skipped. (Checking for such things requires far more effort than a lookahead.)
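To see how the two alternatives in the lookahead behave, here is a minimal sketch in Python. It assumes the /x pattern translates directly to re.VERBOSE; the names UNLINKED_WORD and find_unlinked are made up for illustration and are not part of the original pattern.

```python
import re

# Assumed Python equivalent of the /x pattern above (re.VERBOSE ignores
# whitespace and allows comments inside the pattern).
UNLINKED_WORD = re.compile(
    r"""
    \b(Word)
    (?!
        [^<>]*</a>   # plain text, then a closing </a>: we are inside a link
      | [^<]*>       # a '>' reached before any '<': we are inside a tag
    )
    \b
    """,
    re.VERBOSE,
)


def find_unlinked(html):
    """Return the start offsets of every 'Word' the pattern accepts."""
    return [m.start() for m in UNLINKED_WORD.finditer(html)]


samples = [
    "A Word here",                      # plain text: matched
    '<a href="#">Word</a>',             # linked: skipped
    '<a href="" title="Word">x</a>',    # inside an attribute: skipped
    '<a href="#"><b>Word</b></a>',      # nested markup: still matched
]
for s in samples:
    print(repr(s), find_unlinked(s))
```

The last sample shows the limitation described above: because [^<>]* stops at the </b>, the lookahead never reaches the </a>, so the bolded word inside the link is still reported.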
