I need a regex to match tokens for a syntax highlighter, which should match full words when surrounded by non-alphanumeric characters or string boundaries. The regex I initially came up with is:
(?<=[^\w]|^)TOKEN(?=[^\w]|$)
Where TOKEN is the token I'm searching for. This works in regex testers, but C++'s std::regex doesn't support lookbehinds. Omitting the lookbehind causes the regex to match the character before the token as well, which causes issues. I'm aware boost::regex supports lookbehinds, but I'd like to stick with std::regex if possible.
My question is: can I change my regex to exclude the character before the token from the match?
The pattern is missing a closing ] at the end, and \w also matches \d.
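You might use an alternation that asserts either the start of the string or a position where \b does not match, followed by a negative lookahead asserting that there is no word character to the right. Note that \B only fits a token that starts with a non-word character; before a token that starts with a word character, \B requires a preceding word character, so \b is the right assertion there.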
(?:^|\B)TOKEN(?!\w)
After the update of the question, you can write (?<=[^\w]|^)TOKEN(?=[^\w]|$) as
(?<=\W|^)TOKEN(?=\W|$)
or, in short, without the lookbehind (assuming TOKEN starts with a word character):
\bTOKEN(?!\w)
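As a rough sketch of how this could look with std::regex (whose default ECMAScript grammar supports \b and lookaheads, just not lookbehinds): the helper name find_token below is hypothetical, and it assumes the token consists only of word characters, so it is not escaped before being placed in the pattern.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the start offset of every whole-word
// occurrence of `token` in `text`.
// Assumption: `token` contains only word characters ([A-Za-z0-9_]),
// so it needs no regex escaping and \b is valid in front of it.
std::vector<std::size_t> find_token(const std::string& text,
                                    const std::string& token) {
    // \b asserts a word boundary without consuming the preceding character;
    // (?!\w) asserts that no word character follows the token.
    const std::regex re("\\b" + token + "(?!\\w)");

    std::vector<std::size_t> positions;
    for (std::sregex_iterator it(text.begin(), text.end(), re), end;
         it != end; ++it) {
        positions.push_back(static_cast<std::size_t>(it->position()));
    }
    return positions;
}
```

Because both \b and the lookahead are zero-width, each match covers only the token itself, so the reported position is where the token starts and not the character before it. Tokens containing regex metacharacters would need escaping first, which this sketch leaves out.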