Right now I am using [^ \\&<>|\t\n]+
which matches any run of characters that are not a space, \, &, <, >, |, tab, or newline. I also want to allow any of these special characters to be escaped, so that a string containing (for example) \< or \& is still matched in its entirety.
Should match:
abcdefghijk abcdef\&hdehud\<jdow\\
Should not match:
abcdefhfh&kdjeid abcdjedje\idwjdj
I found this pattern ([^\[]|(?<=\\)\[)+
which does the same thing for just the "[" character. I couldn't figure out how to extend it to cover additional characters.
Any idea how I can make the exception for characters preceded by a backslash?
If it makes any difference, I'm using this in Flex and C++ to tokenize a string for a shell. I believe I need to use negative look-behinds but I don't know how to do that with multiple characters.
You are already most of the way to the answer.
You are using the negated set [^ \\&<>|\t\n]
to specify which characters may not be present, so all you have to do is use the same set without the negation, preceded by a \ (written \\ in the pattern), to match an escaped character. That gets you \\[ \\&<>|\t\n]
which can be read as "a backslash followed by any one of the items in the set". Now combine the two and you get ([^ \\&<>|\t\n]|\\[ \\&<>|\t\n])+
To break it down:
One or more of: [^ \\&<>|\t\n] (any character that is not special)
or \\[ \\&<>|\t\n] (a backslash followed by one special character)
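A quick way to sanity-check the combined pattern outside Flex is to run it against the examples from the question. The sketch below uses std::regex purely for testing; it is not how Flex compiles the rule, and the helper name is_valid_token is made up for illustration. It assumes the pattern should match the whole token.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true when the whole of `token` consists of ordinary
// characters and backslash-escaped special characters.
bool is_valid_token(const std::string& token) {
    // Raw string literal, so the regex engine sees the pattern exactly as written.
    static const std::regex pattern(R"(([^ \\&<>|\t\n]|\\[ \\&<>|\t\n])+)");
    // regex_match requires the entire string to match, not just a prefix.
    return std::regex_match(token, pattern);
}
```

With this helper, abcdefghijk and abcdef\&hdehud\<jdow\\ are accepted, while abcdefhfh&kdjeid and abcdjedje\idwjdj are rejected.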
As usual, using a regular expression here is overkill. This is a simple text search:
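const std::string target = "\\&<>|";
// find_first_of returns an index (std::string::npos when nothing is found), not an iterator
std::string::size_type pos = str.find_first_of(target);
while (pos != std::string::npos) {
    if (str[pos] != '\\')
        found_bad_character(str[pos]);  // unescaped special character
    else
        ++pos;  // skip the character that the backslash escapes
    pos = str.find_first_of(target, pos + 1);
}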
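The search above leaves str and found_bad_character undefined. A self-contained variant might look like the following sketch. The function name all_specials_escaped is hypothetical, and the special set is assumed to be the one from the question (space, backslash, &, <, >, |, tab, newline). Unlike the loop above, it also rejects a backslash that is followed by an ordinary character or by nothing at all.

```cpp
#include <string>

// Hypothetical helper: true when every special character in `str` is escaped
// and every backslash escapes a special character.
bool all_specials_escaped(const std::string& str) {
    const std::string special = " \\&<>|\t\n";
    for (std::string::size_type pos = str.find_first_of(special);
         pos != std::string::npos;
         pos = str.find_first_of(special, pos + 2)) {  // resume after the escaped pair
        if (str[pos] != '\\')
            return false;  // unescaped special character
        if (pos + 1 >= str.size())
            return false;  // trailing lone backslash
        if (special.find(str[pos + 1]) == std::string::npos)
            return false;  // backslash in front of an ordinary character
    }
    return true;
}
```

For the examples in the question this gives the same accept/reject decisions as the regular expression.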