
How to detect an invalid C escaped string using a regular expression?

I would like to find a regular expression (regex) that detects invalid escapes in a C double-quoted escaped string (one in which double quotes may appear only in escaped form).

I consider the following escapes valid: \\ \n \r \" (the test string uses ").

A partial solution is (?<!\\)\\[^\"\\nr] but it fails to detect bad escapes like \\\ , because the lookbehind rejects any backslash that is itself preceded by a backslash.

Here is the test string I use to check the matching:

...\n...\\b...\"...\\\\...\\\E...\...\\\...\\\\\..."...\E...

The expression should flag the last 6 blocks as invalid; the first 4 are valid. The problem is that my current version finds only 2/5 errors.
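To make the shortfall concrete, here is a minimal Python sketch that runs a lookbehind pattern over the test string. It assumes the "current version" is the pattern (?<!\\)\\[^\"\\nr] and that the test string contains single (not doubled) backslashes; the names `TEST`, `LOOKBEHIND`, and `lookbehind_hits` are made up for illustration.

```python
import re

# Test string from the question, as a raw string so every backslash is literal.
TEST = r'...\n...\\b...\"...\\\\...\\\E...\...\\\...\\\\\..."...\E...'

# Lookbehind attempt: a backslash that is not preceded by a backslash,
# followed by a character other than " \ n r.
LOOKBEHIND = re.compile(r'(?<!\\)\\[^\"\\nr]')

def lookbehind_hits(text):
    """Return (offset, matched text) for every hit of the lookbehind pattern."""
    return [(m.start(), m.group()) for m in LOOKBEHIND.finditer(text)]

for offset, bad in lookbehind_hits(TEST):
    print(offset, bad)
```

Under those assumptions only the lone `\` block and the final `\E` block are reported. Any bad escape that follows one or more escaped backslashes is skipped, and so is the bare `"`.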

(?:^|[^\\])(?:\\\\)*((?:\"|\\(?:[^\"\\nr]|$)))

That's the start of the string, or something that's not a backslash. Then some (possibly zero) properly escaped backslashes, then either an unescaped " or another backslash; if it's another backslash, it must be followed by something that is neither " , \ , n , nor r , or by the end of the string.

The incorrect escape is captured for you as well (in group 1).
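As a rough check, the pattern can be run over the question's test string with Python's `re` module. This is only a sketch: it assumes the test string contains single backslashes, and `TEST`, `INVALID`, and `invalid_escapes` are hypothetical names rather than anything from the answer.

```python
import re

# Test string from the question, as a raw string so every backslash is literal.
TEST = r'...\n...\\b...\"...\\\\...\\\E...\...\\\...\\\\\..."...\E...'

# Start of string or a non-backslash, then any number of escaped backslashes,
# then (captured) either a bare " or a backslash followed by an invalid
# character or by the end of the string.
INVALID = re.compile(r'(?:^|[^\\])(?:\\\\)*((?:\"|\\(?:[^\"\\nr]|$)))')

def invalid_escapes(text):
    """Return (offset, captured text) for each invalid escape found."""
    return [(m.start(1), m.group(1)) for m in INVALID.finditer(text)]

for offset, bad in invalid_escapes(TEST):
    print(offset, repr(bad))
```

With this input the six invalid blocks are each reported once and the four valid blocks are not. Group 1 holds the offending backslash plus the character after it, or the bare `"`.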

Try this regular expression:

^(?:[^\\]+|\\[\\rn"])*(\\(?:[^\\rn"]|$))

If you have a match, you have an invalid escape sequence.
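A small Python sketch of how this anchored pattern might be used as a yes/no validity check. The helper name `first_invalid_escape` and the sample inputs are assumptions for illustration only.

```python
import re

# Anchored at the start: consume runs of non-backslash characters and valid
# escapes (\\ \r \n \"), then capture the first backslash that is followed
# by anything else or by the end of the string.
FIRST_INVALID = re.compile(r'^(?:[^\\]+|\\[\\rn"])*(\\(?:[^\\rn"]|$))')

def first_invalid_escape(text):
    """Return (offset, captured text) of the first bad escape, or None."""
    m = FIRST_INVALID.search(text)
    return (m.start(1), m.group(1)) if m else None

TEST = r'...\n...\\b...\"...\\\\...\\\E...\...\\\...\\\\\..."...\E...'
print(first_invalid_escape(TEST))             # first bad escape in the test string
print(first_invalid_escape(r'ok\n\\done\"'))  # only valid escapes
```

Because the pattern is anchored with ^, it reports at most one (the first) invalid escape per string. As written it also does not flag a bare " that appears without a backslash, so that case would need a separate check.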
