
Python: Regex to match multiline strings in C code

I'm attempting to match multiline strings in C code using Python's re module.

I'd like to match strings of the form:

char * theString = "Some string \
                   I want to match.";

I tried the following regex, which does not work:

regex = re.compile(r"\".*\"$", re.MULTILINE)

I thought that it would match the first ", then continue searching the next line until it found a closing ", but this is not the case. Is this because $ requires that there be a " at the end of the line to match? Is there some way to do this using regex?

Use the dot-all flag (re.DOTALL in Python), which lets . match newline characters as well.
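As a rough illustration (the variable names are made up for this demo), the snippet below shows what the flag changes and why the original pattern still fails even with it: $ demands that the closing quote be the last character on its line, but in the sample it is followed by a semicolon.

```python
import re

# Sample C source from the question, kept in a raw string so the
# backslash-newline line continuation is preserved literally.
c_source = r'''char * theString = "Some string \
                   I want to match.";
'''

# Original attempt: "." stops at newlines, so the match cannot cross lines.
original = re.compile(r"\".*\"$", re.MULTILINE)

# Adding DOTALL is not enough while "$" is present: the closing quote is
# followed by ";", not by the end of a line.
with_dotall = re.compile(r"\".*\"$", re.MULTILINE | re.DOTALL)

# Dropping "$" and keeping DOTALL matches the two-line literal.
no_anchor = re.compile(r'".*"', re.DOTALL)

print(original.search(c_source))     # None
print(with_dotall.search(c_source))  # None
print(no_anchor.search(c_source).group(0))
```

Note that .* is greedy, so in a file with several string literals this simple pattern would run from the first quote to the last one, which is why a more precise pattern is preferable.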

However, the proper way to parse C strings is with a pattern that handles escape sequences explicitly:

(?s)"[^"\\]*(?:\\.[^"\\]*)*"

If the (?s) inline modifier isn't supported, set the modifier in the flags parameter instead:

re.compile(r'"[^"\\]*(?:\\.[^"\\]*)*"', re.DOTALL)
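A minimal sketch of how this compiled pattern might be used. The name C_STRING and the second declaration in the sample (a literal containing escaped quotes) are additions for illustration, not part of the question.

```python
import re

# Pattern from the answer; DOTALL lets "\\." consume a backslash-newline pair.
C_STRING = re.compile(r'"[^"\\]*(?:\\.[^"\\]*)*"', re.DOTALL)

# First declaration is the question's sample; the second one is a
# hypothetical extra line that exercises escaped quotes.
c_source = r'''char * theString = "Some string \
                   I want to match.";
char * other = "He said \"hi\"";
'''

# The pattern has no capturing groups, so findall returns whole matches.
matches = C_STRING.findall(c_source)
for literal in matches:
    print(repr(literal))
```

Each literal is matched separately, and the escaped quotes inside the second one do not end the match early, because every backslash is consumed together with the character that follows it.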

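The same pattern, expanded:

 (?s)                          # Dot also matches newlines
 "                             # Opening double quote
 [^"\\]*                       # Run of characters that are neither a quote nor a backslash
 (?: \\ . [^"\\]* )*           # Escape sequence (backslash plus any character), then another plain run; repeated
 "                             # Closing double quote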

Ideally, you should also add (raw regex) (?<!\\)(?:\\\\)* at the beginning, to make sure the opening double quote is not escaped.
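The effect of that prefix can be sketched as follows. The names BASIC and GUARDED, the capturing group around the literal, and the sample line are assumptions added for this demo. The capturing group is there so that any backslash pairs consumed by the prefix are left out of the result.

```python
import re

# Pattern without the guard.
BASIC = re.compile(r'"[^"\\]*(?:\\.[^"\\]*)*"', re.DOTALL)

# Guarded variant: the opening quote must be preceded by an even number of
# backslashes (possibly zero), so it cannot itself be escaped. The string
# literal is wrapped in a capturing group (an addition for this demo).
GUARDED = re.compile(r'(?<!\\)(?:\\\\)*("[^"\\]*(?:\\.[^"\\]*)*")', re.DOTALL)

# Hypothetical sample: an escaped quote inside a char literal, then a string.
c_source = r'''char quote = '\"'; char * s = "abc";'''

print(BASIC.findall(c_source))    # starts at the escaped quote: wrong span
print(GUARDED.findall(c_source))  # only the real string literal
```

This is still a regex heuristic, not a C lexer: an unescaped quote in a character literal such as '"', or quotes inside comments, would need extra handling.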
