
How to create a regular expression to match function definitions

I need to find function definitions like

function (param1, param2, param3)

I am using the following regular expression in Python:

\S+\\((\S+|\s+|,)\\)

so that something like

re.findall("\S+\\((\S+|\s+|,)\\)",source_code_string)

should give me all the function names, but it's not working. Please suggest improvements to the above regular expression. I am new to regular expressions.

Your regex is fundamentally wrong

\S+\\((\S+|\s+|,)\\)

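means: match at least one non-whitespace character, an opening bracket, then either a run of non-whitespace OR a run of whitespace OR a single comma, and then the closing bracket. Because the group is not repeated, only one of those alternatives can appear between the brackets, so any parameter list containing both a comma and a space cannot match.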
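To make that concrete, here is a small sketch that runs the asker's pattern against a few inputs. The sample strings are made up for illustration. Note that `re.findall` returns the contents of the capturing group, not the function name, when the pattern contains exactly one group.

```python
import re

# The asker's pattern, written as a raw string
# (the same regex as the non-raw "\S+\\((\S+|\s+|,)\\)").
ORIGINAL = r'\S+\((\S+|\s+|,)\)'

# Hypothetical sample inputs, not taken from the question's source code.
one_arg = re.findall(ORIGINAL, 'foo(bar)')
two_args = re.findall(ORIGINAL, 'foo(a, b)')
spaced = re.findall(ORIGINAL, 'function (param1, param2, param3)')

print(one_arg)   # ['bar']  (the group content, not the name 'foo')
print(two_args)  # []       (comma plus space needs more than one alternative)
print(spaced)    # []       (\S+ cannot bridge the space before the bracket)
```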

I think what you meant was this (use raw strings (r'') and escape only once)

(\S+)\s*\(\s*\S+\s*(?:,\s*\S+)*\)

See it here on Regexr

You can then find the name of your function in capturing group 1 (because of the brackets around the first \S+).

Each \s* matches optional whitespace.

BUT this regex is so simple that I am sure it will not find all functions (it will fail on nested brackets), and it will also find other stuff.
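A quick way to see both the intended behaviour and the caveats is to run the pattern over a handful of strings. This is only a sketch; the sample inputs and the name CALL_LIKE are my own, chosen for illustration.

```python
import re

# The suggested pattern, as a raw string so each backslash is escaped only once.
CALL_LIKE = re.compile(r'(\S+)\s*\(\s*\S+\s*(?:,\s*\S+)*\)')

# Hypothetical inputs chosen to show one hit and three caveats.
intended = CALL_LIKE.findall('function (param1, param2, param3)')
call_site = CALL_LIKE.findall('x = foo(a, b)')
nested = CALL_LIKE.findall('print(foo(1, 2))')
no_params = CALL_LIKE.findall('foo()')

print(intended)   # ['function']
print(call_site)  # ['foo']        (calls match too, not only definitions)
print(nested)     # ['print(foo']  (nested brackets confuse group 1)
print(no_params)  # []             (at least one parameter is required)
```

Because the single capturing group is the name, `findall` returns just the names here.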

The answer is going to depend on what language the source files are written in. Recall that in Python, function definitions are prefixed by def and suffixed by : . Expanding on Stema's answer, try this for Python:

^\s*def (\S+)\s*\(\s*\S+\s*(?:,\s*\S+)*\):$

This should match only Python function definitions. The ^ and $ match only at the beginning and end of the line, respectively (in Python, pass re.MULTILINE so they apply to every line rather than only to the start and end of the whole string), so this will only find function defs on their own line, as they usually are in Python.
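Here is a minimal sketch of using that pattern on a multi-line string. The sample source and the name DEF_PATTERN are invented for the example. It also shows two things worth knowing: without re.MULTILINE nothing is found in a multi-line string, and a def with no parameters is skipped because the pattern requires at least one.

```python
import re

# Raw string, so the regex backslashes are written only once.
DEF_PATTERN = r'^\s*def (\S+)\s*\(\s*\S+\s*(?:,\s*\S+)*\):$'

# Hypothetical source text used only for this demonstration.
source = '''def add(a, b):
    return a + b

class Greeter:
    def greet(self, name):
        return "hi " + name

def no_args():
    return add(1, 2)
'''

# With MULTILINE, ^ and $ anchor at every line boundary.
names = re.findall(DEF_PATTERN, source, flags=re.MULTILINE)
# Without it, they anchor only at the start and end of the whole string.
names_without_flag = re.findall(DEF_PATTERN, source)

print(names)               # ['add', 'greet']  (no_args has no parameters)
print(names_without_flag)  # []
```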

It's not exactly clear what you are looking for, but consider a few things.

  • \w+ will match any word, which can contain letters, numbers, underscores, and most other Unicode word-like characters

  • Using a raw string when dealing with Python regexes is preferred, as you don't have to escape backslashes. This means that you should prefix every regex pattern with an r, like r'this' . Otherwise, to match a literal backslash, you need to write \\\\ in the string literal

  • When in doubt, check the library docs, or another source on regexes.
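The first two points can be checked with a short sketch. The pattern r'(\w+)\s*\(' is my own illustration of using \w+ for the name, not something from the question, and the sample strings are made up.

```python
import re

# Raw and non-raw spellings of the regex that matches one literal backslash.
raw = r'\\'
plain = '\\\\'
same_pattern = (raw == plain)
backslashes = re.findall(raw, r'C:\temp\file')

# \w+ restricts the name to word characters, so it stops at the bracket
# and does not swallow operators or punctuation the way \S+ can.
NAME_BEFORE_PAREN = re.compile(r'(\w+)\s*\(')
ascii_names = NAME_BEFORE_PAREN.findall('foo(a, b); bar (c)')
unicode_names = NAME_BEFORE_PAREN.findall('größe(x)')

print(same_pattern)      # True
print(len(backslashes))  # 2
print(ascii_names)       # ['foo', 'bar']
print(unicode_names)     # ['größe']
```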
