Python: Regex expression to detect function name

I have the following simple example, in which I try to capture the text that sits between a space " " and a pair of parentheses with some text inside, such as "(blablalba)".

But I'm only interested in the word immediately before the parentheses, so the following should all return "Mango":

- " Orange Apple Mango(BlaBlaBla)"
- " Apple Mango(Blablabla)"
- " Mango(BlaBlaBla)"

import re


txt = "extern void Init(blabla);"
# raw string, so the backslashes reach the regex engine unchanged
x = re.findall(r'\s(.*?)\(.*?\);', txt)

# expected output: ['Init']
# actual output:   ['void Init']
print(x)

Thanks in advance.

I would use \S (i.e. anything but whitespace) in place of . (i.e. anything but a newline) inside the capturing group, that is:

import re
txt = "extern void Init(blabla);"
x = re.findall(r'\s(\S*?)\(.*?\);', txt)
print(x)  # ['Init']

If you know which language specification your text adheres to, you can be more precise and allow only the characters that are legal in function names. For example, if only uppercase ASCII letters are allowed, you might use [A-Z]*? in place of \S*?, and so on.
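As a rough sketch of that idea, the snippet below assumes C-style identifier rules (a letter or underscore first, then letters, digits, or underscores). That rule set, the IDENT_CALL name, and the find_names helper are illustrative assumptions, not something stated in the question.

```python
import re

# Assumption: names follow C-style identifier rules
# (a letter or underscore, then letters, digits, or underscores).
IDENT_CALL = re.compile(r'\s([A-Za-z_][A-Za-z0-9_]*)\(.*?\);')


def find_names(text):
    """Return every identifier that directly precedes '(...);' in text."""
    return IDENT_CALL.findall(text)


print(find_names("extern void Init(blabla);"))  # ['Init']
print(find_names("int 9lives(x);"))             # [] because a name cannot start with a digit
```

With \S*? the second call would return ['9lives'], so the stricter class only pays off when the input may contain such non-identifiers.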

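Your pattern matches more than you need because . matches any char but line break chars, spaces included. The match starts at the first whitespace, and the lazy .*? then has to grow across "void Init" before it reaches the (.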

You may use \w instead, which matches only word chars (letters, digits and underscore) and is the usual choice for identifiers:

r'(?<!\S)(\w+)\(.*?\);'

Note the (?<!\S) part: it is a left-hand whitespace boundary, so the pattern also matches when the name is at the start of the string.
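A small comparison may make the difference concrete. The variable names here are only for illustration, and the input is a shortened version of the string from the question with the name moved to the very start.

```python
import re

txt = "Init(blabla);"  # the name sits at the very start of the string

# \s needs an actual whitespace character before the name
with_space = re.findall(r'\s(\w+)\(.*?\);', txt)

# (?<!\S) only requires that no non-whitespace char precedes the name
with_lookbehind = re.findall(r'(?<!\S)(\w+)\(.*?\);', txt)

print(with_space)       # []
print(with_lookbehind)  # ['Init']
```

The lookbehind consumes no characters, so it succeeds both after whitespace and at position 0.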

Pattern details

  • (?<!\S) - start of string or whitespace must be present immediately to the left of the current location
  • (\w+) - Group 1: 1+ word chars
  • \( - a ( char
  • .*? - 0+ chars other than line break chars, as few as possible
  • \); - a ); string.
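The breakdown above can also be written directly into the pattern with re.VERBOSE, which ignores whitespace inside the pattern and allows # comments. This is just one possible way to lay it out; the PATTERN name and the second test string are made up for the example.

```python
import re

PATTERN = re.compile(r"""
    (?<!\S)   # start of string or whitespace immediately to the left
    (\w+)     # group 1: one or more word chars (the function name)
    \(        # a literal opening parenthesis
    .*?       # any chars other than line breaks, as few as possible
    \);       # a literal closing parenthesis followed by a semicolon
""", re.VERBOSE)

print(PATTERN.findall("extern void Init(blabla);"))  # ['Init']
print(PATTERN.findall("void A(x); int B(y);"))       # ['A', 'B']
```

Note that the pattern requires the trailing ");", so strings without a semicolon, like the "Mango(BlaBlaBla)" examples in the question, would need the final ; dropped from the pattern.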
