
Regular Expression pattern to match if statement in C#?

I am using the following regular expression pattern to match an if statement written in C# style:

\b[if]{2}\b[ ]?\({1}(?<HeaderSection>[ \w\s\a\t\=\.\@\#\$\%\&a-zA-Z0-9\(\)\;\/\"\'\[\]\*]*)\){1}(?<CommentSection>[\s\a\w\t a-zA-Z0-9\/\.]*)[\r\n]*\{{1}(?<FunctionBody>[\r\n \a\s\wa-zA-Z0-9\(\)\"\.\;\:]*)[\r\n]*\}{1}

It's a crazy long regex pattern, but it seems to work to some extent. Let me explain it: it has three named capturing groups, namely HeaderSection, CommentSection and FunctionBody. HeaderSection captures the text between the opening and closing parentheses of the if statement, so from the statement below:

if(Value1==Function(int Z))

it captures:

Value1==Function(int Z)

Similarly, CommentSection captures the comment (if any) after the closing parenthesis, so from the statement below:

if(Value1==Function(int Z))//This is a Comment.

it captures:

//This is a Comment.

and FunctionBody captures anything between { and }, such as in the code below:

if(Value1==Function(int Z))//This is a Comment.
{
  This is the
  space for
  function body.
}

it captures "This is the space for function body." That is what the regex matches. The problem is that if I have a nested statement like this:

if(Value1==Function(int Z)//This is a Comment.
{
  if(Value2==Value1)
  {
    Some code
  }
}

and I match it using the regex above, it doesn't match the outer if statement, i.e.:

if(Value1==Function(int Z)//This is a Comment.
{
Another function();
}

and instead matches the inner one, i.e.:

  if(Value2==Value1)
  {
    Some code
  }

Please point out what I have done wrong, or if there is a less messy way, please let me know, or correct the regex pattern if it is wrong somewhere. One more thing: I'm doing all this in C# using its regular expression functions. Thanks in advance.

(?<header>if\(.*?)(?<comment>//.*?)*\s\n\{(?<functionbody>.*?)\n\}

This seems to be a solution, provided the statement is formatted in the expected way.

(?<header>if\(.*?)

will match if( followed by anything up to, but not including, the // section, so it will match

if(Value1==Function(int Z))

Then it moves on to (?<comment>//.*?)*\s, which matches anything following the // characters but also succeeds when there is no comment, because * means zero or more occurrences. The \s makes sure that it doesn't go beyond the end of the line.

Then \n\{(?<functionbody>.*?)\n\} matches a { just after a newline and continues until a } is found just after a newline.
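To try this pattern from C#, a minimal sketch could look like the following. Two things here are assumptions rather than part of the answer: RegexOptions.Singleline is passed so that . can cross line breaks inside functionbody, and the sample text uses CRLF line endings so that \s has a \r to consume before the \n. The class name IfRegexDemo is made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

class IfRegexDemo
{
    static void Main()
    {
        // Sample input joined with CRLF (assumption): the \s in the pattern
        // consumes the \r that precedes the \n before the opening brace.
        string source = string.Join("\r\n", new[]
        {
            "if(Value1==Function(int Z))//This is a Comment.",
            "{",
            "  if(Value2==Value1)",
            "  {",
            "    Some code",
            "  }",
            "}",
        });

        // The pattern from the answer, written as a verbatim string so the
        // backslashes reach the regex engine unchanged.
        string pattern = @"(?<header>if\(.*?)(?<comment>//.*?)*\s\n\{(?<functionbody>.*?)\n\}";

        // Singleline (assumption) lets '.' match '\n', which the multi-line
        // function body needs.
        foreach (Match m in Regex.Matches(source, pattern, RegexOptions.Singleline))
        {
            Console.WriteLine("header: " + m.Groups["header"].Value);

            // The comment group is optional, so check Success before using it.
            if (m.Groups["comment"].Success)
            {
                Console.WriteLine("comment: " + m.Groups["comment"].Value);
            }

            // Trim the line-break characters captured at both ends of the body.
            Console.WriteLine("functionbody: " + m.Groups["functionbody"].Value.Trim('\r', '\n'));
        }
    }
}
```

Because the inner closing brace is indented, the first } that directly follows a newline is the outer one, which is why the whole nested block should end up in functionbody. With LF-only line endings and no trailing whitespace on the header line, \s\n would probably find nothing to match in this sample.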

In

var x = 0
if(Value1==Function(int Z))//This is a Comment.
{
  if(Value2==Value1)
  {
    Some code
  }
}
var y = 0

if(y == x) 
{
    x = y + 1
}

it will match the following groups :

header: if(Value1==Function(int Z))
comment: //This is a Comment.
functionbody: 
  if(Value2==Value1)
  {
    Some code
  }

header: if(y == x) 
functionbody: 
        x = y + 1
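
The pattern above depends on the closing } of the outer block being the first one that directly follows a newline. If indentation cannot be relied on, .NET balancing groups are one alternative for pairing nested braces. The sketch below is not from the answer above: the pattern, the group names open and body, and the class name BalancedIfDemo are illustrative assumptions. It does not account for braces inside string literals or comments, or for if statements written without braces.

```csharp
using System;
using System.Text.RegularExpressions;

class BalancedIfDemo
{
    static void Main()
    {
        // Nested braces that all start at column 0, so the
        // "} right after a newline" heuristic would stop too early.
        string source = string.Join("\n", new[]
        {
            "if(a==b)//outer",
            "{",
            "if(c==d)",
            "{",
            "x();",
            "}",
            "}",
        });

        string pattern =
            @"\bif\s*\((?<header>[^\r\n]*?)\)\s*" + // condition, kept on one line
            @"(?<comment>//[^\r\n]*)?\s*" +         // optional trailing // comment
            @"\{(?<body>(?>" +                      // opening brace of the block
            @"[^{}]+" +                             //   any run of non-brace text
            @"|(?<open>\{)" +                       //   nested '{' pushes onto 'open'
            @"|(?<-open>\})" +                      //   nested '}' pops from 'open'
            @")*)" +
            @"(?(open)(?!))" +                      // fail if a '{' is still unclosed
            @"\}";                                  // closing brace of the block

        foreach (Match m in Regex.Matches(source, pattern))
        {
            Console.WriteLine("header: " + m.Groups["header"].Value);
            if (m.Groups["comment"].Success)
            {
                Console.WriteLine("comment: " + m.Groups["comment"].Value);
            }
            Console.WriteLine("body: " + m.Groups["body"].Value.Trim('\r', '\n'));
        }
    }
}
```

Since Regex.Matches returns non-overlapping matches, the inner if is reported only as part of the outer body. To inspect it separately, the same pattern could be applied again to the captured body text.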

