I have a string:
my $string = "name_of_my_function(arg1,arg2,[arg3,arg4])";
and I want to extract the name of the function, "name_of_my_function", and the parameters:
$arg1 = "arg1"
$arg2 = "arg2"
@arg_list = ("arg3", "arg4")
The code I use to extract the function name is:
$row =~ m/^([^\(]*)\(([^\)]*)\)/;
$function = $1;
However, it only works when the string doesn't contain a "]", for example:
my $string = "name_of_my_function(arg1,arg2,arg3)";
but it doesn't return anything when there is a "]".
Any ideas?
Thanks,
SLP
The regex you show captures the function name and all the arguments together as one string, which is a very reasonable first step. Then parse the arguments out of that second string. I expand your $string so that it has multiple bracketed lists of arguments, interleaved with non-bracketed ones:
perl -wE'
$s = "name_of_my_function(arg1,arg2,[arg3,arg4],arg5,[arg6,arg7])";
@m = $s =~ /^([^\(]*)\(([^\)]*)\)/;
@p = grep { $_ } split /\s*,\s*|\[(.*?)\]/, $m[1];
for (@p) {
if (/,/) { push @arg_list, $_ }
else { push @args, $_ }
}
say $m[0];
say for @args;
say for @arg_list
'
This prints
name_of_my_function
arg1
arg2
arg5
arg3,arg4
arg6,arg7
The split is where individual arguments are extracted, as well as the bracketed argument list(s), each as a string. That may return empty elements, thus the grep { $_ } to filter them out.
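One caveat with that filter: grep { $_ } drops every false value, so a literal argument of 0 would be lost along with the empty elements. A minimal sketch of a stricter filter follows; the input string and the names @loose and @strict are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical input with a literal 0 among the arguments
my $raw = 'arg1,0,[arg3,arg4]';

# Truthiness filter: removes undef and '' but also the string '0'
my @loose = grep { $_ } split /\s*,\s*|\[(.*?)\]/, $raw;

# Stricter filter: removes only undefined and empty elements
my @strict = grep { defined($_) && length($_) } split /\s*,\s*|\[(.*?)\]/, $raw;

print "loose:  @loose\n";    # loose:  arg1 arg3,arg4
print "strict: @strict\n";   # strict: arg1 0 arg3,arg4
```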
Then you can proceed to extract individual arguments from the lists that were in brackets, by splitting each string in @arg_list by , again.
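That second split might look like the following minimal sketch. Here @arg_list is hard-coded to the two strings from the run above, and the name @lists is my own choice.

```perl
use strict;
use warnings;

# Bracketed lists as collected above (hard-coded here for illustration)
my @arg_list = ('arg3,arg4', 'arg6,arg7');

# Split each string on commas, allowing optional surrounding whitespace;
# every bracketed list becomes its own array reference
my @lists = map { [ split /\s*,\s*/, $_ ] } @arg_list;

# Print one list per line, elements separated by a space
print join(' ', @$_), "\n" for @lists;
```

Keeping each list in its own array reference preserves which arguments were bracketed together.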
The main part of the above can, as the problem stands, go in one statement
@p = grep { $_ } split /\( | \) | \[(.*?)\] |,/x, $s;
where I added the /x modifier so as to be able to space it out for readability. This delivers to @p the function name, the individual arguments, and a string with the (comma-separated) argument list from each [].
However, I think that it is far more sensible to break this up into several steps.
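A step-by-step version could be sketched as below. The helper name parse_call and its return layout are assumptions made for illustration, and the second step uses a /g match loop instead of the split shown earlier, so that a one-element list such as [arg3] is still recognized as a list.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the function name, a reference to the plain
# arguments, and a reference to the bracketed lists (each an array reference)
sub parse_call {
    my ($str) = @_;

    # Step 1: function name and the raw text between the parentheses
    my ($name, $raw) = $str =~ /^([^(]*)\(([^)]*)\)/
        or return;

    # Step 2: walk the raw text, taking a [bracketed list] or a plain argument
    my (@args, @lists);
    while ($raw =~ /\[([^\]]*)\]|([^,\[\]\s]+)/g) {
        my ($list, $arg) = ($1, $2);
        if (defined $list) {
            # Step 3: split the bracketed text into its own elements
            push @lists, [ split /\s*,\s*/, $list ];
        }
        else {
            push @args, $arg;
        }
    }
    return ($name, \@args, \@lists);
}

my ($name, $args, $lists) =
    parse_call('name_of_my_function(arg1,arg2,[arg3,arg4])');
print "$name\n";
print "arg: $_\n"   for @$args;
print "list: @$_\n" for @$lists;
```

Like the regex in the question, this assumes no nested parentheses and no nested brackets.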
Well, if the number of arguments is variable, it is not that simple to do this with a regex only: the arguments would be matched by a group under a + quantifier, and such a group keeps only its last repetition, so the individual arguments cannot simply be read out of capturing groups. With that in mind, you could use this pattern: (\w+)\(((\w+|\[(\w+,?)+\]),?)+\)
Explanation:
(\w+) - match one or more word characters (the name of the function) and store them in the first capturing group,
(\w+|\[(\w+,?)+\]) - alternation: match \w+ (same as above) or \[(\w+,?)+\], where \[ matches [ literally, (\w+,?)+ matches one or more times the pattern \w+,? (one or more word characters followed by zero or one comma), and \] matches ] literally,
((\w+|\[(\w+,?)+\]),?)+ - match the whole pattern above, optionally followed by a comma (,?), one or more times. This matches the argument list,
\( and \) - match ( and ) literally.
For further processing, extract what is between the parentheses () and pull the arguments out of that list programmatically; that is easier than doing it all with one complex regular expression.
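To see why the capturing groups alone are not enough, here is a minimal Perl sketch that runs the pattern, written with single backslashes as Perl expects. The ^ and $ anchors and the variable names are additions for this illustration, not part of the pattern above.

```perl
use strict;
use warnings;

my $string = 'name_of_my_function(arg1,arg2,[arg3,arg4])';

# Pattern from the explanation above; the anchors are an addition of this sketch
my $call = qr/^(\w+)\(((\w+|\[(\w+,?)+\]),?)+\)$/;

my ($function, $last_arg, $inside);
if ($string =~ $call) {
    # Group 1 is the name; group 2 sits under +, so it holds only
    # its final repetition rather than every argument
    ($function, $last_arg) = ($1, $2);

    # Take everything between the outer parentheses for further processing
    ($inside) = $string =~ /\((.*)\)/;

    print "function:        $function\n";   # name_of_my_function
    print "last repetition: $last_arg\n";   # [arg3,arg4]
    print "to process:      $inside\n";     # arg1,arg2,[arg3,arg4]
}
```

The pattern is therefore best treated as a validity check plus name extraction, with the text in $inside handed to a separate splitting step.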
UPDATE:
Try pattern: https://regex101.com/r/wBcJZ0/3
I omitted the explanation, as it is very similar to the previous pattern.