
Parsing parameters from command line with RegEx and PHP

My command-line interface receives the following string as parameters to the executable:

-Parameter1=1234 -Parameter2=38518 -param3 "Test \"escaped\"" -param4 10 -param5 0 -param6 "TT" -param7 "Seven" -param8 "secret" "-SuperParam9=4857?--SuperParam10=123"

What I want is to get all of the parameters into a key-value (associative) array in PHP, like this:

$result = [
    'Parameter1' => '1234',
    'Parameter2' => '38518',
    'param3' => 'Test \"escaped\"',
    'param4' => '10',
    'param5' => '0',
    'param6' => 'TT',
    'param7' => 'Seven',
    'param8' => 'secret',
    'SuperParam9' => '4857',
    'SuperParam10' => '123',
];

The difficulties are the following:

  • a parameter's prefix can be - or --
  • a parameter's glue (the value assignment operator) can be either an = sign or a whitespace ' '
  • some parameters may sit inside a quoted block, where the separators, glues and prefixes can all differ, e.g. a ? mark as the separator.

Since I'm still learning RegEx, all I have so far is this:

/(-[a-zA-Z]+)/gui

With this I can match every parameter name that starts with a -, but nothing more.

I could explode the whole string and parse it manually, but there are far too many edge cases to think about.

You can try this pattern, which uses the branch reset feature (?|...|...) to deal with the different possible formats of the values:

$str = '-Parameter1=1234 -Parameter2=38518 -param3 "Test \"escaped\"" -param4 10 -param5 0 -param6 "TT" -param7 "Seven" -param8 "secret" "-SuperParam9=4857?--SuperParam10=123"';

$pattern = '~ --?(?<key> [^= ]+ ) [ =]
(?|
    " (?<value> [^\\\\"]*+ (?s:\\\\.[^\\\\"]*)*+ ) "
  |
    ([^ ?"]*)
)~x';

preg_match_all ($pattern, $str, $matches);
$result = array_combine($matches['key'], $matches['value']);
print_r($result);


In a branch reset group, the capture groups have the same number or the same name in each branch of the alternation.

This means that (?<value> [^\\\\"]*+ (?s:\\\\.[^\\\\"]*)*+ ) is (obviously) the value named capture, but that ([^ ?"]*) is also the value named capture, because both groups share the same number.
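The numbering behaviour described above can be seen in isolation with a few toy patterns. This is only a minimal sketch: the patterns, the input strings and the variable names ($plain, $reset, $named) are made up for illustration and are not part of the answer's pattern.

```php
<?php
// Plain non-capturing group: each alternative keeps its own group number.
preg_match('~(?:(a)|(b))~', 'b', $plain);
// $plain[1] is the unmatched first alternative (empty), $plain[2] holds 'b'.

// Branch reset group: both alternatives share group number 1.
preg_match('~(?|(a)|(b))~', 'b', $reset);
// $reset[1] holds 'b', and there is no group 2.

// The same name may be reused on same-numbered groups inside (?|...).
preg_match('~(?|x(?<v>\d+)|y(?<v>\d+))~', 'y42', $named);
// $named['v'] holds '42' whichever branch matched.

var_dump($plain, $reset, $named);
```

Because every branch writes to the same slot, the calling code can read one key and does not need to check which alternative actually matched.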

You could use

--?
(?P<key>\w+)
(?|
    =(?P<value>[^-\s?"]+)
    |
    \h+"(?P<value>.*?)(?<!\\)"
    |
    \h+(?P<value>\H+)
)

See a demo on regex101.com.


Which in PHP would be:

<?php

$data = <<<DATA
-Parameter1=1234 -Parameter2=38518 -param3 "Test \"escaped\"" -param4 10 -param5 0 -param6 "TT" -param7 "Seven" -param8 "secret" "-SuperParam9=4857?--SuperParam10=123"
DATA;

$regex = '~
    --?
    (?P<key>\w+)
    (?|
        =(?P<value>[^-\s?"]+)
        |
        \h+"(?P<value>.*?)(?<!\\\\)"
        |
        \h+(?P<value>\H+)
    )~x';

if (preg_match_all($regex, $data, $matches)) {
    // Pair each captured key with the value captured in the same match.
    $result = array_combine($matches['key'], $matches['value']);
    print_r($result);
}
?>


This yields

Array
(
    [Parameter1] => 1234
    [Parameter2] => 38518
    [param3] => Test \"escaped\"
    [param4] => 10
    [param5] => 0
    [param6] => TT
    [param7] => Seven
    [param8] => secret
    [SuperParam9] => 4857
    [SuperParam10] => 123
)
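If the parsing is needed in more than one place, the same pattern could be wrapped in a small function. The following is only a sketch under assumptions: the function name parseCliParams is hypothetical, and the choice to let a later duplicate key overwrite an earlier one is mine, not something the question specifies.

```php
<?php
/**
 * Hypothetical helper: parse a parameter string into key => value pairs
 * using the branch reset pattern from this answer.
 */
function parseCliParams(string $input): array
{
    $regex = '~
        --?
        (?P<key>\w+)
        (?|
            =(?P<value>[^-\s?"]+)
            |
            \h+"(?P<value>.*?)(?<!\\\\)"
            |
            \h+(?P<value>\H+)
        )~x';

    // PREG_SET_ORDER yields one array per match, holding both key and value.
    if (!preg_match_all($regex, $input, $matches, PREG_SET_ORDER)) {
        return [];
    }

    $result = [];
    foreach ($matches as $match) {
        // Assumption: a repeated key simply overwrites the earlier value.
        $result[$match['key']] = $match['value'];
    }
    return $result;
}

print_r(parseCliParams('-a=1 --b "two words" -c 3'));
```

One limitation to keep in mind: the first branch excludes - from the value, so an input such as -name=foo-bar would yield only foo for name.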
