Need Regular Expression to match name/value pairs in different formats

I'm pulling in ASP/VBScript configuration files via PHP cURL to do some file processing and want to extract some values from them.

The strings look like this:

config1 = ""  
config2 = "VALUE:1:0:9" 'strange value comment 
otherconfig = False 
yetanotherconfig = False 'some comment

Basically, it's name/value pairs separated by equals signs, with the value optionally enclosed in quotation marks and optionally followed by a comment.

I want to return the actual VALUE (False, VALUE:1:0:9, etc.) in ONE matching group regardless of the format the string is in.

Here's the pattern I'm passing to preg_match so far:

$pattern = '/\s*'.$configname.'\s*\=\s*(\".*?\"|.*?\r)/'

$configname is the name of the specific configuration I'm looking for, so I pass it in as a variable.

I'm still getting parentheses included back with the value (instead of the value itself), and I'm getting comments returned with the value as well.

Any help is appreciated!

Returning the matching value in ONE matching group is difficult because of the double-quote alternative. Back references can help:

$pattern = '/^\s*'.$configname.'\s*=\s*("?)(?<value>.*?)\1\s*(?:\'|$)/m';

should do the trick. Then read the value from $result['value'].

In plain English, the pattern:

  • skips optional whitespace, the identifier, optional whitespace, = and more optional whitespace (the easy part)
  • optionally matches a " and captures it as \1 (the first capture group)
  • lazily matches any characters and captures them as value
  • matches \1 again (a " if one was matched before, nothing otherwise)
  • optionally matches some whitespace
  • requires either the ' that starts a comment or the end of the line
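
The steps above can be sketched as a small helper. This is only a minimal sketch under a few assumptions that are not in the answer itself: the function name get_config_value is hypothetical, the name is passed through preg_quote, the pattern is anchored with ^ and the m modifier so that otherconfig cannot match inside yetanotherconfig, and the "comment or end of line" test is written as an alternation rather than a character class (inside [...] a $ is a literal dollar sign).

```php
<?php
// Hypothetical helper: return the value of one named setting, or null if it is absent.
function get_config_value(string $text, string $configname): ?string
{
    $pattern = '/^\s*' . preg_quote($configname, '/')
             . '\s*=\s*("?)(?<value>.*?)\1\s*(?:\'|$)/m';

    return preg_match($pattern, $text, $result) === 1 ? $result['value'] : null;
}

// Sample input taken from the question.
$text = <<<EOF
config1 = ""
config2 = "VALUE:1:0:9" 'strange value comment
otherconfig = False
yetanotherconfig = False 'some comment
EOF;

var_dump(get_config_value($text, 'config2')); // string(11) "VALUE:1:0:9"
```

Returning null for a missing name is a design choice of this sketch; it keeps an absent setting distinguishable from one whose value is the empty string, such as config1.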

Without back references:

$pattern = '/^\s*'.$configname.'\s*=\s*(?:"(.*?)"|(.*?)\s*(?:\'|$))/m';

This is more efficient, but the value ends up in either $result[1] (quoted) or $result[2] (unquoted).
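
Picking the right group can be sketched as follows. The helper name get_config_value_two_groups is hypothetical, and the sketch relies on preg_match dropping trailing groups that did not take part in the match, so $result[2] exists only when the unquoted branch matched.

```php
<?php
// Hypothetical helper built on the two-group pattern; returns null when the name is absent.
function get_config_value_two_groups(string $text, string $configname): ?string
{
    $pattern = '/^\s*' . preg_quote($configname, '/')
             . '\s*=\s*(?:"(.*?)"|(.*?)\s*(?:\'|$))/m';

    if (preg_match($pattern, $text, $result) !== 1) {
        return null;
    }
    // Group 2 is present only for unquoted values; otherwise fall back to group 1.
    return $result[2] ?? $result[1];
}

// Sample input taken from the question.
$text = <<<EOF
config1 = ""
config2 = "VALUE:1:0:9" 'strange value comment
otherconfig = False
yetanotherconfig = False 'some comment
EOF;

var_dump(get_config_value_two_groups($text, 'otherconfig')); // string(5) "False"
```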

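Understanding your mistakes:

  • A backslash is needed only to escape the string delimiter itself (here the single quote) or to stop a character that is special in a regular expression (such as . , ^ , $ ...) from being interpreted; = and " need no escaping
  • End of line is matched with $ (together with the m modifier for multi-line input), not with \r or \n
  • Nothing in your pattern excludes the trailing comment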
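
The end-of-line point can be illustrated with a short sketch. The sample text with Windows-style \r\n line endings is an assumption made for illustration, since the question does not say which line endings the files use.

```php
<?php
// Assumed sample input with CRLF line endings.
$text = "otherconfig = False\r\nnext = 1\r\n";

// With the m modifier, $ matches just before each "\n". On its own the lazy
// group would keep the "\r", so \s* is placed before $ to absorb it.
preg_match('/^otherconfig\s*=\s*(.*?)\s*$/m', $text, $m);

var_dump($m[1]); // string(5) "False"
```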

\r matches a CR character (carriage return). Your pattern essentially says: match either "???????" or ????????[carriage return].

Of course you get the apostrophe and the comment after it: your capture group matched them. You'll have to strip these things off.

$pattern = '/\s*'.$configname.'\s*=\s*(")?(.*?)(?(1)"|\s*(?:\'|$))/m';

This one will work:

$pattern = '/
    \s*
    # name
    (?P<name>.*?)
    # =
    \s*=\s*
    # value
    (?P<val>
        "(?P<quoted>([^"]|\\\\"|\\\\\\\\)*)"
        |(?P<raw>.*?)
    )
    # comment
    \s*(?P<comment>\'.*)?
$/xm';

This will match every key=value pair in the input string, instead of just a specific one.

The regex handles quotes and escaped quotes ( \" ) in quoted values (e.g. "foo\"bar" ).

Use it with a function like this:

function parse_config($string) {
    $pattern = '/
        \s*
        # name
        (?P<name>.*?)
        # =
        \s*=\s*
        # value
        (?P<val>
            "(?P<quoted>([^"]|\\\\"|\\\\\\\\)*)"
            |(?P<raw>.*?)
        )
        # comment
        \s*(?P<comment>\'.*)?
    $/xm';

    preg_match_all($pattern, $string, $matches, PREG_SET_ORDER);

    $config = array();
    foreach($matches as $match) {
        $name = $match['name'];
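        // Compare against '' rather than using empty(), which would also discard a quoted "0".
        if (isset($match['quoted']) && $match['quoted'] !== '') {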
            $value = str_replace(array('\\"','\\\\'), array('"','\\'), $match['quoted']);
        } else if (isset($match['raw'])) {
            $value = $match['raw'];
        } else {
            $value = '';
        }
        $config[$name] = $value;
    }

    return $config;
}

Example:

$string = "a = b\n
c=\"d\\\"e\\\\fgh\" ' comment";

$config = parse_config($string);

// output:

array('a' => 'b', 'c' => 'd"e\fgh');

Another example:

$string = <<<EOF
config1 = ""
config2 = "VALUE:1:0:9" 'strange value comment
otherconfig = False
yetanotherconfig = False 'some comment
EOF;

print_r(parse_config($string));

// output:

Array
(
    [config1] => 
    [config2] => VALUE:1:0:9
    [otherconfig] => False
    [yetanotherconfig] => False
)
