
Improve preg / PCRE / regex to find PHP variable names

String to parse:

$str = '
public   $xxxx123;
private  $_priv   ;
         $xxx     = "test";
private  $arr_123 = array();
'; //    |       |
   //     ^^^^^^^---- get the variable name

What I have so far:

    // preg_match_all() returns the number of matches, so don't assign it back to $str
    $count = preg_match_all('/\$\S+(;|[[:space:]])/', $str, $matches);
    foreach ($matches[0] as $match) {
        $match = str_replace('$', '', $match);
        $match = str_replace(';', '', $match);
    }

It works, but I want to know if I can improve the regex, e.g. get rid of the two str_replace calls and maybe include \t in (;|[[:space:]]).

Using a positive lookbehind, you can match only what you need. To make sure only valid variable names are matched, I've used this:

preg_match_all('/(?<=\$)[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*/',$str,$matches);
var_dump($matches);

which correctly shows:

array (
  0 => 
  array (
    0 => 'xxxx123',
    1 => '_priv',
    2 => 'xxx',
    3 => 'arr_123'
  )
)

That is all you need: no memory is wasted on an array containing all variables with their leading and/or trailing characters.

The expression:

  • (?<=\$) is a positive lookbehind: the name must be preceded by a $, but the $ is not part of the match
  • [a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]* is the regex that the PHP documentation itself gives for valid variable names
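
The two parts above can be combined into a small helper. This is only a minimal sketch, not code from the original answer: the function name find_variable_names and the sample input are made up for illustration. It shows that names PHP would not accept (such as $1abc) are skipped, and that tabs around a name make no difference, because the pattern never looks at what follows the name.

```php
<?php
// Hypothetical helper (not part of the original answer): returns the names
// of all PHP-style variables found in $source, without the leading "$".
function find_variable_names(string $source): array
{
    // (?<=\$)                   lookbehind: the previous character must be "$"
    // [a-zA-Z_\x7f-\xff]        first character of a valid PHP variable name
    // [a-zA-Z0-9_\x7f-\xff]*    any number of further name characters
    $pattern = '/(?<=\$)[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*/';
    preg_match_all($pattern, $source, $matches);
    return $matches[0];
}

// Single quotes keep PHP from interpolating the sample variables;
// the tabs are added with double-quoted "\t".
$sample = 'public $ok;' . "\t" . '$_tabbed' . "\t" . '= 1; $1abc; $arr_123=array();';
$names = find_variable_names($sample);
print_r($names); // ok, _tabbed, arr_123 ($1abc is not a valid name)
```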

Simply use a capturing group:

preg_match_all('/\$(\S+?)[;\s=]/', $str, $matches);
foreach ($matches[1] as $match) {

     // $match is now only the name of the variable without $ and ;
}
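
Two properties of this pattern are worth checking before relying on it; the sketch below uses made-up sample strings, not the question's data. First, \s already matches tabs, so no separate \t is needed. Second, \S+? accepts any non-whitespace character, so it can capture more than a variable name.

```php
<?php
$pattern = '/\$(\S+?)[;\s=]/';

// \s matches tabs too, so a tab after the name ends the match.
preg_match_all($pattern, "\$tabbed\t= 1;", $matches);
$tabbed = $matches[1];

// Caveat: \S+? stops only at ";", "=" or whitespace, so other characters
// that cannot be part of a variable name are captured as well.
preg_match_all($pattern, '$obj->prop;', $matches);
$loose = $matches[1];

print_r($tabbed); // tabbed
print_r($loose);  // obj->prop
```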

I changed the regex a little bit, take a look:

$str = '
public   $xxxx123;
private  $_priv   ;
         $xxx     = "test";
private  $arr_123 = array();
';

$matches = array();

//$count = preg_match_all('/\$(\S+)[; ]/', $str, $matches);
// preg_match_all() returns the match count, so store it separately instead of overwriting $str
$count = preg_match_all('/\$(\S+?)(?:[=;]|\s+)/', $str, $matches); // credit to @booobs for this regex

print_r($matches);

The output:

Array
(
    [0] => Array
        (
            [0] => $xxxx123;
            [1] => $_priv 
            [2] => $xxx 
            [3] => $arr_123 
        )

    [1] => Array
        (
            [0] => xxxx123
            [1] => _priv
            [2] => xxx
            [3] => arr_123
        )

)

Now you can use $matches[1] in the foreach loop.

::Update::

After switching to the regex "/\$([a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*)/", the output looks correct.

String:

$str = '
public   $xxxx123; $input1;$input3
private  $_priv   ;
         $xxx     = "test";
private  $arr_123 = array();

';

And the output:

Array
(
    [0] => Array
        (
            [0] => $xxxx123
            [1] => $input1
            [2] => $input3
            [3] => $_priv
            [4] => $xxx
            [5] => $arr_123
        )

    [1] => Array
        (
            [0] => xxxx123
            [1] => input1
            [2] => input3
            [3] => _priv
            [4] => xxx
            [5] => arr_123
        )

)
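
For reference, a call that produces this result might look like the following sketch. The post shows only the updated regex, so the surrounding code (including the $count and $name variables) is an assumption.

```php
<?php
$str = '
public   $xxxx123; $input1;$input3
private  $_priv   ;
         $xxx     = "test";
private  $arr_123 = array();
';

// Strict variable-name pattern, wrapped in a capturing group.
$count = preg_match_all('/\$([a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*)/', $str, $matches);

// $matches[0] holds the names with "$", $matches[1] the bare names.
foreach ($matches[1] as $name) {
    echo $name, PHP_EOL;
}
```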
