
Regular Expressions to match PHP variable and string concatenation

I am trying to build a regular expression that matches different types of echo statements. The word echo has already been matched.

Example patterns to be matched:

"hiii";
"how"."are"."you";
$var."abc";
"abc".$var;
'how'."how".$var;

Pattern for a variable name:

/^[a-zA-Z_][a-zA-Z0-9_]*/

I already have a pattern that matches the first two examples:

/((^"[^"]*"\.{0,1})*;)/

Regular expressions aren't a solution for everything. In this case it is clear that you want to parse PHP code. Just as you shouldn't parse HTML with regex, you shouldn't parse PHP with regex.

Instead, use PHP's tokenizer, which can be used to parse PHP expressions.
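As a rough illustration of the tokenizer approach, here is a minimal sketch built on `token_get_all()`. The helper name `isSimpleConcat` is hypothetical, and the sketch assumes the text after `echo` is passed in on its own. It accepts only quoted string literals and variables joined by `.` and ending in a semicolon; double-quoted strings with interpolated variables are tokenized differently and are rejected here.

```php
<?php
// Hypothetical helper: true when $code (the text after `echo`) is made up of
// string literals and variables joined by `.` and terminated by `;`.
function isSimpleConcat(string $code): bool
{
    // token_get_all() only tokenizes PHP after an opening tag, so prepend one.
    $tokens = token_get_all('<?php ' . $code);

    $expectOperand = true;   // true: the next token must be a string or variable
    $sawSemicolon  = false;

    foreach ($tokens as $token) {
        // Tokens are either [id, text, line] arrays or single-character strings.
        $id = is_array($token) ? $token[0] : $token;

        if ($id === T_OPEN_TAG || $id === T_WHITESPACE) {
            continue;                 // skip the opening tag and any spacing
        }
        if ($sawSemicolon) {
            return false;             // nothing may follow the semicolon
        }
        if ($expectOperand) {
            if ($id !== T_CONSTANT_ENCAPSED_STRING && $id !== T_VARIABLE) {
                return false;
            }
            $expectOperand = false;
        } elseif ($id === '.') {
            $expectOperand = true;    // a dot must be followed by an operand
        } elseif ($id === ';') {
            $sawSemicolon = true;
        } else {
            return false;
        }
    }

    return $sawSemicolon;
}

var_dump(isSimpleConcat('$var."abc";'));   // bool(true)
var_dump(isSimpleConcat('"abc" $var;'));   // bool(false)
```

Because the tokenizer already knows PHP's quoting and escaping rules, the check only has to look at the order of the tokens.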

In addition to the two suggestions already given: if you're looking for a PCRE-based regex in PHP to validate a subset of PHP, this can be done in a more structured way by specifying named subpatterns for the tokens you're looking for. Here is an example pattern that matches these statements and also allows whitespace around the tokens (as PHP would), for any US-ASCII-based extended single-byte charset (I think this is how PHP actually treats identifiers, even if your files are UTF-8):

~
(?(DEFINE)
    (?<stringDoubleQuote> "(?:\\"|[^"])+")
    (?<stringSingleQuote> '(?:\\'|[^'])+')
    (?<string> (?:(?&stringDoubleQuote)|(?&stringSingleQuote)))
    (?<variable> \\\$([a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*))
    (?<varorstring> (?:(?&variable)|(?&string)))
)
^ \s* (?&varorstring) (?: \s* \. \s* (?&varorstring) )* \s* ; $
~x

Thanks to the named subpatterns, it's easy to use one token for any string or variable and then add the whitespace handling and the string concatenation operator. With the pattern assigned to $pattern, an example of use is:

$lines = <<<'LINES'
"hiii";
"how"."are"."you";
$var."abc";
"abc".$var;
'how'."how".$var;
LINES;

foreach (explode("\n", $lines) as $subject) {
    $result = preg_match($pattern, $subject);
    if (FALSE === $result) {
        throw new LogicException('PCRE pattern did not compile.');
    }
    printf("%s %s match.\n", var_export($subject, true), $result ? 'did' : 'did not');
}

Output:

'"hiii";' did match.
'"how"."are"."you";' did match.
'$var."abc";' did match.
'"abc".$var;' did match.
'\'how\'."how".$var;' did match.

Demo: https://eval.in/142721
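The listing does not show how `$pattern` is assigned, and the escaping depends on the kind of PHP string used. The following is a minimal sketch under one assumption: the pattern is stored in a nowdoc, where PHP does no escape processing, so the dollar sign is written `\$` rather than `\\\$` (the triple backslash is what a heredoc or double-quoted string would need to yield the same `\$`). The `REGEX` label and the two test subjects are illustrative choices, not part of the original answer.

```php
<?php
// Assumption: a nowdoc (<<<'REGEX') is used, so the text below reaches PCRE
// exactly as written, with no PHP escape processing.
$pattern = <<<'REGEX'
~
(?(DEFINE)
    (?<stringDoubleQuote> "(?:\\"|[^"])+")
    (?<stringSingleQuote> '(?:\\'|[^'])+')
    (?<string> (?:(?&stringDoubleQuote)|(?&stringSingleQuote)))
    (?<variable> \$([a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*))
    (?<varorstring> (?:(?&variable)|(?&string)))
)
^ \s* (?&varorstring) (?: \s* \. \s* (?&varorstring) )* \s* ; $
~x
REGEX;

// One subject that should match and one that should not (missing dot).
var_dump(preg_match($pattern, '"abc".$var;'));   // int(1)
var_dump(preg_match($pattern, '"abc" $var;'));   // int(0)
```

With the `x` modifier, the whitespace and line breaks inside the pattern are ignored, which is what makes the one-subpattern-per-line layout possible.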


You can do that with the following regex without needing to use recursion:

^"[^"]+"(\."[^"]+")*;$

Demo: http://regex101.com/r/oW5zH4
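A quick way to see what this simpler pattern covers is to run it against the samples from the question. This is only a sketch: it assumes the concatenation dot is matched with an escaped `\.`, and the `~` delimiter and the `$simple` variable name are arbitrary choices. Since the pattern only describes double-quoted strings, the samples containing a variable or a single-quoted string are not expected to match.

```php
<?php
// Single-quoted PHP string: `\.` is passed to PCRE unchanged.
$simple = '~^"[^"]+"(\."[^"]+")*;$~';

$samples = [
    '"hiii";',
    '"how"."are"."you";',
    '$var."abc";',
    '\'how\'."how".$var;',
];

foreach ($samples as $subject) {
    // preg_match() returns 1 on a match and 0 otherwise.
    printf("%s => %d\n", $subject, preg_match($simple, $subject));
}
// "hiii"; => 1
// "how"."are"."you"; => 1
// $var."abc"; => 0
// 'how'."how".$var; => 0
```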
