A regex to match what's inside the start and end chars of a PHP class

I'm having trouble finding a regex that matches the start and end chars of a PHP class, which are { and } respectively. The regex should also not match a { or } that sits inside a PHP comment; in other words, it should not match if the { or } is preceded by any char other than whitespace.

I suppose I should use a negative lookbehind, but I'm a little rusty on regex, and so far I haven't found the solution.

Here is my test string:

<?php


namespace Ling\Light_TaskScheduler\Service;


/**
 * The LightTaskSchedulerService class. :{
 */
class LightTaskSchedulerService
{

    /**
     *
     * This method IS the task manager.
     * See the @page(Light_TaskScheduler conception notes) for more details.
     *
     */
    public function run()
    {
        $executionMode = $this->options['executionMode'] ?? "lastOnly";
        $this->logDebug("Executing run method with execution mode \"$executionMode\".");


    }


}


// this can happen in comments: }, why
// more stuff







And my pattern, which doesn't work at the moment, is this:

    if(preg_match('!^\s*\{\s*(.*)(?<![^\s]*)\}!ms', $c, $match)){
        a($match);
    }

So, I used the multiline modifier "m", since we need to parse a multiline string (so that ^ can match at the start of each line), then I used the "s" modifier so that the dot matches line breaks, but the negative lookbehind part (?<![^\s]*) doesn't seem to work. I'm basically trying to say: don't match the "}" char if it's preceded by anything but whitespace.
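To make the failure modes concrete, here is a minimal sketch on a made-up four-line sample (the `$code` string and the variable names are illustrative, not from my real file). It also shows one more thing that may be going on and that is not raised elsewhere in this thread: with `!` as the delimiter, the `!` inside `(?<!` is read as the closing delimiter, so the pattern is cut short before the lookbehind is even parsed.

```php
<?php
// Small sample; the comment line contains a brace that follows a space.
$code = "class A\n{\n    // stray }\n}\n";

// 1) '!' as delimiter: the '!' inside '(?<!' ends the pattern early, the rest
//    is taken as modifiers, and preg_match() returns false (with a warning).
$delimiterClash = @preg_match('!^\s*\{\s*(.*)(?<![^\s]*)\}!ms', $code);

// 2) '/' as delimiter: the pattern is now read in full, but PCRE rejects the
//    unbounded '*' inside the lookbehind, so the result is false again.
$unboundedLookbehind = @preg_match('/^\s*\{\s*(.*)(?<![^\s]*)\}/ms', $code);

// 3) A single-character lookbehind compiles and can match.
$fixedLookbehind = preg_match('/^\s*\{\s*(.*)(?<!\S)\}/ms', $code, $m);

var_dump($delimiterClash, $unboundedLookbehind, $fixedLookbehind);
```

Note that variant 3 only proves the pattern compiles: `(?<!\S)` still accepts a `}` that follows a space inside a comment, so it does not solve the comment problem by itself.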

@Wiktor Stribiżew: I tried this pattern but it still doesn't work: !^\s*\{\s*(.*)(?<!\S)\}!ms

Considering Tim Biegeleisen's comment, I'll probably take a simpler approach, like removing the comments first and then applying the simpler regex !^\s*\{\s*(.*)\}!ms, which I know will work.
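A rough sketch of that "remove the comments first" idea, assuming PHP's tokenizer extension is available. `stripPhpComments()` is a hypothetical helper name and the sample class is made up; newlines inside removed comments are kept so that line numbers stay stable.

```php
<?php
/**
 * Hypothetical helper: returns $source with all PHP comments removed.
 * The source must start with an opening tag, otherwise token_get_all()
 * treats everything as inline HTML.
 */
function stripPhpComments(string $source): string
{
    $out = '';
    foreach (token_get_all($source) as $token) {
        if (!is_array($token)) {
            $out .= $token; // single-character token such as { or ;
            continue;
        }
        if ($token[0] === T_COMMENT || $token[0] === T_DOC_COMMENT) {
            // Drop the comment text but keep its line breaks.
            $out .= str_repeat("\n", substr_count($token[1], "\n"));
            continue;
        }
        $out .= $token[1];
    }
    return $out;
}

$source = <<<'EEE'
<?php
/**
 * Doc :{
 */
class Foo
{
    public function run()
    {
        return "x"; // }
    }
}
// trailing }, why
EEE;

$stripped = stripPhpComments($source);

// With the comments gone, the simple pattern is enough for this sample.
$found = preg_match('!^\s*\{\s*(.*)\}!ms', $stripped, $m);
echo $m[1];
```

Braces inside string literals (for example "{$foo}") would still confuse the regex, so this is only as robust as the "one class, valid code" assumption.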

However, if somebody knows a regex that does it, I would be interested in seeing it.

Problem solved for now, I'm out, thanks guys.

@Wiktor Stribiżew

The weird thing is that your regex works on the regex101 website, but it doesn't work in my version of PHP (PHP 7.2.31).

So I mean: this doesn't work in my PHP world:

$c = <<<'EEE'
<?php

/**
 * The LightTaskSchedulerService class. :{
 */
class LightTaskSchedulerService
{

    /**
     *
     * This method IS the task manager.
     * See the @page(Light_TaskScheduler conception notes) for more details.
     *
     */
    public function run()
    {
        $executionMode = $this->options['executionMode'] ?? "lastOnly";
        $this->logDebug("Executing run method with execution mode \"$executionMode\".");


    }


}


// this can happen in comments: }, why
// more stuff


EEE;



if(preg_match('/^\s*\{\s*(.*)(?<!\S)\}$/gms', $c, $match)){
    echo "a match was found"; // is never displayed
}
exit;

So I don't know what regex101 is using under the hood, but it doesn't work for me.

UPDATE

As Tim suggested, regex might not be the most appropriate tool for this job.

I ended up using a very simple solution to find the end character, and something similar can be applied to find the start character:

    /**
     * Returns an array containing information related to the end of the class.
     *
     * Important note, this method assumes that:
     *
     * - the parsed php file contains valid php code
     * - the parsed php file contains only one class
     *
     * If either of the above assumptions does not hold, this method won't work properly.
     *
     *
     *
     * The returned array has the following structure:
     *
     *
     * - endLine: int, the number of the line containing the class declaration's last char
     * - lastLineContent: string, the content of the last line being part of the class declaration
     *
     *
     * @return array
     */
    public function getClassLastLineInfo(): array
    {

        $lastLineNumber = null;
        $lastLineContent = null;


        $lines = file($this->file);
        $reversedLines = array_reverse($lines);
        foreach ($reversedLines as $k => $line) {
            if ('}' === trim($line)) {
                $n = count($lines);
                $lastLineNumber = $n - $k;
                $lastLineContent = $line;
                break;
            }
        }

        return [
            "endLine" => $lastLineNumber,
            "lastLineContent" => $lastLineContent,
        ];
    }

With something similar for the start char, we can obtain the line numbers of the start and end characters of the class. Armed with those, we can get all the lines of the file as an array and use a combination of array_slice and implode to "recompile" the content of the class.
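For illustration, the whole idea could look roughly like this. `findClassBraceLines()` and `extractClassBody()` are hypothetical names, and the sketch assumes, like the method above, one class per file, valid code, and an opening `{` that sits alone on its line as in my test string.

```php
<?php
/**
 * Hypothetical helper: 1-based line numbers of the first line that is only
 * "{" and the last line that is only "}", or null if either is missing.
 */
function findClassBraceLines(array $lines): ?array
{
    $start = null;
    $end = null;
    foreach ($lines as $i => $line) {
        if ('{' === trim($line)) {
            $start = $i + 1;
            break;
        }
    }
    for ($i = count($lines) - 1; $i >= 0; $i--) {
        if ('}' === trim($lines[$i])) {
            $end = $i + 1;
            break;
        }
    }
    if (null === $start || null === $end || $end <= $start) {
        return null;
    }
    return ['startLine' => $start, 'endLine' => $end];
}

/**
 * Hypothetical helper: the lines strictly between the two braces, joined.
 */
function extractClassBody(array $lines): ?string
{
    $info = findClassBraceLines($lines);
    if (null === $info) {
        return null;
    }
    // startLine is 1-based, so it is also the 0-based index of the next line.
    $length = $info['endLine'] - $info['startLine'] - 1;
    return implode('', array_slice($lines, $info['startLine'], $length));
}

// In real use this array would come from file($path), which keeps newlines.
$lines = [
    "<?php\n",
    "class Foo\n",
    "{\n",
    "    public \$x = 1;\n",
    "}\n",
    "// this can happen in comments: }, why\n",
];
echo extractClassBody($lines);
```

A multi-line comment with a line consisting only of } after the class would still fool the end detection, so stripping comments first may be worth combining with this.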

Anyway, thanks for the comments.

UPDATE

As people have already stated in the comment section: regex might not be the best solution for this. Anyway, you asked for it, and I tested it with the class below.

// 1) without class check -> this does not work when other code shares the line with the opening {
preg_match('/(?:^{(?!\r?\n?\s*\*\/)|{\s*$(?!\r?\n?\s*\*\/)).+^\s*}(?!\r?\n?\s*\*\/)/ms', $c, $match);

// 2) with class check -> this should always work
preg_match('/^[\s\w]+?(?:{(?!\r?\n?\s*\*\/)|{\s*$(?!\r?\n?\s*\*\/)).+^\s*}(?!\r?\n?\s*\*\/)/ms', $c, $match);

// 3) with class check and capturing the second part (non-class-definition) separately -> this should always work
preg_match('/^[\s\w]+?((?:{(?!\r?\n?\s*\*\/)|{\s*$(?!\r?\n?\s*\*\/)).+^\s*}(?!\r?\n?\s*\*\/))/ms', $c, $match);

I recommend using 3).

/**
 * The LightTaskSchedulerService class. :{
 */
class LightTaskSchedulerService implements TaskSchedulerService {
{
    /**
     *
     * This method IS the task manager.
     * See the @page(Light_TaskScheduler conception notes) for more details.
     *
     */
    public function run()
    {
        $executionMode = $this->options['executionMode'] ?? "lastOnly";
        $this->logDebug("Executing run method with execution mode \"$executionMode\".");
        if ($foo) {
            doBar($foo);
        }
        /* multiline */
        // simple one line comment
        // simple one line comment { }
        # another comment
        # another comment}} {
        # another comment{/*}*/
//}
#}
/*}*/
/*{*/
/*
}*/
/*
}
*/
    }
}


// this can happen in comments:}, why
// more stuff
/* multiline hello} hello{
}*/
# singleline{
#}
//}
/*}*/
/**
}*/

Output:

Array
(
    [0] => {
{
    /**
     *
     * This method IS the task manager.
     * See the @page(Light_TaskScheduler conception notes) for more details.
     *
     */
    public function run()
    {
        $executionMode = $this->options['executionMode'] ?? "lastOnly";
        $this->logDebug("Executing run method with execution mode \"$executionMode\".");
        if ($foo) {
            doBar($foo);
        }
        /* multiline */
        // simple one line comment
        // simple one line comment { }
        # another comment
        # another comment}} {
        # another comment{/*}*/
//}
#}
/*}*/
/*{*/
/*
}*/
/*
}
*/
    }
}
)

Your code does not work, because it has errors:

  1. Unknown modifier g (for preg_match) => use preg_match_all instead.
  2. $c in your code does not work, since it is not in PHP scope; write <?php $c = <<<'EEE'... instead.
  3. The lookbehind in your case did not work, since you can't use the +, * or ? quantifiers inside a lookbehind.
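A small sketch of the first point, using a made-up sample string: preg_match() has no "global" modifier, and preg_match_all() is what returns every match.

```php
<?php
$code = "class A\n{\n    function f()\n    {\n    }\n}\n";

// preg_match() does not know a 'g' modifier: the pattern fails to compile,
// a warning is raised (suppressed here with @) and false is returned.
$withG = @preg_match('/^[ \t]*\}$/gm', $code);

// preg_match_all() is the "global" variant and returns the number of matches.
$count = preg_match_all('/^[ \t]*\}$/m', $code, $all);

var_dump($withG); // bool(false)
var_dump($count); // int(2)
```

Checking the return value with === false is what distinguishes "the pattern is broken" from "the pattern simply found nothing" (which returns 0).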

References:

On php.net, 'g' is not listed as a pattern modifier.
Modifier 'g': preg_match_all

I don't think that you even need preg_match_all. A simple preg_match should work, since you only need this one match anyway.

This should work (tested with PHP 7.0.1). It does for me:

preg_match('/^class\s+\w+\s*({.+(?<! )})/ms', $c, $match);
// or:
preg_match('/^class[^{]+({.+(?<! )})/ms', $c, $match);
// or even:
preg_match('/^{.+\r?\n}(?<! )/ms', $c, $match);

print_r($match);

The negative lookbehind in my regex rejects a } that is immediately preceded by a space, so in this case the closing brace of the class needs to be at the very left of its line. This will work unless you format your code differently. You need a delimiter anyway. The lookbehind also ensures that the indented closing brace of an if statement inside your run() method does not end the search.
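As a quick check of that behaviour, here is a sketch on a shortened, made-up version of the class (the `Foo` sample is illustrative only):

```php
<?php
$code = <<<'EEE'
<?php
class Foo
{
    public function run()
    {
        if ($x) {
            y();
        }
    }
}
// this can happen in comments: }, why
EEE;

// Greedy .+ runs to the end of the string and backtracks to the last }
// that is NOT directly preceded by a space.
preg_match('/^class[^{]+({.+(?<! )})/ms', $code, $m);

// $m[1] runs from the opening { of the class to the } at column zero.
echo $m[1], PHP_EOL;
```

The } in the trailing comment follows a space, so it is skipped. A comment such as //} after the class would still be picked up, because the character before that brace is not a space; the longer patterns 1) to 3) above are meant to cover those cases.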

print_r output of $match for the first preg_match statement above:

Array
(
    [0] => class LightTaskSchedulerService
{

    /**
     *
     * This method IS the task manager.
     * See the @page(Light_TaskScheduler conception notes) for more details.
     *
     */
    public function run()
    {
        $executionMode = $this->options['executionMode'] ?? "lastOnly";
        $this->logDebug("Executing run method with execution mode \"$executionMode\".");
        if ($foo) {
            doBar($foo);
        }
    }
}
    [1] => {

    /**
     *
     * This method IS the task manager.
     * See the @page(Light_TaskScheduler conception notes) for more details.
     *
     */
    public function run()
    {
        $executionMode = $this->options['executionMode'] ?? "lastOnly";
        $this->logDebug("Executing run method with execution mode \"$executionMode\".");
        if ($foo) {
            doBar($foo);
        }
    }
}
)
