PHP - REGEX - Match Length

I have this regex:

^((?:(?:\s*[a-zA-Z0-9]+)*)?)\s*function\s+([_a-zA-Z0-9]+)\s+\(\s*(.*)\s*\)\s*

to match this string:

public function private ($var,Type $typed, $optional = 'option');

It works, but when I try to match this one:

public function privateX ($var,Type $typed, $optional = 'option');

It fails.

I noticed that once the function's name is longer than 6 characters, it no longer matches.

Here is the full code:

$strA = 'public function 6Chars ($var,Type $typed, $optional = "option");';
$strB = 'public function MoreThan7 ($var,Type $typed, $optional = "option");';

preg_match('!^((?:(?:\s*[a-zA-Z0-9]+)*)?)\s*function\s+([_a-zA-Z0-9]+)\s+\(\s*(.*)\s*\)\s*!',$strA,$mA);
preg_match('!^((?:(?:\s*[a-zA-Z0-9]+)*)?)\s*function\s+([_a-zA-Z0-9]+)\s+\(\s*(.*)\s*\)\s*!',$strB,$mB);

print_r($mA);
print_r($mB);

My question is pretty simple: why does the second string not match?

I can't reproduce this in RegexBuddy; both declarations match there. However, the number of steps the regex engine needs to arrive at a match doubles with each additional character. A function name of 6 characters takes about 100,000 steps, 7 characters about 200,000 steps, 8 characters about 400,000 steps, and so on.

Perhaps the regex engine gives up after a certain number of steps?
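One way to check this in PHP is to call preg_last_error() right after preg_match(). The sketch below assumes the failure comes from PHP's pcre.backtrack_limit setting, which the question does not confirm. The helper name describeMatch is made up for illustration, and the outcome can differ between PHP versions and when the PCRE JIT is enabled.

```php
<?php
// Hypothetical helper: run preg_match() and describe how the call ended.
function describeMatch(string $pattern, string $subject): string
{
    $result = preg_match($pattern, $subject);
    $error  = preg_last_error();

    if ($error === PREG_BACKTRACK_LIMIT_ERROR) {
        // The engine stopped because pcre.backtrack_limit was reached.
        return 'backtrack limit reached';
    }
    if ($error !== PREG_NO_ERROR) {
        return 'error code ' . $error;
    }
    return $result === 1 ? 'match' : 'no match';
}

// The pattern and subjects from the question.
$pattern = '!^((?:(?:\s*[a-zA-Z0-9]+)*)?)\s*function\s+([_a-zA-Z0-9]+)\s+\(\s*(.*)\s*\)\s*!';
$strA = 'public function 6Chars ($var,Type $typed, $optional = "option");';
$strB = 'public function MoreThan7 ($var,Type $typed, $optional = "option");';

echo describeMatch($pattern, $strA), "\n";
echo describeMatch($pattern, $strB), "\n";
```

If the second line reports that the backtrack limit was reached, the pattern did not really "fail to match"; the engine simply gave up before finishing.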

A possessive quantifier (++) cuts down drastically on the number of steps needed by reducing the possible permutations the regex engine has to go through: about 50 steps regardless of the length of the function name.

!^((?:(?:\s*[a-zA-Z0-9]++)*)?)\s*function\s+([_a-zA-Z0-9]+)\s+\(\s*(.*)\s*\)\s*!
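As a quick check, here is a minimal sketch that runs the possessive version against both strings from the question and collects the captured function names. The variable names are only illustrative.

```php
<?php
// Same pattern as in the question, except [a-zA-Z0-9]++ is possessive.
$pattern = '!^((?:(?:\s*[a-zA-Z0-9]++)*)?)\s*function\s+([_a-zA-Z0-9]+)\s+\(\s*(.*)\s*\)\s*!';

$subjects = [
    'public function 6Chars ($var,Type $typed, $optional = "option");',
    'public function MoreThan7 ($var,Type $typed, $optional = "option");',
];

$names = [];
foreach ($subjects as $subject) {
    if (preg_match($pattern, $subject, $m) === 1) {
        $names[] = $m[2]; // group 2 holds the function name
    }
}

print_r($names);
```

Both declarations should match, with group 1 holding the modifier (public) and group 3 the argument list.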

The reason for the catastrophic backtracking you're seeing in your regex is this:

(?:(?:\s*[a-zA-Z0-9]+)*)

You are nesting quantifiers, and you've made the spaces optional. Therefore ABC can be matched as ABC, A / BC, AB / C or A / B / C. The number of permutations rises exponentially with every character. You further complicate matters by making the entire group optional (the ? that follows the whole group).
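To make the growth concrete, here is a small sketch that enumerates every way a word can be cut into consecutive chunks, which is what the nested quantifiers allow the engine to try. The function name chunkings is hypothetical, and the count only illustrates the doubling; it is not the engine's actual step count.

```php
<?php
// Hypothetical helper: list every way to cut $s into contiguous,
// non-empty chunks. Each bit of $mask decides whether to cut before
// the corresponding character, so there are 2^(n-1) results.
function chunkings(string $s): array
{
    $n = strlen($s);
    $result = [];
    if ($n === 0) {
        return $result;
    }
    for ($mask = 0; $mask < (1 << ($n - 1)); $mask++) {
        $chunks  = [];
        $current = $s[0];
        for ($i = 1; $i < $n; $i++) {
            if ($mask & (1 << ($i - 1))) { // cut before character $i
                $chunks[] = $current;
                $current  = '';
            }
            $current .= $s[$i];
        }
        $chunks[] = $current;
        $result[] = implode(' / ', $chunks);
    }
    return $result;
}

foreach (chunkings('ABC') as $line) {
    echo $line, "\n";
}
echo count(chunkings('privateX')), "\n";
```

Every extra character doubles the number of chunkings, which is consistent with the doubling of steps described above.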

You just need to enable the multiline flag (/m), and then it will match both lines. I tested it; you can confirm at the link below. Cheers.

gskinner
