
question mark in regular expression

I saw this regular expression applied to a URL:

$url = 'http://www.domain.com/';
preg_match('/(http)(.*?)\n/', $url, $matches);

I am not sure what the question mark "?" does in this regular expression. According to regex manuals, "?" is a metacharacter equivalent to {0,1}. So what is the point of putting "?" after a *, since * already represents {0,}?

Can someone please enlighten me? Thanks.

It has a different meaning when it follows another quantifier.

In this case it changes the matching behaviour of the preceding quantifier. The default behaviour is greedy, and the ? changes it to "ungreedy" (also called lazy).

  • "Greedy" means match as much as possible

  • "Ungreedy" means match as little as possible

See the article on regular-expressions.info

For example:

a.+b will match "aabxb" in aabxb

a.+?b will match only "aab" in aabxb
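
The same comparison can be reproduced in PHP. This is only a minimal sketch; the variable names $greedy and $lazy are chosen here for illustration and are not from the original answer.

```php
<?php
// Greedy: .+ first takes as much as it can, then backtracks to the last "b".
preg_match('/a.+b/', 'aabxb', $greedy);

// Ungreedy: .+? takes as little as it can and stops at the first "b" that works.
preg_match('/a.+?b/', 'aabxb', $lazy);

echo $greedy[0], "\n"; // aabxb
echo $lazy[0], "\n";   // aab
```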

See the example here on Regexr

You may be interested in my blog post about this topic: You do know Quantifiers. Really?

About your regex

preg_match('/(http)(.*?)\n/', $url, $matches);

I don't think it makes a difference here. By default, . matches anything except newline characters (you can change this by adding an s modifier after the closing regex delimiter), so whether or not the question mark is there, .* will match only up to the first \n.

If you change the behaviour by using preg_match('/(http)(.*?)\n/s', $url, $matches); , it will make a difference: .*\n would match up to the last \n, while .*?\n will stop at the first \n.
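
To see the effect of the s modifier, here is a small sketch run against a subject that actually contains newlines. The two-line $text string is a made-up input for illustration; the $url from the question contains no newline at all, so the original pattern does not match it, as the last call shows.

```php
<?php
// Hypothetical subject: the URL followed by a second line, so there are two newlines.
$text = "http://www.domain.com/\nsecond line\n";

// Without the s modifier, . cannot cross a newline, so both variants stop at the first \n.
preg_match('/(http)(.*)\n/',  $text, $greedy);
preg_match('/(http)(.*?)\n/', $text, $lazy);

// With the s modifier, . also matches newlines, so greedy and ungreedy now differ.
preg_match('/(http)(.*)\n/s',  $text, $greedyDotAll);
preg_match('/(http)(.*?)\n/s', $text, $lazyDotAll);

var_dump($greedy[2] === $lazy[2]);   // bool(true)
var_dump($greedyDotAll[2] === "://www.domain.com/\nsecond line"); // bool(true)
var_dump($lazyDotAll[2] === '://www.domain.com/');                // bool(true)

// The URL from the question has no newline, so preg_match reports no match.
$found = preg_match('/(http)(.*?)\n/', 'http://www.domain.com/', $none);
var_dump($found); // int(0)
```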

In this case, the question mark means a "stingy" match. It will stop matching as soon as the first \n is encountered, whereas otherwise, provided . is allowed to match newlines, it would gobble up intervening \n characters until the last one.

More about greedy and stingy matching at http://www.perl.com/doc/FMTEYEWTK/regexps.html
