
Why does *.* in regex return undefined

In JavaScript at least (tested in both Chrome and Node.js):

new RegExp(/foo(optional)*boo/).exec('foooptionalboo')

matches and captures the optional in parentheses:

[ 'foooptionalboo',
'optional',
index: 0,
input: 'foooptionalboo' ]

But if you allow arbitrary text on either side of the optional part:

new RegExp(/foo.*(optional)*.*boo/).exec('foooptionalboo')

Then the optional part is not captured, and the group comes back as undefined:

[ 'foooptionalboo',
undefined,
index: 0,
input: 'foooptionalboo' ]

Why is this?

The .* matches optional before (optional)* has a chance to.
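A quick way to see this is to wrap each .* in its own capture group and inspect what it consumed. The extra groups and the variable name traced are additions for illustration only; they are not part of the original pattern.

```javascript
// Same pattern as in the question, but with both .* wrapped in capture
// groups (added here for illustration) to show what each one consumed.
const traced = /foo(.*)(optional)*(.*)boo/.exec('foooptionalboo');

console.log(traced[1]); // 'optional'  (taken by the first greedy .*)
console.log(traced[2]); // undefined   ((optional)* matched zero times)
console.log(traced[3]); // ''          (the second .* matched nothing)
```

The first .* initially grabs everything after foo and then backtracks only far enough for boo to match, which leaves nothing for (optional)* to capture.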

Make the first .* non-greedy (with a ?) so that it consumes as little as possible and (optional)* gets the first chance to match.

/foo.*?(optional)*.*boo/.exec("foooptionalboo")

The problem with Quentin's answer is that when .*? is followed by an optional greedy subpattern, (optional)*, and then by a greedy dot pattern, .*, the .*? only ever matches the empty string and the trailing .* takes up the whole rest of the string.

Why does this happen? A lazy subpattern that can match an empty string starts by matching nothing. The subpatterns to its right are then tried, and if they produce an overall match, the lazy subpattern is never expanded. Here both (optional)* and .* can always match, so the lazy .*? never grows past the empty string.
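The difference shows up as soon as some text sits between foo and optional. This is a minimal sketch; the input 'foo123optionalboo' and the variable names are made up for illustration.

```javascript
// The lazy variant from the answer above.
const lazy = /foo.*?(optional)*.*boo/;

// "optional" directly follows "foo": .*? matches nothing and the group captures.
const adjacent = lazy.exec('foooptionalboo');
console.log(adjacent[1]); // 'optional'

// Something precedes "optional": .*? still matches nothing, (optional)*
// matches zero times at "1", and the trailing .* swallows "123optional".
const separated = lazy.exec('foo123optionalboo');
console.log(separated[0]); // 'foo123optionalboo'
console.log(separated[1]); // undefined
```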

To really grab an optional part, either use a specific pattern where no .* appears after the optional part, or (to make it more generic) use a tempered greedy token:

foo(?:(?!optional).)*(optional)*.*boo
   ^^^^^^^^^^^^^^^^^^


The (?:(?!optional).)* is the tempered greedy token that matches any text up to the first optional substring.
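As a rough check, the tempered pattern can be run against inputs with and without the optional part. The sample strings and variable names below are assumptions chosen for illustration.

```javascript
// (?:(?!optional).)* consumes one character at a time, but only while
// the text ahead does not start with "optional".
const tempered = /foo(?:(?!optional).)*(optional)*.*boo/;

// Text on both sides of the optional part: the group is captured.
const withOptional = tempered.exec('foo123optional456boo');
console.log(withOptional[1]); // 'optional'

// No optional part at all: the pattern still matches, the group is undefined.
const withoutOptional = tempered.exec('foo123boo');
console.log(withoutOptional[0]); // 'foo123boo'
console.log(withoutOptional[1]); // undefined
```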
