
Python regular expressions: why is the + quantifier not greedy?

Input: asjkd http://www.as.com/as/g/ff askl

Expected output: http://www.as.com/as/g/ff

When I try the code below, I get the expected output:

import re

# "http" followed by one or more word characters, dots, slashes, or colons
pattern = re.compile(r'http[\w./:]+')
print(pattern.search("asjkd http://www.as.com/as/g/ff askl"))

Why isn't the + quantifier greedy here? I expected it to be greedy and keep matching past the URL. As it stands, its apparent non-greediness is what produces the right answer.

It is greedy. It stops matching when it hits the space because [\w./:] doesn't match a space. A space isn't a word character (alphanumeric or underscore), dot, slash, or colon.
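To see that the character class, not the quantifier, is what ends the match, here is a small sketch that compares the original pattern with a variant using `.` in place of the class. The variable names `class_match` and `dot_match` are only illustrative and are not from the question.

```python
import re

text = "asjkd http://www.as.com/as/g/ff askl"

# Greedy + on the character class: consumes as much as the class allows,
# which means it must stop at the first space.
class_match = re.search(r'http[\w./:]+', text)

# Greedy + on ".", which matches any character except a newline:
# nothing stops it, so it runs to the end of the string.
dot_match = re.search(r'http.+', text)

print(class_match.group())  # http://www.as.com/as/g/ff
print(dot_match.group())    # http://www.as.com/as/g/ff askl
```

Both quantifiers are greedy; they differ only in which characters they are permitted to consume.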

Change + to +? and you can see what happens when it's non-greedy.

Greedy

>>> import re
>>> pattern=re.compile(r'http[\w./:]+')
>>> print(pattern.search("asjkd http://www.as.com/as/g/ff askl"))
<re.Match object; span=(6, 31), match='http://www.as.com/as/g/ff'>

Non-greedy

>>> pattern=re.compile(r'http[\w./:]+?')
>>> print(pattern.search("asjkd http://www.as.com/as/g/ff askl"))
<re.Match object; span=(6, 11), match='http:'>

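The non-greedy +? matches only a single character, the colon, because one character is the minimum that satisfies it and nothing after it in the pattern forces it to take more.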
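The difference between the two quantifiers becomes more visible when something follows them in the pattern. The sketch below appends a literal `/` to both versions; this extra `/` and the variable names are assumptions added for illustration and are not part of the original question.

```python
import re

text = "asjkd http://www.as.com/as/g/ff askl"

# Lazy: take as few class characters as possible, then require "/".
# One character (":") is enough, because the next character is "/".
lazy = re.search(r'http[\w./:]+?/', text)

# Greedy: take as many class characters as possible, then backtrack
# just far enough for the final "/" to match the last slash in the URL.
greedy = re.search(r'http[\w./:]+/', text)

print(lazy.group())    # http:/
print(greedy.group())  # http://www.as.com/as/g/
```

In both cases the match still cannot cross the space, since the character class excludes it.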

