Input: asjkd http://www.as.com/as/g/ff askl
Expected output: http://www.as.com/as/g/ff
When I try the code below, I get the expected output:
import re
pattern = re.compile(r'http[\w./:]+')
print(pattern.search("asjkd http://www.as.com/as/g/ff askl"))
Why isn't the + quantifier greedy here? I was expecting it to be greedy. In this case, not being greedy actually helps find the right answer.
It is greedy. It stops matching when it hits the space because [\w./:] doesn't match a space. A space isn't a word character (alphanumeric or underscore), dot, slash, or colon.
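One way to convince yourself of this is to let the character class match a space and watch the greedy + keep going. This is only a sketch: the second pattern, with a space added to the class, is a hypothetical variant for illustration and not something from the question.

```python
import re

text = "asjkd http://www.as.com/as/g/ff askl"

# Original class: a space is not allowed, so the greedy + stops at the space.
strict = re.search(r'http[\w./:]+', text).group()

# Hypothetical variant: a space is added to the class, so the greedy +
# consumes everything up to the end of the string.
loose = re.search(r'http[\w./: ]+', text).group()

print(strict)  # http://www.as.com/as/g/ff
print(loose)   # http://www.as.com/as/g/ff askl
```

In both cases + takes as many characters as the class allows. The only difference is which characters the class allows.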
Change + to +? and you can see what happens when it's non-greedy.
Greedy
>>> pattern=re.compile(r'http[\w./:]+')
>>> print(pattern.search("asjkd http://www.as.com/as/g/ff askl"))
<re.Match object; span=(6, 31), match='http://www.as.com/as/g/ff'>
Non-greedy
>>> pattern=re.compile(r'http[\w./:]+?')
>>> print(pattern.search("asjkd http://www.as.com/as/g/ff askl"))
<re.Match object; span=(6, 11), match='http:'>
Here [\w./:]+? matches only a single character, the colon (:)!
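The difference between the two quantifiers is easier to see when something has to match after the character class. The sketch below appends a literal / to both patterns. That trailing slash is an assumption added for illustration and is not part of the original patterns.

```python
import re

text = "asjkd http://www.as.com/as/g/ff askl"

# Greedy: the class grabs as much as it can, then backtracks just far
# enough for the trailing / to match, so the match ends at the last slash.
greedy = re.search(r'http[\w./:]+/', text).group()

# Non-greedy: the class takes as little as it can, so the match ends at
# the first slash that follows at least one class character.
lazy = re.search(r'http[\w./:]+?/', text).group()

print(greedy)  # http://www.as.com/as/g/
print(lazy)    # http:/
```

With nothing after the class, as in the original non-greedy pattern, the lazy quantifier is satisfied by one character and stops immediately.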