I'm creating a regex as below:
import re
asd = re.compile(r"(blah){2}")
mo = asd.search("blahblahblahblahblahblah ll2l 21HeHeHeHeHeHe lllo")
mo1 = asd.findall("blahblahblahblahblahblah")
print(mo.group())
print("findall output: ", mo1)
This prints:
blahblah
findall output: ['blah', 'blah', 'blah']
- Why does findall return 'blah' three times when the pattern specifies {2} only?
If I change it to {4}, then findall returns:
asd = re.compile(r"(blah){4}")
findall output: ['blah']
- How is {m} treated by re.search and re.findall?
Thanks a lot.
If you want to capture the whole (blah){2} (the two blahs you have there), you should wrap it in an outer group:
asd = re.compile(r"((?:blah){2})")
Note that I made sure not to capture the inner blah (using ?:).
>>> asd = re.compile(r"((?:blah){2})")
>>> mo = asd.search("blahblahblahblahblahblah ll2l 21HeHeHeHeHeHe lllo")
>>> mo1 = asd.findall("blahblahblahblahblahblah")
>>> print(mo.group())
blahblah
>>> print("findall output: ", mo1)
findall output: ['blahblah', 'blahblah', 'blahblah']
Exactly the same goes for the {4} you have there. The regex will match it, but will not capture it as a whole. If you want to capture it, you should wrap it.
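As a quick check of the {4} case, here is a minimal sketch comparing the capturing form with the wrapped form on the same six-blah string. The variable names are illustrative and not from the question.

```python
import re

text = "blah" * 6  # same 24-character string as in the question

# Capturing group: findall returns the group's content, i.e. the last "blah"
capturing = re.findall(r"(blah){4}", text)

# Outer capturing group around a non-capturing repeat: the whole run is returned
wrapped = re.findall(r"((?:blah){4})", text)

# No capturing group at all: findall falls back to the whole match
no_group = re.findall(r"(?:blah){4}", text)

print(capturing)  # ['blah']
print(wrapped)    # ['blahblahblahblah']
print(no_group)   # ['blahblahblahblah']
```

Only one match is found in each case because the two blahs left over after the first match are too few to satisfy {4}.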
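(blah){2} matches and consumes the string blahblah, but the group only retains the last blah in blahblah. When the pattern contains a capturing group, findall returns the group's content rather than the whole match. Since you have three blahblahs in your string, it will output ['blah', 'blah', 'blah'].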
(blah){4} can only match once, so it gives you ['blah'].
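To see the whole match and the captured group side by side, one option is re.finditer, which yields match objects instead of group strings. This is only a sketch; the names pattern, text, and matches are made up for the illustration.

```python
import re

text = "blahblahblahblahblahblah"
pattern = re.compile(r"(blah){2}")

# search returns the first match; group(0) is the whole match,
# group(1) is what the repeated group captured on its last repetition
first = pattern.search(text)
print(first.group(0))  # blahblah
print(first.group(1))  # blah

# finditer shows every non-overlapping match with its span and group content
matches = [(m.group(0), m.group(1), m.span()) for m in pattern.finditer(text)]
for whole, last_group, span in matches:
    print(whole, last_group, span)
# blahblah blah (0, 8)
# blahblah blah (8, 16)
# blahblah blah (16, 24)
```

So {2} is applied the same way by search and findall: each match spans two blahs. The difference is only in what gets reported, the whole match from mo.group() versus the group's last capture from findall.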