I'm trying to get everything from a webpage up until the second occurrence of the word matchdate.
(.*?matchdate){2}
is what I'm trying, but that's not doing the trick. The page has 14+ matches of "matchdate" and I only want to get everything up to the second one, and then nothing else.
https://regex101.com/r/Cjyo0f/1 <--- my saved regex.
What am I missing here?
Thanks.
There are a couple ways you can do this:
Remove the g flag. Without the global flag, the regex will only grab the first match it encounters.
https://regex101.com/r/Cjyo0f/2
Add a ^ to the front of the regex. A caret forces the regex to match from the beginning of the string, ruling out all other possibilities.
https://regex101.com/r/Cjyo0f/3
Use .split() and .join(). If plain Python is available, I would recommend:
string = "I like to matchdate, I want to eat matchdate for breakfast"
# Keep the first two pieces and put the separator back between them.
print("matchdate".join(string.split("matchdate")[:2]))
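The split/join idea generalizes to "everything before the n-th occurrence". Below is a minimal sketch of that; the function name text_before_nth and its include_word parameter are made up for illustration and are not part of any library. Note that the plain split/join approach stops just before the second matchdate, so the sketch adds an option to include it.

```python
def text_before_nth(text, word, n, include_word=False):
    """Return the part of `text` that precedes the n-th occurrence of `word`.

    Hypothetical helper for illustration. If `word` occurs fewer than n times,
    the whole text is returned unchanged.
    """
    # maxsplit=n stops splitting after the n-th occurrence, so a large page
    # is not cut into more pieces than needed.
    parts = text.split(word, n)
    if len(parts) <= n:  # fewer than n occurrences found
        return text
    prefix = word.join(parts[:n])
    return prefix + word if include_word else prefix


sample = "I like to matchdate, I want to eat matchdate for breakfast"
print(repr(text_before_nth(sample, "matchdate", 2)))
# 'I like to matchdate, I want to eat '
print(repr(text_before_nth(sample, "matchdate", 2, include_word=True)))
# 'I like to matchdate, I want to eat matchdate'
```

Passing n as maxsplit is a design choice, not a requirement: slicing an unbounded split with [:n] gives the same result, it just does more work on a page with many occurrences.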
You almost had it! (.*?matchdate){2} was actually correct. It just needs the re.DOTALL flag so that the dot matches newlines as well as other characters.
Here is a working test:
>>> import re
>>> s = '''First line
Second line
Third with matchdate and more
Fourth line
Fifth with matchdate and other
stuff you're
not interested in
like another matchdate
or a matchdate redux.
'''
>>> print(re.search('(.*?matchdate){2}', s, re.DOTALL).group())
First line
Second line
Third with matchdate and more
Fourth line
Fifth with matchdate
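To see why the flag matters, here is a small sketch that runs the same pattern with and without it. The sample text and variable names are invented for the demonstration. It also uses the inline (?s) form, which Python treats the same as re.DOTALL and which should correspond to the s (single line) flag on regex101, plus a leading ^ and a None check for pages with fewer than two occurrences.

```python
import re

text = (
    "line one\n"
    "has matchdate here\n"
    "line three\n"
    "matchdate again\n"
    "matchdate third\n"
)

# Without DOTALL the dot stops at newlines, so both occurrences would have
# to sit on one line. They do not here, so there is no match at all.
without_flag = re.search(r"(?:.*?matchdate){2}", text)
print(without_flag)  # None

# (?s) is the inline spelling of re.DOTALL; ^ pins the match to the start.
pattern = re.compile(r"(?s)^(?:.*?matchdate){2}")
m = pattern.search(text)
result = m.group() if m else None
print(result)

# A page with a single occurrence yields no match, so guard against None.
too_few = pattern.search("only one matchdate on this page")
print(too_few)  # None
```

The group is written as non-capturing, (?:...), because only the whole match is needed; the capturing version from the question behaves the same with .group().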