I have the following regular expression:
>>> re.findall('http://www.rottentomatoes.com/.+', html)
['http://www.rottentomatoes.com/m/1129132-torque" class="see-all">Read More About This Movie On Rotten Tomatoes</a>']
How would I get this to match only up to the first " character? I am trying to get the result to be:
http://www.rottentomatoes.com/m/1129132-torque
Use a non-greedy quantifier (+? instead of +) so the match stops at the first ":
>>> html = 'http://www.rottentomatoes.com/m/1129132-torque" class="see-all">Read More About This Movie On Rotten Tomatoes</a>'
>>> re.search(r'(http://www\.rottentomatoes\.com/.+?)"', html).group(1)
'http://www.rottentomatoes.com/m/1129132-torque'
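To see why the ? matters, here is a small sketch comparing the greedy and non-greedy forms on the same string. The variable names greedy and lazy are only for illustration and are not part of the original answer.

```python
import re

html = ('http://www.rottentomatoes.com/m/1129132-torque" class="see-all">'
        'Read More About This Movie On Rotten Tomatoes</a>')

# Greedy: .+ grabs as much as possible, then backtracks to the LAST quote.
greedy = re.search(r'(http://www\.rottentomatoes\.com/.+)"', html).group(1)

# Non-greedy: .+? grabs as little as possible, stopping at the FIRST quote.
lazy = re.search(r'(http://www\.rottentomatoes\.com/.+?)"', html).group(1)

print(greedy)  # runs past the URL, up to the last quote in the string
print(lazy)    # just the URL
```

The capturing group keeps the closing " out of the returned value in both cases; only the place where the match stops differs.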
Just add the character (") where you want the match to stop, and add ? after the + so that it stops at the first " instead of the last one. Note that the " itself is included in the result:
>>> html='http://www.rottentomatoes.com/m/1129132-torque" class="see-all">Read More About This Movie On Rotten Tomatoes</a>'
>>> re.findall(r'http://www\.rottentomatoes\.com/.+?"', html)
['http://www.rottentomatoes.com/m/1129132-torque"']
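If the trailing " should not appear in the result, one possible alternative (not from the answers above) is a negated character class, which matches everything that is not a quote. This is a minimal sketch assuming the URL never contains a " itself; the variable name urls is illustrative.

```python
import re

html = ('http://www.rottentomatoes.com/m/1129132-torque" class="see-all">'
        'Read More About This Movie On Rotten Tomatoes</a>')

# [^"]+ matches one or more characters that are not a double quote,
# so the match ends just before the first quote without needing +?.
urls = re.findall(r'http://www\.rottentomatoes\.com/[^"]+', html)

print(urls)
```

Because the quote is never consumed, findall returns the bare URL and no capturing group is needed.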