How can I match the price in this string?
<div id="price_amount" itemprop="price" class="h1 text-special">
$58
</div>
I want to extract the $58 from this string. How can I do that? This is what I am trying, but it doesn't work:
regex = r'<div id="price_amount" itemprop="price" class="h1 text-special">(.+?)</div>'
price = re.findall(regex, string)
You really should not use regex for this particular problem. Look into an XML/HTML parsing library for Python instead.
Having said that, your regex is just missing a match for the newlines: by default . does not match a newline, so (.+?) cannot span the line breaks around $58. You need to add \s* after the opening tag and before the closing tag.
import re
string="""
<div id="price_amount" itemprop="price" class="h1 text-special">
$58
</div>
"""
regex = r'<div id="price_amount" itemprop="price" class="h1 text-special">\s*(.+?)\s*</div>'
price = re.findall(regex, string)
print(price)  # => ['$58']
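For the parser-based route recommended above, here is a minimal sketch using only the standard library's html.parser module. The class name PriceParser and the helper extract_price are made up for illustration, and the sketch assumes the price div contains no nested div elements; a third-party library would handle messier markup more robustly.

```python
from html.parser import HTMLParser


class PriceParser(HTMLParser):
    """Collect the text inside the element whose id is "price_amount"."""

    def __init__(self):
        super().__init__()
        self.in_price = False   # True while we are inside the target div
        self.chunks = []        # text fragments seen inside the target div

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if dict(attrs).get("id") == "price_amount":
            self.in_price = True

    def handle_endtag(self, tag):
        # Assumption: no nested <div> inside the price element
        if self.in_price and tag == "div":
            self.in_price = False

    def handle_data(self, data):
        if self.in_price:
            self.chunks.append(data)


def extract_price(html):
    """Return the stripped text of the price_amount div ('' if absent)."""
    parser = PriceParser()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks).strip()


html = """
<div id="price_amount" itemprop="price" class="h1 text-special">
$58
</div>
"""
print(extract_price(html))
```

Because the lookup keys on the id attribute alone, it keeps working if the class list or the attribute order changes, which is the main weakness of the regex version.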
Try capturing only the price between the <div> and </div> tags:
import re
# Named "string" rather than "str" to avoid shadowing the built-in str type
string = ('<div id="price_amount" itemprop="price" class="h1 text-special">'
          '$58'
          '</div>')
regex = r'<div id="price_amount" itemprop="price" class="h1 text-special">([^<]*?)</div>'
price = re.search(regex, string)
price.group(1) # => '$58'
([^<]*?) matches any character other than <, zero or more times, and stores the matched text in a capture group (group 1). The ? following the * makes the match non-greedy.
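To make the difference between these patterns concrete, here is a small sketch comparing a greedy match, a non-greedy match, and the negated character class. The input strings are invented for illustration and use bare <div> tags to keep the patterns short.

```python
import re

two_prices = '<div>$58</div><div>$60</div>'

# Greedy: .+ runs to the LAST </div>, swallowing the tags in between
greedy = re.findall(r'<div>(.+)</div>', two_prices)

# Non-greedy: .+? stops at the FIRST </div> it can reach
lazy = re.findall(r'<div>(.+?)</div>', two_prices)

# Negated class: [^<]* can never cross a '<', so it also stops at the first tag
negated = re.findall(r'<div>([^<]*)</div>', two_prices)

print(greedy)   # ['$58</div><div>$60']
print(lazy)     # ['$58', '$60']
print(negated)  # ['$58', '$60']

# With line breaks around the price, as in the question:
multiline = '<div>\n$58\n</div>'

# . does not match '\n' by default, so this finds nothing
dot_result = re.findall(r'<div>(.+?)</div>', multiline)

# [^<] does match '\n', so the capture includes the newlines; strip them
class_result = [m.strip() for m in re.findall(r'<div>([^<]*)</div>', multiline)]

print(dot_result)    # []
print(class_result)  # ['$58']
```

Note that the negated class matches newlines, so against the multi-line HTML from the question it captures the surrounding whitespace as well; either strip the result or add \s* on both sides of the group.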