
How can I match the price in this string with a Python regex?

How can I match the price in this string?

    <div id="price_amount" itemprop="price" class="h1 text-special">
      $58
    </div>

I want to extract the $58 from this string. How can I do that? This is what I am trying, but it doesn't work:

    regex = r'<div id="price_amount" itemprop="price" class="h1 text-special">(.+?)</div>'
    price = re.findall(regex, string)

You really should not use regex for this particular problem. Look into an XML/HTML parsing library for Python instead.
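
As an illustration of the parser-based approach, here is a minimal sketch that uses only the standard library's html.parser module. The class name PriceParser and its attributes in_price and chunks are hypothetical names chosen for this example, and the sketch assumes the price <div> contains no nested <div> elements.

```python
from html.parser import HTMLParser


class PriceParser(HTMLParser):
    """Collects the text inside the element whose id is "price_amount"."""

    def __init__(self):
        super().__init__()
        self.in_price = False  # True while we are inside the target <div>
        self.chunks = []       # text fragments found inside it

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if tag == "div" and dict(attrs).get("id") == "price_amount":
            self.in_price = True

    def handle_endtag(self, tag):
        # Assumption: no nested <div>, so any </div> ends the target element
        if tag == "div":
            self.in_price = False

    def handle_data(self, data):
        if self.in_price:
            self.chunks.append(data)


html = """
<div id="price_amount" itemprop="price" class="h1 text-special">
  $58
</div>
"""

parser = PriceParser()
parser.feed(html)
parser.close()
price = "".join(parser.chunks).strip()
print(price)  # $58
```

Unlike the regex, this does not depend on the exact order or spelling of the other attributes in the opening tag.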

Having said that, your regex fails only because . does not match newlines by default, and the price is surrounded by newlines and indentation. Add \s* after the opening tag and before the closing tag to absorb that whitespace.

    import re

    string = """
        <div id="price_amount" itemprop="price" class="h1 text-special">
          $58
        </div>
        """
    # \s* absorbs the newlines and indentation around the price
    regex = r'<div id="price_amount" itemprop="price" class="h1 text-special">\s*(.+?)\s*</div>'
    price = re.findall(regex, string)
    print(price)  # ['$58']
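
To see why the original pattern returns nothing, the following sketch runs it with and without the re.DOTALL flag, which makes . match newlines as well. This is an alternative to adding \s*, not part of the answer above; the variable names open_tag, no_match, and dotall are made up for the example.

```python
import re

html = """
    <div id="price_amount" itemprop="price" class="h1 text-special">
      $58
    </div>
    """
open_tag = r'<div id="price_amount" itemprop="price" class="h1 text-special">'

# Original pattern: "." cannot match the newline right after the opening tag
no_match = re.findall(open_tag + r'(.+?)</div>', html)

# With re.DOTALL, "." also matches newlines; strip the captured whitespace
dotall = [m.strip() for m in re.findall(open_tag + r'(.+?)</div>', html, re.DOTALL)]

print(no_match)  # []
print(dotall)    # ['$58']
```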

Try capturing only the price between the <div> and </div> tags:

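    import re

    # Adjacent string literals are concatenated, so this markup contains no newlines.
    # The variable is named html to avoid shadowing the built-in str.
    html = ('<div id="price_amount" itemprop="price" class="h1 text-special">'
            '$58'
            '</div>')
    regex = r'<div id="price_amount" itemprop="price" class="h1 text-special">([^<]*?)</div>'
    price = re.search(regex, html)
    print(price.group(1))  # $58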

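([^<]*?) matches any character other than < zero or more times and stores the matched text in capture group 1. The ? after * makes the match non-greedy. Because [^<] can never consume the < of </div>, the greedy form [^<]* captures the same text here. Note that a negated character class also matches newlines, so on the multi-line markup from the question the captured group includes the surrounding whitespace.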
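
A short sketch of that behaviour on the multi-line markup from the question. It assumes the same opening tag as above; the names raw, lazy, and greedy are illustrative only.

```python
import re

html = """
<div id="price_amount" itemprop="price" class="h1 text-special">
  $58
</div>
"""
open_tag = r'<div id="price_amount" itemprop="price" class="h1 text-special">'

lazy = re.search(open_tag + r'([^<]*?)</div>', html)
greedy = re.search(open_tag + r'([^<]*)</div>', html)

# The negated class matches newlines, so the group keeps the whitespace
raw = lazy.group(1)
price = raw.strip()

print(repr(raw))                  # '\n  $58\n'
print(price)                      # $58
print(raw == greedy.group(1))     # True
```

Calling strip() on the captured group gives the same result as the \s* version without changing the pattern.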
