
How to not capture part of a string with a regex

I have this string:

<div class"ewSvNa"><a class="ugP" href="link">Description</a><span data-testid=""><small>$</small><span>0,00</span></div>

and this regex:

/ewS.*?ugP\".*?f=\"(.*?)\">(.*?)<.*?<s.*?n>(.*?)</g

The result is:

Group 1 = 'link'
Group 2 = 'Description'
Group 3 = '0,00'

My question is: is it possible to get Group 3 as '$0,00'?

Thank you, guys =]

It's recommended not to use regex to parse HTML; instead, use a proper parser such as Beautiful Soup.

Then your code becomes:

from bs4 import BeautifulSoup

text = '<div class"ewSvNa"><a class="ugP" href="link">Description</a><span data-testid=""><small>$</small><span>0,00</span></div>'
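# Name the parser explicitly so the result does not depend on which parsers are installed
soup = BeautifulSoup(text, 'html.parser')
# The outer <span data-testid> holds both <small>$</small> and <span>0,00</span>
amount = soup.select_one('span[data-testid]').get_text()
# amount is '$0,00'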
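
If you would rather stay with regex: a single capture group can only hold one contiguous piece of the match, so '$' and '0,00' (which are separated by '</small><span>') cannot end up in the same group. A minimal sketch, assuming the markup always has the <small>...</small><span>...</span> shape shown in the question, is to capture the two pieces in separate groups and join them afterwards. The variable names here are only illustrative:

```python
import re

text = '<div class"ewSvNa"><a class="ugP" href="link">Description</a><span data-testid=""><small>$</small><span>0,00</span></div>'

# Same idea as the regex in the question, but the currency symbol and the
# amount each get their own group (3 and 4) instead of skipping the symbol.
pattern = re.compile(r'ewS.*?ugP".*?f="(.*?)">(.*?)<.*?<small>(.*?)</small><span>(.*?)<')

match = pattern.search(text)
link = match.group(1)
description = match.group(2)
# Join the two non-adjacent pieces after matching.
price = match.group(3) + match.group(4)

print(link, description, price)
```

This is still fragile against any change in the HTML, which is why the parser approach above is the safer choice.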
