I am trying to get a price from a website using BeautifulSoup and so far I have managed to get:
<h2>£<!-- -->199.99</h2>
I just want to get '£199.99'. Is there a way to filter out everything else (the tags and the comment)?
Thanks in advance
You can use the get_text function, passing strip=True to remove surrounding whitespace if necessary:
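from bs4 import BeautifulSoup

html = '<h2>£<!-- -->199.99</h2>'
# 'html5lib' must be installed separately; the built-in 'html.parser' also works here
soup = BeautifulSoup(html, 'html5lib')
# get_text() skips the HTML comment and joins the remaining text nodes
result = soup.find('h2').get_text(strip=True)
print(result)
# £199.99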
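On a real page the h2 may be missing, in which case find returns None and calling get_text on it raises AttributeError. A minimal sketch of a guarded version follows; the helper name extract_price and its tag parameter are hypothetical, and it assumes the built-in 'html.parser' is acceptable instead of html5lib.

```python
from bs4 import BeautifulSoup


def extract_price(html, tag="h2"):
    """Return the stripped text of the first `tag`, or None if it is absent.

    `extract_price` is an illustrative helper, not part of BeautifulSoup.
    """
    soup = BeautifulSoup(html, "html.parser")
    node = soup.find(tag)
    if node is None:
        # No matching element: report that instead of raising AttributeError
        return None
    # HTML comments are not included in get_text() output
    return node.get_text(strip=True)


print(extract_price('<h2>£<!-- -->199.99</h2>'))
print(extract_price('<p>no price here</p>'))
```

The first call prints the price text and the second prints None, so the caller can decide how to handle a page without a price.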
Alternatively, you could use the re module:
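import re

s = "<h2>£<!-- -->199.99</h2>"
# Match a run of digits and dots, e.g. 199.99
rx_price = re.compile(r'([0-9.]+)')
# Remove anything between angle brackets (tags and the comment)
content = re.sub(r'<.+?>', '', s)
# The currency symbol is hardcoded here rather than read from the markup
print(f"£{rx_price.findall(content)[0]}")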
Output:
£199.99
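The regex approach above hardcodes the £ sign and fails with an IndexError when no number is found. A possible variation, sketched under the assumption that the symbol is one of £, $ or €, captures the symbol from the text and returns the amount as a Decimal. The names PRICE_RX and parse_price are hypothetical.

```python
import re
from decimal import Decimal

# Currency symbol, optional whitespace, then digits with an optional decimal part.
# The set of symbols is an assumption; extend it as needed.
PRICE_RX = re.compile(r'([£$€])\s*([0-9]+(?:\.[0-9]+)?)')


def parse_price(fragment):
    """Return (symbol, Decimal amount) from an HTML fragment, or None."""
    # Strip comments first, then any remaining tags
    text = re.sub(r'<!--.*?-->|<[^>]+>', '', fragment, flags=re.S)
    match = PRICE_RX.search(text)
    if match is None:
        return None
    symbol, amount = match.groups()
    return symbol, Decimal(amount)


result = parse_price("<h2>£<!-- -->199.99</h2>")
if result is not None:
    symbol, amount = result
    print(f"{symbol}{amount}")
```

Regular expressions are fragile on arbitrary HTML, so this is only reasonable for small, predictable fragments like the one in the question.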