Get data from an HTML page using Python

I would like to get the value 100 from the tag below using Python and Beautiful Soup:

<span style="font-size:90%"><b>100</b> <cite style="color:#cc0000"><b>-0.10</b> (0.52%)</cite></span>

The code below gives me the following output:

100 -0.10 (0.52%)

How can I extract only the value 100?

Code:

from urllib.request import Request, urlopen
import bs4
import re

url = 'url.com'
req = Request(url, headers={'User-Agent': 'Mozilla/5.0'})
page = urlopen(req).read()
soup = bs4.BeautifulSoup(page, 'html.parser')
data = soup.find('span', style=re.compile('font-size:90%'))
value = data.text

You can take the first element of the span's .contents, which here is the first <b> tag:

from bs4 import BeautifulSoup as soup
d = soup(page, 'html.parser').find('span', {'style':'font-size:90%'}).contents[0].text

Output:

'100'
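
To check this without downloading anything, here is a minimal sketch that parses the tag from the question as an inline string. The variable name html is an assumption standing in for the downloaded page, and the exact-match style lookup assumes the attribute value is exactly font-size:90%.

```python
from bs4 import BeautifulSoup

# Inline copy of the tag from the question; stands in for the downloaded page.
html = ('<span style="font-size:90%"><b>100</b> '
        '<cite style="color:#cc0000"><b>-0.10</b> (0.52%)</cite></span>')

span = BeautifulSoup(html, 'html.parser').find('span', {'style': 'font-size:90%'})

# The span has three children: the <b> tag, a whitespace string, and <cite>.
first_child = span.contents[0]
value = first_child.text
print(value)
```

Note that this depends on the markup layout: if there were a space or newline between the opening <span> and the <b>, contents[0] would be that whitespace string rather than the <b> tag.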

Just find the first <b> tag inside the span; its text is 100.

data = soup.find('span',style=re.compile('font-size:90%'))
value = data.find('b').text
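
A self-contained sketch of the same idea, run against an inline copy of the tag instead of a live page (the html string and the number variable are assumptions added for illustration). It relies on find() returning the first matching descendant, so the <b> inside <cite> is never reached.

```python
import re
from bs4 import BeautifulSoup

# Inline copy of the tag from the question; stands in for the downloaded page.
html = ('<span style="font-size:90%"><b>100</b> '
        '<cite style="color:#cc0000"><b>-0.10</b> (0.52%)</cite></span>')

soup = BeautifulSoup(html, 'html.parser')
data = soup.find('span', style=re.compile('font-size:90%'))

# find() returns the first <b> under the span, i.e. <b>100</b>.
value = data.find('b').get_text(strip=True)

# Convert to a number if the value is needed for arithmetic.
number = int(value)
print(value, number)
```

Unlike indexing into .contents, this does not depend on whitespace between the tags, which may make it the more robust choice if the page formatting changes.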


 