I'm using Beautiful Soup 4 to parse an HTML document and extract data.
I'd like to get the time values from this tag:
<span style="font-size:9.0pt;font-family:Arial;color:#666666"> 20 min <b>Start time: </b> 10 min <b>Other time: </b> 0 min</span>
i.e. 20 min and 10 min.
Does this help?
from bs4 import BeautifulSoup, Tag

soup = BeautifulSoup("<span style=\"font-size:9.0pt;font-family:Arial;color:#666666\"> 20 min <b>Start time: </b> 10 min <b>Other time: </b> 0 min</span>", "html.parser")
span = soup.find('span')
# Walk the span's direct children: text nodes and <b> tags alternate.
for e in span.contents:
    if isinstance(e, Tag):
        print("found a tag:", e.name)
    else:
        print("found text:", e)
Output:
found text: 20 min
found a tag: b
found text: 10 min
found a tag: b
found text: 0 min
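Building on that loop, here is a minimal sketch that keeps only the text nodes and strips the surrounding whitespace. It assumes bs4 with the built-in html.parser; the names html and times are just illustrative.

```python
from bs4 import BeautifulSoup, NavigableString

html = ('<span style="font-size:9.0pt;font-family:Arial;color:#666666"> 20 min '
        '<b>Start time: </b> 10 min <b>Other time: </b> 0 min</span>')

soup = BeautifulSoup(html, "html.parser")
span = soup.find("span")

# Keep only the span's direct text children, skipping the <b> tags,
# and drop any node that is empty after stripping whitespace.
times = [e.strip() for e in span.contents
         if isinstance(e, NavigableString) and e.strip()]

print(times)  # ['20 min', '10 min', '0 min']
```

The first two entries are the values asked for in the question. This relies on the values being direct children of the span, so it would need adjusting if the markup nests them deeper.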
This is how it should be done:
from bs4 import BeautifulSoup
ss = """<span style="font-size:9.0pt;font-family:Arial;color:#666666"> 20 min <b>Start time: </b> 10 min <b>Other time: </b> 0 min</span>"""
soup = BeautifulSoup(ss, "html.parser")
timetext = soup.span.text
start_time = timetext.split("Start time:")[1].split("min")[0].strip()
I've left the extraction of other_time as an exercise for you!
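If chaining split calls feels brittle, a regular expression is another option. This is only a sketch, assuming every value in the span has the form "<number> min"; the pattern and the name minutes are my own choices, not anything Beautiful Soup provides.

```python
import re
from bs4 import BeautifulSoup

ss = """<span style="font-size:9.0pt;font-family:Arial;color:#666666"> 20 min <b>Start time: </b> 10 min <b>Other time: </b> 0 min</span>"""

soup = BeautifulSoup(ss, "html.parser")
timetext = soup.span.get_text()

# Capture each run of digits that is followed by "min", in document order.
minutes = [int(m) for m in re.findall(r"(\d+)\s*min", timetext)]

print(minutes)  # [20, 10, 0]
```

Note that this returns the values by position only, so you lose the link to the "Start time:" and "Other time:" labels that the split approach keeps.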