Using BeautifulSoup to extract parts of string

Question

I'm using Beautiful Soup 4 to parse an HTML document and extract data.

I'd like to get the time values from this tag:

<span style="font-size:9.0pt;font-family:Arial;color:#666666"> 20 min <b>Start time: </b> 10 min <b>Other time: </b> 0 min</span>

i.e. 20 min, 10 min

Answer 1

Does this help? It walks the children of the span and tells the <b> tags apart from the bare text between them:

from bs4 import BeautifulSoup, Tag

html = '<span style="font-size:9.0pt;font-family:Arial;color:#666666"> 20 min <b>Start time: </b> 10 min <b>Other time: </b> 0 min</span>'
# Name the parser explicitly so bs4 does not have to guess one
soup = BeautifulSoup(html, "html.parser")
span = soup.find('span')
# span.contents mixes child tags (<b>) with bare text nodes
for e in span.contents:
    if isinstance(e, Tag):
        print("found a tag:", e.name)
    else:
        print("found text:", e)

Output:

found text:  20 min
found a tag: b
found text:  10 min
found a tag: b
found text:  0 min

Answer 2

One way to do it is to take the text of the whole span and split it on the labels:

from bs4 import BeautifulSoup

ss = """<span style="font-size:9.0pt;font-family:Arial;color:#666666"> 20 min <b>Start time:     </b> 10 min <b>Other time: </b> 0 min</span>"""
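# Name the parser explicitly so bs4 does not have to guess one
soup = BeautifulSoup(ss, "html.parser")
# .text joins all text inside the span, including the <b> labels
timetext = soup.span.text
# Take what follows "Start time:", cut at the next "min", and trim whitespace
start_time = timetext.split("Start time:")[1].split("min")[0].strip()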

I've left the extraction of other_time as an exercise for you!
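If you want all the values at once, the same split idea can be generalised. This is only a rough sketch, not part of the answer above: the helper name extract_times and the "unlabelled" key are made up for illustration, the timetext literal is assumed to match what soup.span.text returns for the span in the question, and it assumes no label contains the word "min".

```python
# Assumed to match what soup.span.text returns for the span in the question.
timetext = " 20 min Start time:      10 min Other time:  0 min"


def extract_times(text):
    """Map each label to its time value (hypothetical helper)."""
    times = {}
    # Every value ends with "min", so splitting there gives one chunk per value.
    for chunk in text.split("min"):
        # The label, if any, is whatever precedes the last colon in the chunk.
        label, _, value = chunk.rpartition(":")
        value = value.strip()
        if value:  # skip the empty chunk after the final "min"
            # "unlabelled" is a placeholder key for the first value, which has no label.
            times[label.strip() or "unlabelled"] = value + " min"
    return times


times = extract_times(timetext)
print(times["unlabelled"])   # 20 min
print(times["Start time"])   # 10 min
print(times["Other time"])   # 0 min
```

The values asked for in the question are then times["unlabelled"] and times["Start time"].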

2 answers

solution1
2 ACCEPTED 2014-12-12 11:04:07

solution2
0 2014-12-12 10:59:32
