Using BeautifulSoup to extract parts of a string

I am using Beautiful Soup 4 to parse an HTML document and extract data.

I would like to get the time values from this tag:

<span style="font-size:9.0pt;font-family:Arial;color:#666666"> 20 min <b>Start time: </b> 10 min <b>Other time: </b> 0 min</span>

i.e. 20 min and 10 min

Does this help?

from bs4 import BeautifulSoup
from bs4.element import Tag

# Parse with the built-in parser so no extra dependency is needed.
soup = BeautifulSoup("<span style=\"font-size:9.0pt;font-family:Arial;color:#666666\"> 20 min <b>Start time: </b> 10 min <b>Other time: </b> 0 min</span>", "html.parser")
span = soup.find('span')
# span.contents mixes Tag objects (the <b> elements) with plain text nodes.
for e in span.contents:
    if isinstance(e, Tag):
        print("found a tag:", e.name)
    else:
        print("found text:", e)

Output:

found text:  20 min
found a tag: b
found text:  10 min
found a tag: b
found text:  0 min
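
Building on the loop above, one possible way to turn those nodes into usable values is to pair each <b> label with the text that follows it. This is only a sketch: the function name labelled_times and the "Duration" key for the unlabelled first value are hypothetical names chosen for illustration, not part of the original question or of Beautiful Soup.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

html = ('<span style="font-size:9.0pt;font-family:Arial;color:#666666"> 20 min '
        '<b>Start time: </b> 10 min <b>Other time: </b> 0 min</span>')

def labelled_times(markup):
    """Map each <b> label in the first <span> to the text that follows it."""
    span = BeautifulSoup(markup, "html.parser").find("span")
    result = {}
    label = "Duration"  # assumed name for the leading value, which has no <b> label
    for node in span.contents:
        if isinstance(node, Tag):
            # A <b> element: remember its text (minus the colon) as the current label.
            label = node.get_text().strip().rstrip(":")
        else:
            # A text node: store it under the most recent label if it is non-empty.
            text = str(node).strip()
            if text:
                result[label] = text
    return result

print(labelled_times(html))
# {'Duration': '20 min', 'Start time': '10 min', 'Other time': '0 min'}
```

Keying the values by label means the lookup does not depend on the order of the <b> elements, at the cost of assuming every value is preceded by its label.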

This should do it:

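from bs4 import BeautifulSoup

ss = """<span style="font-size:9.0pt;font-family:Arial;color:#666666"> 20 min <b>Start time:     </b> 10 min <b>Other time: </b> 0 min</span>"""
# Name the parser explicitly to avoid the "no parser was explicitly specified" warning.
soup = BeautifulSoup(ss, "html.parser")
# .text joins all text inside the span, including the text of the <b> labels.
timetext = soup.span.text
# Take what follows the label, cut it at the next "min", and trim the whitespace.
start_time = timetext.split("Start time:")[1].split("min")[0].strip()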

I've left extracting other_time as an exercise!
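
If chaining split calls feels brittle, a regular expression is one alternative for pulling a number out of the span's text. The sketch below is not part of the original answer: minutes_after is a hypothetical helper, and the timetext literal is assumed to have the same shape as the string that soup.span.text returns for the span above.

```python
import re

# Assumed to have the shape of soup.span.text for the span in the question.
timetext = " 20 min Start time:  10 min Other time:  0 min"

def minutes_after(text, label):
    """Return the integer minutes that follow `label`, or None if the label is absent."""
    # re.escape keeps any special characters in the label literal;
    # \s* tolerates any amount of whitespace around the number.
    match = re.search(re.escape(label) + r"\s*(\d+)\s*min", text)
    return int(match.group(1)) if match else None

print(minutes_after(timetext, "Start time:"))  # 10
print(minutes_after(timetext, "Other time:"))  # 0
```

Returning None for a missing label avoids the IndexError that the split-based version raises when "Start time:" does not appear in the text.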
