Extracting data between <br /> tags using BeautifulSoup

Question

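I have this HTML data, which I need to parse in order to extract data from it. It has so many tags that I find the data hard to navigate. From the HTML below, I need to create a Python list of dictionaries like this: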

[{"School": "Childs play"}, {"Place": "newyork"}, {"Level": "four"}, {"Country": "USA"}, {"Level of Course": "Easy"}]

<div class="quick">
 <strong>School</strong><br /> Childs play <br /><br />
 <strong>Place</strong><br />
 <a href="Search.aspx?Menu=new&amp;Me=">newyork</a><br /><br />
 <strong>Level</strong><br />four<br /><br />
 <strong>Country</strong><br />USA<br /><br />
 <strong>Level Of Course</strong><br />Easy<br /><br />
</div>

I tried using BeautifulSoup but had no success. Please help.

Answer 1

Unfortunately, this HTML is not structured ideally for parsing, but it is still possible to extract the data into a meaningful Python dictionary.

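from bs4 import BeautifulSoup  # BeautifulSoup 4; the old "from BeautifulSoup import ..." is BeautifulSoup 3

# htmlString holds the HTML snippet from the question
soup = BeautifulSoup(htmlString, "html.parser")

# All direct children of <div class="quick">: tags and text nodes
raw_data = soup.find("div", class_="quick").contents
# Drop the <br> tags, keeping text nodes and every other tag
data = [x for x in raw_data if not hasattr(x, "name") or not x.name == "br"]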

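The condition if not hasattr(x, "name") or not x.name == "br" first checks whether the item is a text node (a NavigableString, which has no tag name); if it is a tag, it then checks that the tag is not a <br>. Every item that passes either check is kept.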

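After that, data has the form [<KEY>, <VALUE>, <KEY>, <VALUE>], apart from any whitespace-only strings left between the tags, and extracting the values from it should be straightforward.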
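
The last step, turning the filtered items into the requested list of dictionaries, could be sketched as follows. This is a minimal sketch rather than the answerer's own code: it assumes BeautifulSoup 4 (the bs4 package) with the built-in html.parser, and the names html_string and extract_pairs are made up for illustration. It also discards whitespace-only strings and reads the text out of tags such as <a>, so that only labels and values remain.

```python
from bs4 import BeautifulSoup, NavigableString

# The HTML from the question (html_string is an illustrative name).
html_string = """<div class="quick">
 <strong>School</strong><br /> Childs play <br /><br />
 <strong>Place</strong><br />
 <a href="Search.aspx?Menu=new&amp;Me=">newyork</a><br /><br />
 <strong>Level</strong><br />four<br /><br />
 <strong>Country</strong><br />USA<br /><br />
 <strong>Level Of Course</strong><br />Easy<br /><br />
</div>"""


def extract_pairs(html):
    """Return a list of one-item dicts built from div.quick (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    items = []
    for node in soup.find("div", class_="quick").contents:
        if isinstance(node, NavigableString):
            text = node.strip()                # plain text between tags
        elif node.name == "br":
            continue                           # skip the <br> separators
        else:
            text = node.get_text(strip=True)   # <strong>, <a>, and similar tags
        if text:                               # ignore whitespace-only strings
            items.append(text)
    # items now alternates label, value, label, value, ...
    return [{key: value} for key, value in zip(items[0::2], items[1::2])]


result = extract_pairs(html_string)
print(result)
```

Pairing with zip relies on every label being followed by exactly one value; if a value could be missing in the real pages, the pairing would shift and a check on the <strong> tags would be safer.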


Answer 1
1 accepted 2012-04-18 07:59:38
