Using Beautiful Soup to get data from non-class section
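I'm still a beginner, learning Python and Beautiful Soup. I've been trying to figure out how to get text from HTML elements that have no class.
Here is the HTML snippet I'm working with: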
<section class="userbody">
<script type="text/javascript"></script>
<figure class="iw">
<div id="ci">
<img id="iwi" title="image 2" alt="" src="http://images.craigslist.org/00C0C_daJm4U9yU5B_600x450.jpg" style="min-width: inherit; min-height: 450px;"></img>
</div>
<div id="thumbs"></div>
</figure>
<div class="mapAndAttrs">
<div class="mapbox">
<div id="map" class="leaflet-container leaflet-fade-anim" data-longitude="-84.072447" data-latitude="33.908534" tabindex="0">
<div class="leaflet-map-pane" style="transform: translate(0px, 0px);"></div>
<div class="leaflet-control-container">
<div class="leaflet-top leaflet-left"></div>
<div class="leaflet-top leaflet-right"></div>
<div class="leaflet-bottom leaflet-left"></div>
<div class="leaflet-bottom leaflet-right">
<div class="leaflet-control-attribution leaflet-control"></div>
</div>
</div>
</div>
<div class="mapaddress">
Some Address
</div>
</div>
<div class="attributes"></div>
</div>
<section id="postingbody">
some posting info
<br></br>
more posting info
<br></br>
</section>
<section class="cltags"></section>
<div class="postinginfos"></div>
</section>
I have been able to extract the address information:
for address in soup.findAll("div", { "class" : "mapaddress" }):
    addressText = ''.join(address.findAll(text=True))
It seems findAll() does not work for the tags without a class that I have tried:
for post in soup.findall("section", { "id" : "postingbody" }):
    postText = ''.join(post.findAll(text=True))
How can I get the text inside the id="postingbody" section?
Assuming s is the HTML string, you can do the following:
from bs4 import BeautifulSoup

# s is assumed to hold the HTML shown in the question
soup = BeautifulSoup(s, "html.parser")
print(soup.find(attrs={'id': 'postingbody'}))
Output:
<section id="postingbody">
some posting info
<br/>
more posting info
<br/>
</section>
Adding to Games Brainiac's answer: to get just the text, append .text to the result.
So:
print(soup.find(attrs={'id' : 'postingbody'}).text)
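Since .text returns the text nodes with all their surrounding whitespace, it may help to see it next to a cleaned-up version. The following is only a minimal sketch: the html string is a trimmed copy of the snippet from the question, and the names section, raw_text, lines and posting_text are made up for illustration.

```python
from bs4 import BeautifulSoup

# Trimmed-down copy of the HTML from the question (assumed input).
html = """
<section class="userbody">
<section id="postingbody">
some posting info
<br></br>
more posting info
<br></br>
</section>
</section>
"""

soup = BeautifulSoup(html, "html.parser")
section = soup.find(attrs={"id": "postingbody"})

# .text concatenates every text node inside the tag, newlines included.
raw_text = section.text

# stripped_strings yields each text fragment with the surrounding
# whitespace removed and skips fragments that are only whitespace.
lines = list(section.stripped_strings)
posting_text = "\n".join(lines)
print(posting_text)
```

If find() does not match anything it returns None, so in real scraping code it is worth checking the result before calling .text on it.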
If you are using BeautifulSoup 4, you can do this:
element = soup.find(id="postingbody")
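To round this off, here is a small sketch of a few equivalent ways to reach the same element in BeautifulSoup 4, again using a trimmed copy of the question's HTML as an assumed input; the variable names are hypothetical. One likely reason the loop in the question failed is the lowercase findall: as far as I know Beautiful Soup only provides findAll and find_all, so the id filter itself was probably not the problem.

```python
from bs4 import BeautifulSoup

# Trimmed-down copy of the HTML from the question (assumed input).
html = """
<div class="mapaddress">
Some Address
</div>
<section id="postingbody">
some posting info
<br></br>
more posting info
<br></br>
</section>
"""

soup = BeautifulSoup(html, "html.parser")

# Keyword-argument form: any attribute name can be used as a filter.
by_keyword = soup.find(id="postingbody")

# CSS selector form: "#postingbody" matches the element with that id.
by_css = soup.select_one("#postingbody")

# find_all (the snake_case spelling of findAll) accepts the same filters
# and returns a list, so it works with a for loop as in the question.
matches = soup.find_all("section", id="postingbody")

# get_text with a separator and strip=True joins the stripped fragments.
post_text = by_keyword.get_text(" ", strip=True)
print(post_text)
```

An id is supposed to be unique within a page, so find() or select_one() is usually enough; find_all() is mainly useful when filtering by class or tag name.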