Beautiful Soup findAll didn't find all of them
I'm using Calibre to make a recipe for a website.
The page source is:
<section>
<h1 class="fly-title">Leaders</h1>
<article>
<h2><a href="/node/21537908" class="package-link">Democracy and its enemies</a></h2>
<a href="/node/21537908"><img src="http://media.economist.com/sites/default/files/imagecache/news_package_primary_landscape/20120123_LDC001_0.gif" alt="" title="" class="imagecache imagecache-news_package_primary_landscape" width="412" height="232" /></a>
<p>
In the coming year the people who run the world will change—and so could the ideas, predicts John Micklethwait <a href="/node/21537908/comments#comments" title="Comments" class="comment-icon"><span>(0)</span></a> </p>
</article>
<ul class="package-item"><li class="first"><div class="">
<a href="/node/21537909" class="package-link">The year of self-induced stagnation</a> <a href="/node/21537909/comments#comments" title="Comments" class="comment-icon"><span>(7)</span></a></div>
</li>
<li class="even"><div class="">
<a href="/node/21537914" class="package-link">How to run the euro?</a> <a href="/node/21537914/comments#comments" title="Comments" class="comment-icon"><span>(2)</span></a></div>
</li>
<li class=""><div class="">
<a href="/node/21537916" class="package-link">Wanted: a fantasy American president</a> <a href="/node/21537916/comments#comments" title="Comments" class="comment-icon"><span>(0)</span></a></div>
</li>
<li class="even"><div class="">
<a href="/node/21537917" class="package-link">Poking goes public</a> <a href="/node/21537917/comments#comments" title="Comments" class="comment-icon"><span>(7)</span></a></div>
</li>
<li class=""><div class="">
<a href="/node/21537918" class="package-link">Varied company</a> <a href="/node/21537918/comments#comments" title="Comments" class="comment-icon"><span>(0)</span></a></div>
</li>
<li class="even"><div class="">
<a href="/node/21537919" class="package-link">All eyes on London</a> <a href="/node/21537919/comments#comments" title="Comments" class="comment-icon"><span>(0)</span></a></div>
</li>
<li class="last"><div class="">
<a href="/node/21537921" class="package-link">And now for some non-events</a> <a href="/node/21537921/comments#comments" title="Comments" class="comment-icon"><span>(2)</span></a></div>
</li>
</ul>
</section>
I want to find every <a href="/node/********" class="package-link"> element, so I used Beautiful Soup:
for section in soup.findAll('section'):
    ...
    for post in section.findAll('a', attrs={'class':['package-link']}):
But only the first one was found, namely the one inside <h2><a href="/node/21537908" class="package-link">Democracy and its enemies</a></h2>.
How can I find them all?
Works for me:
import BeautifulSoup  # Beautiful Soup 3 module

# xml holds the page source shown in the question
soup = BeautifulSoup.BeautifulSoup(xml)
for section in soup.findAll("section"):
    for post in section.findAll('a', attrs={'class': ['package-link']}):
        print(post)
results in:
<a href="/node/21537908" class="package-link">Democracy and its enemies</a>
<a href="/node/21537909" class="package-link">The year of self-induced stagnation</a>
<a href="/node/21537914" class="package-link">How to run the euro?</a>
<a href="/node/21537916" class="package-link">Wanted: a fantasy American president</a>
<a href="/node/21537917" class="package-link">Poking goes public</a>
<a href="/node/21537918" class="package-link">Varied company</a>
<a href="/node/21537919" class="package-link">All eyes on London</a>
<a href="/node/21537921" class="package-link">And now for some non-events</a>
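If a per-section search returns fewer links than the page contains, one possible cause is that the parser built a different tree than the markup suggests, so some links end up outside the <section> element. A way to cross-check is to count the links without relying on nesting at all. The sketch below does that with the standard library's html.parser instead of Beautiful Soup, assuming Python 3; the class name PackageLinkCollector and the shortened SAMPLE_HTML are made up for illustration.

```python
from html.parser import HTMLParser


class PackageLinkCollector(HTMLParser):
    """Collect href values of <a> start tags whose class list has 'package-link'."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # Start tags arrive as a flat stream, so nesting does not matter here.
        if tag != "a":
            return
        attributes = dict(attrs)
        classes = (attributes.get("class") or "").split()
        if "package-link" in classes:
            self.hrefs.append(attributes.get("href"))


# Shortened, hypothetical stand-in for the page source in the question.
SAMPLE_HTML = """
<section>
<h1 class="fly-title">Leaders</h1>
<article>
<h2><a href="/node/21537908" class="package-link">Democracy and its enemies</a></h2>
</article>
<ul class="package-item"><li class="first"><div class="">
<a href="/node/21537909" class="package-link">The year of self-induced stagnation</a> <a href="/node/21537909/comments#comments" title="Comments" class="comment-icon"><span>(7)</span></a></div>
</li>
</ul>
</section>
"""

collector = PackageLinkCollector()
collector.feed(SAMPLE_HTML)
collector.close()
print(collector.hrefs)
```

If this count matches the page but the per-section findAll does not, the difference most likely lies in how the tree was built rather than in the search arguments.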
Edit
Versions I use:
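When two people get different results from the same snippet, it helps to know which interpreter and parser release each side is running. A minimal sketch for printing them, assuming the library is importable as either bs4 (Beautiful Soup 4) or BeautifulSoup (the version 3 module used above); the helper name report_versions is made up for illustration.

```python
import importlib
import platform


def report_versions(module_names=("bs4", "BeautifulSoup")):
    """Return a dict of the Python version and each module's __version__.

    Modules that cannot be imported are reported as None.
    """
    versions = {"python": platform.python_version()}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
            continue
        # Fall back to "unknown" if the module does not expose __version__.
        versions[name] = getattr(module, "__version__", "unknown")
    return versions


for name, version in sorted(report_versions().items()):
    print("%s: %s" % (name, version))
```

Inside Calibre the bundled copy of the library may differ from the system one, so the sketch would need to run in the same environment as the recipe.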