How to skip over identical tags in BeautifulSoup - Python

Question

I'm currently writing code for scrapers and am becoming more and more of a Python fan, especially of BeautifulSoup.

Still... while parsing HTML I ran into a tricky part that I could only handle in a not-so-pretty way.

I want to scrape some HTML, in particular the following snippet:

<div class="title-box">
    <h2>
        <span class="result-desc">
            Search results <strong>1</strong>-<strong>10</strong> out of <strong>10,009</strong> about <strong>paul mccartney</strong><a href="alert/settings" class="title-email-alert-promo x-title-alerts-promo">Create email Alert</a>
        </span>
    </h2>
</div>

So what I do is identify the div using:

comment = TopsySoup.find('div', attrs={'class' : 'title-box'})

Then comes the ugly part. To capture the number I want, 10,009, I use:

catcher = comment.strong.next.next.next.next.next.next.next

Can someone tell me whether there is a better way?

Answer 1

How about comment.find_all('strong')[2].text? It collects every strong tag inside the div and picks the third one by index.

In fact, this can be shortened to comment('strong')[2].text, because calling a Tag object like a function is the same as calling find_all on it.

>>> comment('strong')[2].text
u'10,009'
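
For anyone who wants to try this outside the interpreter, here is a minimal self-contained sketch of the same idea. It assumes BeautifulSoup 4 is installed as bs4 and uses Python's built-in html.parser; the names html, soup, strongs, total_text and total are only illustrative and are not from the original code. Under Python 3 the string is shown without the u prefix.

```python
from bs4 import BeautifulSoup

# The snippet from the question, inlined so the example runs on its own.
html = """
<div class="title-box">
    <h2>
        <span class="result-desc">
            Search results <strong>1</strong>-<strong>10</strong> out of <strong>10,009</strong> about <strong>paul mccartney</strong><a href="alert/settings" class="title-email-alert-promo x-title-alerts-promo">Create email Alert</a>
        </span>
    </h2>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
comment = soup.find("div", attrs={"class": "title-box"})

# find_all returns the matching tags in document order,
# so index 2 is the third <strong>, which holds the total.
strongs = comment.find_all("strong")
total_text = strongs[2].get_text()

# Calling the tag directly is shorthand for find_all.
assert comment("strong")[2].get_text() == total_text

# Optional: turn the text into an integer by dropping the thousands separator.
total = int(total_text.replace(",", ""))

print(total_text)
print(total)
```

Indexing by position still depends on the page always putting the total in the third strong tag, so it is worth checking len(strongs) before indexing if the markup might change.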

Solution 1
3 2013-05-23 13:16:05
