
How to find the count of tags in HTML

I have HTML that looks like this:

<h3>First</h3>
<ol>
    <li>
        First list

        <ol>
            <li>First Sub list</li>
        </ol>
    </li>
</ol>
<h3>Second</h3>
<ol>
    <li>
        second list
        <ol>
            <li>First Sub list</li>
            <li>second sub list</li>
        </ol>
    </li>
    <li>
        second in second list
        <ol>
            <li>First Sub list</li>
            <li>second sub list</li>
        </ol>
    </li>
</ol>

I need to get the count of the li tags that appear directly within the ol tag after each h3, not counting the li tags of nested lists. So the result should be 1 and 2.

I am using this statement to calculate the count:

print(len(h3_soup1.find_next("ol").find_all("li")))

But this gives me the count of all the li tags within the ol tag, including the nested ones. For instance, for the first one it says 2 and for the second it says 6.

Edit:

[Image: the tags to be counted]

For the first ol the output should be 1. For the second ol it should be 2. So there should be one printed count for each ol.

Edit:

The final goal is to find the number of these li tags and, if that number is greater than a certain limit, remove the last tags. For instance, if the number shouldn't be greater than 1, then the lists will become:

<h3>First</h3>
<ol>
    <li>
        First list

        <ol>
            <li>First Sub list</li>
        </ol>
    </li>
</ol>
<h3>Second</h3>
<ol>
    <li>
        second list
        <ol>
            <li>First Sub list</li>
            <li>second sub list</li>
        </ol>
    </li>

</ol>

Filter ol.children, which yields only the direct children of each ol:

from bs4 import BeautifulSoup

data = '''\
<h3>First</h3>
<ol>
    <li>
        First list

        <ol>
            <li>First Sub list</li>
        </ol>
    </li>
</ol>
<h3>Second</h3>
<ol>
    <li>
        second list
        <ol>
            <li>First Sub list</li>
            <li>second sub list</li>
        </ol>
    </li>
    <li>
        second in second list
        <ol>
            <li>First Sub list</li>
            <li>second sub list</li>
        </ol>
    </li>
</ol>
'''

soup = BeautifulSoup(data, 'html.parser')

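# 'h3 + ol' matches only an ol that immediately follows an h3,
# so the nested ol elements are never selected.
for ol in soup.select('h3 + ol'):
    # ol.children yields direct children only; the type check drops
    # the whitespace text nodes, leaving just the top-level li tags.
    print(
        len([e for e in ol.children
             if type(e) == type(ol) and e.name == 'li']))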
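
The answer above covers the counting but not the trimming mentioned in the second edit. As a minimal sketch, and not necessarily what the answerer had in mind, the same count can be obtained with find_all('li', recursive=False), and the surplus top-level li tags can then be dropped with decompose(). The helper names direct_li and trim_lists and the shortened sample markup are assumptions made for illustration only.

```python
from bs4 import BeautifulSoup

# Shortened version of the question's markup (same structure).
html = '''\
<h3>First</h3>
<ol>
    <li>First list
        <ol><li>First Sub list</li></ol>
    </li>
</ol>
<h3>Second</h3>
<ol>
    <li>second list
        <ol><li>First Sub list</li><li>second sub list</li></ol>
    </li>
    <li>second in second list
        <ol><li>First Sub list</li><li>second sub list</li></ol>
    </li>
</ol>
'''


def direct_li(ol):
    """Return only the li tags that are direct children of ol (hypothetical helper)."""
    # recursive=False stops find_all from descending into nested lists.
    return ol.find_all('li', recursive=False)


def trim_lists(soup, max_items):
    """Remove top-level li tags beyond max_items from every ol that follows an h3."""
    for ol in soup.select('h3 + ol'):
        for extra in direct_li(ol)[max_items:]:
            # decompose() removes the tag and everything inside it from the tree.
            extra.decompose()
    return soup


soup = BeautifulSoup(html, 'html.parser')

counts = [len(direct_li(ol)) for ol in soup.select('h3 + ol')]
print(counts)

trim_lists(soup, 1)
trimmed_counts = [len(direct_li(ol)) for ol in soup.select('h3 + ol')]
print(trimmed_counts)
```

Slicing the list of direct children before removing anything keeps the loop from modifying the collection it iterates over. The nested sub lists are left untouched, which matches the expected output shown in the question.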
