How to find count of tags in HTML
I have HTML that looks like this:
<h3>First</h3>
<ol>
<li>
First list
<ol>
<li>First Sub list</li>
</ol>
</li>
</ol>
<h3>Second</h3>
<ol>
<li>
second list
<ol>
<li>First Sub list</li>
<li>second sub list</li>
</ol>
</li>
<li>
second in second list
<ol>
<li>First Sub list</li>
<li>second sub list</li>
</ol>
</li>
</ol>
I need to get the count of li tags that appear directly within the ol tag after each h3. So the result will be 1 and 2.
I am using this statement to calculate the count:
print(len(h3_soup1.find_next("ol").find_all("li")))
But this gives me the count of all the li tags within the ol tag, including the nested ones. For instance, for the first one it says 2 and for the second it says 6.
Edit:
For the first ol the output should be 1. For the second ol it should be 2. So print a count for each ol.
Edit:
The final goal is to find the number of these li tags. If this number is greater than a certain limit, then remove the last tags. For instance, if the number shouldn't be greater than 1, then the second list will become:
<h3>First</h3>
<ol>
<li>
First list
<ol>
<li>First Sub list</li>
</ol>
</li>
</ol>
<h3>Second</h3>
<ol>
<li>
second list
<ol>
<li>First Sub list</li>
<li>second sub list</li>
</ol>
</li>
</ol>
Filter ol.children so that only the li tags that are direct children of each ol are counted.
from bs4 import BeautifulSoup
data = '''\
<h3>First</h3>
<ol>
<li>
First list
<ol>
<li>First Sub list</li>
</ol>
</li>
</ol>
<h3>Second</h3>
<ol>
<li>
second list
<ol>
<li>First Sub list</li>
<li>second sub list</li>
</ol>
</li>
<li>
second in second list
<ol>
<li>First Sub list</li>
<li>second sub list</li>
</ol>
</li>
</ol>
'''
soup = BeautifulSoup(data, 'html.parser')
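# 'h3 + ol' selects each <ol> that immediately follows an <h3>.
for ol in soup.select('h3 + ol'):
    # Count only direct children that are <li> tags, skipping text nodes
    # and the <li> tags inside nested lists.
    print(
        len([e for e in ol.children
             if type(e) == type(ol) and e.name == 'li']))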
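The question's final goal (dropping the extra li tags once the count exceeds a limit) is not covered by the snippet above. Below is a minimal sketch of one way to do it, assuming that find_all with recursive=False is an acceptable alternative to filtering ol.children and that the removed items should disappear together with their nested lists. The names MAX_ITEMS and top_level_items are made up for this example, and the limit of 1 is taken from the example in the question.

```python
from bs4 import BeautifulSoup

html = """
<h3>First</h3>
<ol>
  <li>First list
    <ol><li>First Sub list</li></ol>
  </li>
</ol>
<h3>Second</h3>
<ol>
  <li>second list
    <ol><li>First Sub list</li><li>second sub list</li></ol>
  </li>
  <li>second in second list
    <ol><li>First Sub list</li><li>second sub list</li></ol>
  </li>
</ol>
"""

MAX_ITEMS = 1  # hypothetical limit, taken from the example in the question


def top_level_items(ol):
    """Return only the <li> tags that are direct children of this <ol>."""
    return ol.find_all("li", recursive=False)


soup = BeautifulSoup(html, "html.parser")

# The first <ol> sibling after each <h3>; nested lists are not siblings of <h3>.
lists = [h3.find_next_sibling("ol") for h3 in soup.find_all("h3")]

counts_before = [len(top_level_items(ol)) for ol in lists]

for ol in lists:
    # Remove every direct <li> beyond the limit, nested lists included.
    for li in top_level_items(ol)[MAX_ITEMS:]:
        li.decompose()

counts_after = [len(top_level_items(ol)) for ol in lists]

print(counts_before)  # [1, 2]
print(counts_after)   # [1, 1]
```

If the trimmed items need to be kept for later use, extract() can be used in place of decompose(), since it removes the tag from the tree and returns it.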