Python beautifulsoup print by line #
Okay, so I'm currently using Python BeautifulSoup to output a specific line from an HTML file. Since the HTML contains several divs with the same class, it outputs every div with that class. For example:
CONTENT:
<div class=border>aaaa</div>
<div class=border>example</div>
<div class=border>runrunrun</div>
OUTPUT:
<div class=border>aaaa</div>
<div class=border>example</div>
<div class=border>runrunrun</div>
Now I only want #2 of div class border:
<div class=border>example</div>
If I view the source in Chrome, it shows the content with numbered lines, so line 1 contains
<div class=border>aaaa</div>
and line 2 contains
<div class=border>example</div>
Is it possible to output by line number using Beautiful Soup?
find_all returns a list, so you can index it with [1] to get the second element.
from bs4 import BeautifulSoup

html_doc = """<div class=border>aaaa</div>
<div class=border>example</div>
<div class=border>runrunrun</div>"""
soup = BeautifulSoup(html_doc, 'html.parser')
# Indexing is zero-based, so [1] is the second matching div
soup.find_all(class_="border")[1]
returns
<div class="border">example</div>
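If what you want is literally line N as Chrome's view-source numbers it, a plain-text lookup may be enough, with no parsing at all. This is only a minimal sketch: it assumes every div sits on its own line in the file, and nth_line is a hypothetical helper name, not part of Beautiful Soup.

```python
html_doc = """<div class=border>aaaa</div>
<div class=border>example</div>
<div class=border>runrunrun</div>"""


def nth_line(text, n):
    """Return line n of text, counting from 1 like view-source does."""
    lines = text.splitlines()
    if not 1 <= n <= len(lines):
        raise IndexError(f"line {n} is out of range (1..{len(lines)})")
    return lines[n - 1]


# Line 2 of the sample document
print(nth_line(html_doc, 2))
```

Keep in mind this depends on how the file is formatted. If the markup is ever minified or re-wrapped, indexing the result of find_all is the more robust choice.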
If you have a list of, say, 200 elements generated by soup.find_all and it is called div_list, you can just loop over the indices you want (1, 4, 7, and so on):
count = 1
while True:
    try:
        print(div_list[count])
        count += 3
    except IndexError:
        # ran past the end of the list
        break
Or even shorter:
count = 1
# use < rather than <=: div_list[len(div_list)] would raise IndexError
while count < len(div_list):
    print(div_list[count])
    count += 3
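The same "every third item starting at index 1" selection can probably be written as a slice, which avoids the manual counter. In this sketch div_list is a stand-in list of strings rather than a real find_all result. As far as I know, the ResultSet returned by find_all is a list subclass, so slicing should behave the same way there.

```python
# Stand-in for the result of soup.find_all(...); any list slices the same way.
div_list = [f"<div class=border>item {i}</div>" for i in range(10)]

# Start at index 1 and take every third element: indices 1, 4, 7.
selected = div_list[1::3]

for div in selected:
    print(div)
```

A slice never raises IndexError, so an empty or very short list simply yields fewer (or zero) items.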