Python BeautifulSoup: print by line number

Question

I'm currently using Python BeautifulSoup to output a specific line from an HTML file. Because the HTML contains several divs with the same class, it outputs every div with that class. For example:

CONTENT:

<div class=border>aaaa</div>
<div class=border>example</div>
<div class=border>runrunrun</div>

OUTPUT:

<div class=border>aaaa</div>
<div class=border>example</div>
<div class=border>runrunrun</div>

Now I only want the second div with class border:

<div class=border>example</div>

If I view the source in Chrome, it shows the content with line numbers, so line 1 contains

<div class=border>aaaa</div>

and line 2 contains

<div class=border>example</div>

Is it possible to output by line number using Beautiful Soup?

Answer 1

find_all returns a list, so you can index it with [1] to get the second element (indices start at 0).

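from bs4 import BeautifulSoup

# Sample document: three divs that share the class "border"
html_doc = """<div class=border>aaaa</div>
<div class=border>example</div>
<div class=border>runrunrun</div>"""

soup = BeautifulSoup(html_doc, 'html.parser')

# find_all returns the matches in document order; index 1 is the second one
print(soup.find_all(class_="border")[1])

prints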

<div class="border">example</div>
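If you really do want to select by source line rather than by position in the result list, here is a minimal sketch. It assumes Beautiful Soup 4.8.1 or later, where the html.parser and html5lib builders record each tag's line as Tag.sourceline. The helper name tags_on_line is made up for this example, and the line numbers refer to the string passed to BeautifulSoup, which matches Chrome's view-source only if that is the same text.

```python
from bs4 import BeautifulSoup

html_doc = """<div class=border>aaaa</div>
<div class=border>example</div>
<div class=border>runrunrun</div>"""

soup = BeautifulSoup(html_doc, "html.parser")


def tags_on_line(soup, line_number, **filters):
    """Hypothetical helper: matching tags whose start tag is on the given 1-based line."""
    return [
        tag
        for tag in soup.find_all(**filters)
        if tag.sourceline == line_number  # sourceline is set by html.parser / html5lib
    ]


line_two = tags_on_line(soup, 2, class_="border")
print(line_two)
```

Indexing with [1] is simpler when you just want the second match; the line-based lookup only helps when the line number is the thing you actually know.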

Answer 2

If soup.find_all gives you a list of, say, 200 elements and you call it div_list, you can loop over just the indices you want (1, 4, 7, and so on):

count = 1
while True:
    try:
        print(div_list[count])
        count += 3
    except IndexError:
        # count has run past the end of div_list
        break

Or even shorter:

count = 1
while count < len(div_list):  # valid indices stop at len(div_list) - 1
    print(div_list[count])
    count += 3
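The same every-third-element selection can probably be written more compactly with a slice. This is only a sketch: div_list here is a stand-in list of strings so the snippet runs on its own, whereas in practice it would be the result of soup.find_all.

```python
# Stand-in for the result of soup.find_all(...); plain strings keep this runnable
div_list = [f'<div class="border">item {i}</div>' for i in range(10)]

# Start at index 1 and take every third element: indices 1, 4, 7, ...
every_third = div_list[1::3]

for div in every_third:
    print(div)
```

A slice never raises IndexError, so no bounds check or try/except is needed.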

2 answers

Answer 1
0 2017-08-11 06:30:40

Answer 2
0 2017-08-11 12:07:34
