Beautiful Soup: scrape text from second child in same div class
I have two <tr> tags within the same div class. The first tr tag prints its text just fine. I am trying to access the second tr tag within the container, but I can't get it to work. Also note that not all containers have a second <tr> tag, so I need an if statement to check whether it exists first and, if it does, print its text. Thanks!
page_soup = soup(page_html, "html.parser")
containers = page_soup.findAll("div",{"class":"right"})
for container in containers:
    print(container.span.text)
    print(container.tr.text)
    if container.nextSiblings('tr')[1]:
        print(container.nextSiblings('tr')[1].text)
You can locate all the tr elements within a container and check how many of them you have:
for container in containers:
    trs = container("tr")  # same as container.find_all("tr")
    if len(trs) > 1:
        print(trs[1].get_text())
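As a self-contained illustration of the length check above, here is a minimal sketch. The sample markup and the helper name `second_row_text` are made up for this example and are not from the question; it also assumes the page is parsed with "html.parser", which keeps the <tr> tags inside the div.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup: one container with two rows, one with a single row.
html = """
<div class="right"><span>first</span><tr>row 1</tr><tr>row 2</tr></div>
<div class="right"><span>second</span><tr>row 1</tr></div>
"""


def second_row_text(container):
    """Return the text of the second <tr> in the container, or None if there is none."""
    trs = container.find_all("tr")
    return trs[1].get_text() if len(trs) > 1 else None


page_soup = BeautifulSoup(html, "html.parser")
results = [second_row_text(c) for c in page_soup.find_all("div", {"class": "right"})]
print(results)
```

Returning None for containers without a second row lets the caller decide what to print, instead of raising an IndexError.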
You can also locate the second tr within every container directly with a single CSS selector:
for tr in soup.select(".right > tr:nth-of-type(2)"):
    print(tr.get_text())
Demo:
from bs4 import BeautifulSoup
data = """
<body>
<div class="right">
<tr>container 1 row 1</tr>
<tr>container 1 row 2</tr>
</div>
<div class="right">
<tr>container 2 row 1</tr>
</div>
<div class="right">
<tr>container 3 row 1</tr>
<tr>container 3 row 2</tr>
<tr>container 3 row 3</tr>
</div>
</body>
"""
soup = BeautifulSoup(data, "html.parser")
for tr in soup.select(".right > tr:nth-of-type(2)"):
    print(tr.get_text())
would print:
container 1 row 2
container 3 row 2
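If you still need to work container by container (for example, to print other fields from the same container first), one possible variation is to call select_one on each container and test the result against None. This is only a sketch on made-up markup; it assumes the rows are not nested inside other tr elements, since the selector here matches descendants rather than direct children only.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, reduced to two containers.
html = """
<div class="right"><tr>container 1 row 1</tr><tr>container 1 row 2</tr></div>
<div class="right"><tr>container 2 row 1</tr></div>
"""

soup = BeautifulSoup(html, "html.parser")

found = []
for container in soup.select("div.right"):
    # select_one returns None when the container has no second <tr>
    second = container.select_one("tr:nth-of-type(2)")
    if second is not None:
        found.append(second.get_text())

print(found)
```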