Get the content of multiple classes when scraping a website

The problem I am facing is simple. I am trying to get some data from a website where two elements share the same class name, but each of them contains a table with different information. My code only outputs the content of the first one. It looks like this:

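import requests
from bs4 import BeautifulSoup

page = requests.get(url)  # url: the page being scraped (not shown in the question)
soup = BeautifulSoup(page.content, 'html.parser')
results = soup.find("tr", {"class": "table3"})  # find() returns only the first match
print(results.prettify())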

How can I get the code to output either the content of both tables or only the content of the second one? Thanks in advance for your answers!

You can use .find_all(), which returns every match, and index the result with [1] to get the second one. Example:

from bs4 import BeautifulSoup

txt = """
<tr class="table3"> I don't want this </tr>
<tr class="table3"> I want this! </tr>
"""

soup = BeautifulSoup(txt, "html.parser")

results = soup.find_all("tr", class_="table3")
print(results[1])  # <-- get only second one

Prints:

<tr class="table3"> I want this! </tr>
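To cover the other half of the question (getting the content of both tables), the list returned by .find_all() can simply be iterated. The following is a minimal sketch, not the asker's actual page: the HTML snippet and the variable names (html, rows, texts, second) are made up for illustration, and the length check is just one way to avoid an IndexError when fewer than two matches exist.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page.
html = """
<table>
<tr class="table3"><td>first</td></tr>
<tr class="table3"><td>second</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all() returns every matching element, in document order.
rows = soup.find_all("tr", class_="table3")

# Collect the text of each match.
texts = [row.get_text(strip=True) for row in rows]

# Pick the second match only if it exists.
second = texts[1] if len(texts) > 1 else None

print(texts)
print(second)
```

Guarding the index matters on real pages, where the number of matching elements can change between requests.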

Can I assign a list of classes to class_ to get elements belonging to different classes in the same list?
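As far as I know, Beautiful Soup attribute filters accept a list, in which case an element matches if it has any of the listed values. The sketch below illustrates this with class_; the markup and the class names alpha, beta and gamma are invented for the example and do not come from the original page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup with three different classes.
html = """
<p class="alpha">one</p>
<p class="beta">two</p>
<p class="gamma">three</p>
"""

soup = BeautifulSoup(html, "html.parser")

# Passing a list to class_ matches elements having any of these classes.
matches = soup.find_all("p", class_=["alpha", "beta"])

match_texts = [m.get_text() for m in matches]
print(match_texts)
```

The results come back in document order in a single list, so elements of both classes can be processed together.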
