
How to find text between div using class

import bs4
from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

my_url="https://store.steampowered.com/search/?specials=1"
uClient=uReq(my_url)
page_html=uClient.read()
uClient.close()

page_soup=soup(page_html,"html.parser")
game_name_containers=page_soup.findAll('div',{"class":"col search_name ellipsis"})
print(len(game_name_containers))

for game_name_container in game_name_containers:
    game_name=game_name_container.span.text
    print(game_name)

It prints 50 games, but the page clearly lists more than 50. How do I print all of the games?

Your code is correctly printing the names and the count of the games returned in the HTML. If you open the page source of the Steam URL, you'll see that it initially contains only 50 games' worth of data. More results are loaded only as you scroll down.

As mentioned in this answer, BeautifulSoup is not aware of the JavaScript infinite scrolling that is occurring. You will need either a different tool or a way to get Steam to return more data to you.

As mentioned in that answer, Selenium is a tool you could try.
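
For the other option (getting Steam to return more data without driving a browser), here is a minimal sketch of batch-by-batch fetching. It is not taken from the original answer and rests on assumptions: that the infinite scroll is backed by a JSON endpoint at `https://store.steampowered.com/search/results/`, that it accepts `start`, `count` and `infinite=1` query parameters, and that the response carries `results_html` and `total_count` fields. Check all of these in your browser's network tab before relying on them. The names `fetch_batch` and `iter_result_html` are illustrative.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint; confirm it in the browser's network tab while scrolling.
RESULTS_URL = "https://store.steampowered.com/search/results/"


def fetch_batch(start, count=50):
    """Return (html_fragment, total_count) for one batch of search results.

    The query parameters and JSON field names below are assumptions.
    """
    query = urlencode(
        {"specials": 1, "start": start, "count": count, "infinite": 1}
    )
    with urlopen(f"{RESULTS_URL}?{query}") as response:
        payload = json.load(response)
    return payload.get("results_html", ""), int(payload.get("total_count", 0))


def iter_result_html(fetch=fetch_batch, count=50, max_batches=200):
    """Yield HTML fragments until the reported total has been covered.

    `fetch` is any callable taking (start, count) and returning
    (html_fragment, total_count). `max_batches` caps the number of requests
    in case the reported total is wrong.
    """
    start = 0
    for _ in range(max_batches):
        html, total = fetch(start, count)
        if not html.strip():
            break  # an empty batch means there is nothing more to read
        yield html
        start += count
        if start >= total:
            break


# Usage (performs network requests), reusing the parsing from the question:
#
# for html in iter_result_html():
#     page_soup = soup(html, "html.parser")
#     for container in page_soup.findAll("div", {"class": "col search_name ellipsis"}):
#         print(container.span.text)
```

Passing the fetch function in as a parameter keeps the paging logic separate from the network call, so the loop can be checked with a stub before any real requests are sent.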
