Beautiful Soup not returning anything
Hi, I'm trying to use Beautiful Soup to scrape a website and print facts. This is the website: https://fungenerators.com/random/facts/animal/weasel
I'm trying to scrape the fact, but it always ends up printing []. Any idea what's wrong with my code?
from urllib.request import urlopen
from bs4 import BeautifulSoup
scrape = "https://fungenerators.com/random/facts/animal/weasel"
request_page = urlopen(scrape)
page_html = request_page.read()
request_page.close()
html_soup = BeautifulSoup(page_html, 'html.parser')
fact = html_soup.find_all('div', class_="wow fadeInUp animated animated")
print(fact)
There are two problems with your code:
1. The element you want is under an h2 tag, not a div.
2. Since some of the data is loaded dynamically, the class name changes and the second occurrence of the word "animated" is dropped. In the HTML that urlopen() returns, the class name is wow fadeInUp animated rather than wow fadeInUp animated animated.
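To see why the original call returns [], it may help to run the same lookups against a small inline snippet. This is only a sketch: SAMPLE_HTML is a hypothetical stand-in for the page's markup, not a copy of the real site. It relies on the documented behaviour that a class_ string containing spaces is compared against the exact value of the class attribute.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the markup the server returns (not the real page).
SAMPLE_HTML = """
<div class="container">
  <h2 class="wow fadeInUp animated">Weasels are small carnivorous mammals.</h2>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Wrong tag name and wrong class string: nothing matches.
wrong_tag = soup.find_all("div", class_="wow fadeInUp animated animated")

# Right tag, but the extra "animated" makes the exact-string comparison fail.
wrong_class = soup.find_all("h2", class_="wow fadeInUp animated animated")

# Right tag and the exact class attribute value: one match.
exact = soup.find_all("h2", class_="wow fadeInUp animated")

# A single class name also matches, which is less brittle.
single = soup.find_all("h2", class_="fadeInUp")

print(wrong_tag, wrong_class)
print(exact[0].text, len(single))
```

Searching by a single class name, as in the last lookup, avoids depending on the full attribute string.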
See the following example:
from urllib.request import urlopen
from bs4 import BeautifulSoup
scrape = "https://fungenerators.com/random/facts/animal/weasel"
request_page = urlopen(scrape)
page_html = request_page.read()
request_page.close()
html_soup = BeautifulSoup(page_html, 'html.parser')
fact = html_soup.find_all('h2', class_="wow fadeInUp animated")
print(fact)
(Since there's only one tag, you might want to consider using find() instead of find_all(), so that you can get the text directly through the .text attribute):
...
fact = html_soup.find('h2', class_="wow fadeInUp animated").text
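One caveat with chaining .text onto find(): find() returns None when nothing matches, so the chained access raises AttributeError if the page layout changes. A minimal sketch of a guarded version follows; extract_fact is a hypothetical helper name used only for illustration, and the HTML strings are made up.

```python
from bs4 import BeautifulSoup


def extract_fact(html):
    """Return the text of the first matching h2, or None if there is none.

    extract_fact is a hypothetical helper, not part of Beautiful Soup.
    """
    soup = BeautifulSoup(html, "html.parser")
    tag = soup.find("h2", class_="wow fadeInUp animated")
    if tag is None:
        # No matching element; avoid calling .text on None.
        return None
    # get_text(strip=True) trims surrounding whitespace.
    return tag.get_text(strip=True)


found = extract_fact('<h2 class="wow fadeInUp animated"> A fact. </h2>')
missing = extract_fact("<div>No heading here</div>")
print(found, missing)
```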
Use my code instead!!!
import requests
from bs4 import BeautifulSoup
response = requests.get('https://fungenerators.com/random/facts/animal/weasel')
soup = BeautifulSoup(response.content, 'html.parser')
result = soup.select('div.wow.fadeInUp.animated.animated')
print(result[0].text)
And the result would be:
Random Weasel Fact
Or, if you don't want to use CSS selectors, you could do something like this:
import requests
from bs4 import BeautifulSoup
response = requests.get('https://fungenerators.com/random/facts/animal/weasel')
soup = BeautifulSoup(response.content, 'html.parser')
result = soup.find_all('h2', class_="wow fadeInUp animated")
print(result[0].text)
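The two approaches differ in how strictly they match classes. As a rough illustration on made-up markup (not the real page): a CSS selector such as h2.wow.fadeInUp.animated matches any h2 that carries all three classes, in any order and alongside extra classes, while a multi-word class_ string has to equal the attribute value exactly.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: same classes, different order, plus one extra class.
html = '<h2 class="animated wow fadeInUp extra">A fact.</h2>'
soup = BeautifulSoup(html, "html.parser")

# Exact-string comparison fails because the attribute value differs.
by_string = soup.find_all("h2", class_="wow fadeInUp animated")

# The CSS selector only requires each listed class to be present.
by_selector = soup.select("h2.wow.fadeInUp.animated")

print(len(by_string))
print(by_selector[0].text)
```

If the site is likely to add or reorder classes, the selector form is probably the more forgiving choice.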