AttributeError: 'NoneType' object has no attribute 'get_text' in BeautifulSoup web scraping
I am working on a Python project that uses BeautifulSoup for web scraping. The program used to run fine, but now it fails with the error shown below. The HTML structure of the website may have changed, but I still cannot work out what is wrong or how to fix it. Please help me solve this.
The website is [https://covidindia.org/][1]
Error:
Traceback (most recent call last):
File "t1.py", line 112, in <module>
mainLabel = tk.Label(root, text=get_corona_detail_of_india(), font=f, bg='light blue',fg='red')
File "t1.py", line 23, in get_corona_detail_of_india
total_cases = soup.find("div",class_="elementor-element elementor-element-aceece0 elementor-widget elementor-widget-heading",).get_text()
AttributeError: 'NoneType' object has no attribute 'get_text'
My code:
URL = 'https://covidindia.org/'
page = requests.get(URL)
soup = BeautifulSoup(page.content, 'html.parser')
#print(soup)
total_cases = soup.find("div",class_="elementor-element elementor-element-aceece0 elementor-widget elementor-widget-heading",).get_text()
tc=(total_cases.strip())
Also, when I print soup, the output is:
<html><head><title>403 Forbidden</title></head>
<body>
<center><h1>403 Forbidden</h1></center>
<hr/><center>nginx</center>
Is my access permanently forbidden?
Add a user-agent header to your request. When you don't send a browser-like user-agent, the website detects you as a bot and refuses to serve its contents, which is why you get the 403 page. Because that page does not contain the div you are looking for, soup.find() returns None and calling .get_text() on it raises the AttributeError. Here is the full code:
from bs4 import BeautifulSoup
import requests
headers = {'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:32.0) Gecko/20100101 Firefox/32.0'}
URL = 'https://covidindia.org/'
page = requests.get(URL,headers=headers)
soup = BeautifulSoup(page.content, 'html.parser')
#print(soup)
total_cases = soup.find("div",class_="elementor-element elementor-element-aceece0 elementor-widget elementor-widget-heading",).get_text()
tc=(total_cases.strip())
Output:
>>> tc
'Total Cases - 83,14,673 (+46,171)'
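If the site starts blocking the request again, or its markup changes, the same AttributeError will come back. A minimal sketch of a more defensive lookup is shown below. The helper name extract_heading_text and the two sample HTML strings are hypothetical and only illustrate the None check; they are not taken from the real site.

```python
from bs4 import BeautifulSoup


def extract_heading_text(html, class_name):
    """Return the stripped text of the first <div> carrying class_name, or None."""
    soup = BeautifulSoup(html, "html.parser")
    element = soup.find("div", class_=class_name)
    # find() returns None when nothing matches, so check before calling get_text()
    if element is None:
        return None
    return element.get_text().strip()


# Hypothetical sample pages: a block page like the 403 above, and a page with the widget
blocked_page = "<html><head><title>403 Forbidden</title></head><body><h1>403 Forbidden</h1></body></html>"
normal_page = '<div class="elementor-element-aceece0 elementor-widget"><h2> Total Cases - 83,14,673 </h2></div>'

print(extract_heading_text(blocked_page, "elementor-element-aceece0"))
print(extract_heading_text(normal_page, "elementor-element-aceece0"))
```

In the real script it may also help to call page.raise_for_status() right after requests.get(), so that a 403 response is reported as an HTTP error instead of surfacing later as a confusing NoneType error.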
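This problem occurs when the site expects something that your request does not include. Check what the site requires: it may be the user-agent, as the other answer says, or something else.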
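One way to check what your request actually carries is to build it without sending it. The sketch below, which assumes only the requests library, compares the default User-Agent with the browser-like one from the answer above. The variable names are illustrative, and nothing here contacts the site.

```python
import requests

URL = "https://covidindia.org/"
browser_headers = {
    "user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:32.0) Gecko/20100101 Firefox/32.0"
}

session = requests.Session()

# prepare_request() merges the session defaults into the request but does not send it
default_request = session.prepare_request(requests.Request("GET", URL))
custom_request = session.prepare_request(
    requests.Request("GET", URL, headers=browser_headers)
)

# Without an override, requests identifies itself as "python-requests/<version>"
print(default_request.headers["User-Agent"])
# With the override, the browser-like string is what would be sent instead
print(custom_request.headers["User-Agent"])
```

A default value that starts with python-requests/ is easy for a server to flag as a script. Whether a given site also checks other headers or cookies is something you would have to find out by comparing your request with what a browser sends.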