
Beautiful Soup AttributeError: 'NoneType' object has no attribute 'find_all' even though webpage is structured the same

I have working code for a web scraper and want to use it on another website that is structured the same way.

The code I have is:

import requests
from bs4 import BeautifulSoup

url = "https://efl.network/index/efl/LeaguePassingStats.html"
page = requests.get(url)
soup = BeautifulSoup(page.content, 'html.parser')
tb = soup.find('table', class_='tablesorter')
table_rows = tb.find_all("tr")

However, if I change the url to

https://sim-football.com/indexes/DSFLS22/LeaguePassingStats.html

it gives me the error

table_rows = tb.find_all("tr")
AttributeError: 'NoneType' object has no attribute 'find_all'

Yet the two websites appear to be structured the same.

This line returns None:

tb = soup.find('table', class_='tablesorter')

This means there is no table element with the class 'tablesorter' in the HTML that was parsed.
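A minimal sketch of guarding against that case. It uses two short made-up HTML strings rather than the real pages, and the helper name `get_table_rows` is hypothetical. The point is only that `find` returns `None` when nothing matches, so it is worth checking before calling `find_all`.

```python
from bs4 import BeautifulSoup


def get_table_rows(html):
    """Return the <tr> elements of the first table.tablesorter, or [] if absent."""
    soup = BeautifulSoup(html, "html.parser")
    tb = soup.find("table", class_="tablesorter")
    if tb is None:
        # find() returns None when no element matches the query
        return []
    return tb.find_all("tr")


# Made-up sample documents, for illustration only
good_html = '<table class="tablesorter"><tr><td>A</td></tr><tr><td>B</td></tr></table>'
error_html = "<head><title>Not Acceptable.</title></head><body><h1>Not Acceptable.</h1></body>"

print(len(get_table_rows(good_html)))   # table found: two rows
print(len(get_table_rows(error_html)))  # no table: empty list
```

Returning an empty list is one possible design choice; raising an exception with a clear message may be preferable when a missing table should stop the script.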

If you print page.content, you can see that this is the HTML returned: '<head><title>Not Acceptable.</title></head><body><h1>Not Acceptable.</h1><p>An appropriate representation of the requested resource could not be found on this server. This error was generated by Mod_Security.</p></body></html>'
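As a rough sketch of how that check could be automated, the snippet below parses the error body quoted above and pulls out its title. The helper name `page_title` is hypothetical, and the bytes literal stands in for what `page.content` would hold, so no network request is made here.

```python
from bs4 import BeautifulSoup


def page_title(content):
    """Return the <title> text of an HTML document, or None if it has no title."""
    soup = BeautifulSoup(content, "html.parser")
    return soup.title.get_text(strip=True) if soup.title else None


# The error body quoted above, as bytes (a stand-in for page.content)
blocked = (
    b"<head><title>Not Acceptable.</title></head><body><h1>Not Acceptable.</h1>"
    b"<p>An appropriate representation of the requested resource could not be "
    b"found on this server. This error was generated by Mod_Security.</p></body></html>"
)

print(page_title(blocked))          # title of the error page
print(b"Mod_Security" in blocked)   # whether the body mentions Mod_Security
```

In a live script, inspecting `page.status_code` or calling `page.raise_for_status()` right after `requests.get` would probably surface the same problem earlier, assuming the server sends an error status along with this body (the source does not show the status code).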

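So, for some reason, the site is blocking your scraper and returning this error HTML. The table element you are looking for is not in it, so soup.find returns None and the call to find_all raises the error.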
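One thing that may be worth trying, as a sketch only: servers protected by Mod_Security often reject the default User-Agent that requests sends, so passing a browser-like header sometimes gets the real page back. That this is the cause here is an assumption, not something confirmed above. The header value, the `HEADERS` constant, and the function name `fetch_table_rows` are all illustrative.

```python
import requests
from bs4 import BeautifulSoup

# Assumption: the block is triggered by the default python-requests User-Agent.
# The value below is just an example of a browser-like string.
HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/120.0 Safari/537.36"
    )
}


def fetch_table_rows(url, headers=HEADERS, timeout=10):
    """Fetch url and return the rows of table.tablesorter; raise if it is missing."""
    page = requests.get(url, headers=headers, timeout=timeout)
    page.raise_for_status()  # raises requests.HTTPError on 4xx/5xx responses
    soup = BeautifulSoup(page.content, "html.parser")
    tb = soup.find("table", class_="tablesorter")
    if tb is None:
        # The server answered, but not with the page we expected
        raise ValueError(f"no table.tablesorter in response from {url}")
    return tb.find_all("tr")


# Usage (performs a network request, so it is left commented out):
# rows = fetch_table_rows("https://sim-football.com/indexes/DSFLS22/LeaguePassingStats.html")
```

If the site still returns the error page with this header, the block is based on something else and this approach will not help.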

