Beautiful Soup AttributeError: 'NoneType' object has no attribute 'find_all' even though the page structure is the same

Question

I have working code for a web scraper and want to use it on another website with the same structure.

The code I have is:

import requests
from bs4 import BeautifulSoup

url = "https://efl.network/index/efl/LeaguePassingStats.html"
page = requests.get(url)
soup = BeautifulSoup(page.content, 'html.parser')
tb = soup.find('table', class_='tablesorter')
table_rows = tb.find_all("tr")

However, if I change the url to

https://sim-football.com/indexes/DSFLS22/LeaguePassingStats.html

it gives me this error:

table_rows = tb.find_all("tr")
AttributeError: 'NoneType' object has no attribute 'find_all'

Yet the two websites appear to have the same structure.

Answer 1

This line returns None:

tb = soup.find('table', class_='tablesorter')

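This means that the HTML you parsed contains no table element with the class 'tablesorter'. soup.find() returns None when nothing matches, and calling find_all on None raises the AttributeError.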
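A minimal sketch of guarding against that None result, so the failure points at the missing table rather than at the later find_all call. The helper name find_table_rows and the choice of ValueError are illustrative assumptions, not part of the original code.

```python
from bs4 import BeautifulSoup


def find_table_rows(html, css_class="tablesorter"):
    """Return the <tr> elements of the first table with the given class.

    Hypothetical helper: raises ValueError when the table is missing
    instead of letting a None result fail on the next line.
    """
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", class_=css_class)
    if table is None:
        # soup.find() returns None when no element matches
        raise ValueError(f"no <table class='{css_class}'> found in the response")
    return table.find_all("tr")
```

With the code from the question, this would be called as find_table_rows(page.content).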

Answer 2

If you print page.content, you can see that this is the HTML that was returned:

'<head><title>Not Acceptable.</title></head><body><h1>Not Acceptable.</h1><p>An appropriate representation of the requested resource could not be found on this server. This error was generated by Mod_Security.</p></body></html>'

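So the website is, for some reason, blocking your scraper and returning this error HTML instead of the real page. The table element you are looking for is not in it, which causes your error.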
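One way to surface this kind of rejection early is to check the response before parsing it. The sketch below rests on assumptions: that the rejection arrives as HTTP 406 ("Not Acceptable") or mentions Mod_Security in the body, and that the server objects to the default User-Agent sent by requests. The thread confirms neither the status code nor the cause, so the header value and the helper names is_blocked and fetch_html are purely illustrative, and sending a different User-Agent may not get past the filter.

```python
import requests

# Assumption: a browser-like User-Agent; this exact string is illustrative only.
HEADERS = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64)"}


def is_blocked(status_code, body):
    """Heuristic check for a Mod_Security rejection page (hypothetical helper)."""
    return status_code == 406 or b"Mod_Security" in body


def fetch_html(url, headers=HEADERS, timeout=10):
    """Fetch a page and fail loudly if the server rejects the request."""
    page = requests.get(url, headers=headers, timeout=timeout)
    if is_blocked(page.status_code, page.content):
        raise RuntimeError(f"request rejected by server (HTTP {page.status_code})")
    page.raise_for_status()  # any other 4xx/5xx response
    return page.content
```

Before scraping a site that rejects automated requests, check its terms of use.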
