Python: AttributeError: 'NoneType' object has no attribute 'findNext'

Question

I am trying to scrape a website with BeautifulSoup but have run into a problem. I was following a tutorial written for Python 2.7 that used exactly the same code and ran without errors.

import urllib.request
from bs4 import *


htmlfile = urllib.request.urlopen("http://en.wikipedia.org/wiki/Steve_Jobs")

htmltext = htmlfile.read()

soup = BeautifulSoup(htmltext)
title = (soup.title.text)

body = soup.find("Born").findNext('td')
print (body.text)

When I run the program, I get:

Traceback (most recent call last):
  File "C:\Users\USER\Documents\Python Programs\World Population.py", line 13, in <module>
    body = soup.find("Born").findNext('p')
AttributeError: 'NoneType' object has no attribute 'findNext'

Is this a problem with Python 3, or am I just too naive?

Answer 1

The find and find_all methods do not search for arbitrary text in the document; they search for HTML tags. The documentation makes that clear (my italics):

Pass in a value for name and you'll tell Beautiful Soup to only consider tags with certain names. Text strings will be ignored, as will tags whose names that don't match. This is the simplest usage:

soup.find_all("title")
# [<title>The Dormouse's story</title>]

That's why your soup.find("Born") returns None, and hence why Python complains that NoneType (the type of None) has no findNext() method.

The page you reference contains (at the time this answer was written) eight occurrences of the word "born", none of which are tags.
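To make the tag-versus-text distinction concrete, here is a minimal sketch. The inline HTML is a made-up stand-in for the real page, and the variable names are illustrative only. It shows find returning a tag for a real tag name and None for a word that only appears as text, plus one way to guard before chaining another call:

```python
from bs4 import BeautifulSoup

# Hypothetical inline HTML standing in for the real page (no network needed).
html = "<html><head><title>Steve Jobs</title></head><body><p>Born in 1955</p></body></html>"
soup = BeautifulSoup(html, "html.parser")

tag_hit = soup.find("title")   # a <title> tag exists, so this is a Tag
text_miss = soup.find("Born")  # there is no <Born> tag, so this is None

print(tag_hit.text)  # Steve Jobs
print(text_miss)     # None

# Guard before chaining so that a miss does not raise AttributeError.
following = text_miss.find_next("p") if text_miss is not None else None
print(following)     # None
```

Checking for None before calling find_next (the newer spelling of findNext) turns the crash into a case the script can handle explicitly.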

Looking at the HTML source for that page, you'll find the best option may be to look for the correct span (formatted for readability):

<th scope="row" style="text-align: left;">Born</th>
<td>
    <span class="nickname">Steven Paul Jobs</span><br />
    <span style="display: none;">(<span class="bday">1955-02-24</span>)</span>February 24, 1955<br />
</td>
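One possible way to reach that cell, sketched under assumptions: the fragment above is wrapped in a made-up table so it parses on its own, and the lookup uses the string argument of find to match the th whose text is exactly "Born". The live page's markup may have changed since this answer was written, so treat the selectors as illustrative:

```python
from bs4 import BeautifulSoup

# The fragment from the answer, wrapped in a hypothetical <table>/<tr>.
html = """
<table><tr>
<th scope="row" style="text-align: left;">Born</th>
<td>
    <span class="nickname">Steven Paul Jobs</span><br />
    <span style="display: none;">(<span class="bday">1955-02-24</span>)</span>February 24, 1955<br />
</td>
</tr></table>
"""
soup = BeautifulSoup(html, "html.parser")

# Match a <th> tag whose only text is "Born", then step to the next <td>.
header = soup.find("th", string="Born")
cell = header.find_next("td") if header is not None else None

if cell is not None:
    nickname = cell.find("span", class_="nickname").get_text()
    bday = cell.find("span", class_="bday").get_text()
    print(nickname)  # Steven Paul Jobs
    print(bday)      # 1955-02-24
```

Here the tag name ("th") narrows the search to tags, and string="Born" filters on their text, which is what the original soup.find("Born") was trying to do.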

Answer 2

The find method looks for tags, not text. To find the name, birthday and birthplace, you have to look up the span elements with the corresponding class names and read the text attribute of each one:

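import urllib.request
from bs4 import BeautifulSoup

# Name the parser explicitly so the result does not depend on which parsers are installed.
soup = BeautifulSoup(urllib.request.urlopen("http://en.wikipedia.org/wiki/Steve_Jobs"), "html.parser")
title = soup.title.text
# Each lookup matches a <span> tag by its class attribute, not by its text.
name = soup.find('span', {'class': 'nickname'}).text
bday = soup.find('span', {'class': 'bday'}).text
birthplace = soup.find('span', {'class': 'birthplace'}).text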

print(name)
print(bday)
print(birthplace)

Output:

Steven Paul Jobs
1955-02-24
San Francisco, California, US

PS: You don't have to call read on the object returned by urlopen; BeautifulSoup accepts file-like objects.
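As a small illustration of that point, the sketch below passes a file-like object straight to BeautifulSoup. An io.BytesIO holding made-up HTML stands in for the urlopen response here, purely so the example runs without network access:

```python
import io
from bs4 import BeautifulSoup

# io.BytesIO is a stand-in for the response object returned by urlopen;
# both expose a read() method, which BeautifulSoup calls for you.
page = io.BytesIO(b"<html><head><title>Example</title></head><body></body></html>")
soup = BeautifulSoup(page, "html.parser")

title = soup.title.text
print(title)  # Example
```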
