Python 3 'NoneType' object has no attribute 'text'

Question

# import libraries
from urllib.request import urlopen
from bs4 import BeautifulSoup

# specify the url
html = 'https://www.bloomberg.com/quote/SPX:IND'

# query the website and return the html to the variable 'page'
page = urlopen(html)

# parse the html using beautiful soup and store in variable 'data'
data = BeautifulSoup(page, 'html.parser')

# take out the <h1> of name and get its value
name_box = data.find('h1', attrs={'class': 'companyName_99a4824b'})

name = name_box.text.strip() #strip is used to remove starting and trailing
print (name)

# get the index price
price_box = data.find('div', attrs={'class':'priceText_1853e8a5'})
price = price_box.text
print (price)

I was following a guide on medium.com and ran into some problems because of my limited knowledge of Python and scripting, but I think my error is at

name = name_box.text

because text is not defined, and I am unsure whether they would like me to define it using the BeautifulSoup library. Any help would be appreciated. The actual error is below:

 RESTART: C:/Users/Parsons PC/AppData/Local/Programs/Python/Python36-32/projects/Scripts/S&P 500 website scraper/main.py 
Traceback (most recent call last):
  File "C:/Users/Parsons PC/AppData/Local/Programs/Python/Python36-32/projects/Scripts/S&P 500 website scraper/main.py", line 17, in <module>
    name = name_box.text.strip() #strip is used to remove starting and trailing
AttributeError: 'NoneType' object has no attribute 'text'

Answer 1

The page at https://www.bloomberg.com/quote/SPX:IND does not contain an <h1> tag with the class name companyName_99a4824b, so data.find() returns None. That's why you are receiving the above error.

On the website, the <h1> tag looks like this:

<h1 class="companyName__99a4824b">S&amp;P 500 Index</h1>

So to select it, you have to change the class name to companyName__99a4824b (with two underscores).

name_box = data.find('h1', attrs={'class': 'companyName__99a4824b'})
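
To see the difference without hitting the live site, here is a minimal sketch. It assumes the <h1> markup quoted above and parses it from an inline string (SAMPLE_HTML is a stand-in I made up for illustration, not the real page). With one underscore, find() matches nothing and returns None; with two, it returns the tag.

```python
from bs4 import BeautifulSoup

# Assumption: a one-line stand-in for the real page, copied from the <h1> above
SAMPLE_HTML = '<h1 class="companyName__99a4824b">S&amp;P 500 Index</h1>'

soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')

# One underscore: no tag has this class, so find() returns None
missing = soup.find('h1', attrs={'class': 'companyName_99a4824b'})

# Two underscores: matches the tag in the snippet
found = soup.find('h1', attrs={'class': 'companyName__99a4824b'})

print(missing)             # None
print(found.text.strip())  # S&P 500 Index
```

Calling .text on the first result is exactly what raises AttributeError: 'NoneType' object has no attribute 'text'.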

Final result:

# import libraries
from urllib.request import urlopen
from bs4 import BeautifulSoup

# specify the url
html = 'https://www.bloomberg.com/quote/SPX:IND'

# query the website and return the html to the variable 'page'
page = urlopen(html)

# parse the html using beautiful soup and store in variable 'data'
data = BeautifulSoup(page, 'html.parser')

# take out the <h1> of name and get its value
name_box = data.find('h1', attrs={'class': 'companyName__99a4824b'}) #edited companyName_99a4824b -> companyName__99a4824b

name = name_box.text.strip() # strip() removes leading and trailing whitespace
print (name)

# get the index price
price_box = data.find('div', attrs={'class':'priceText__1853e8a5'}) #edited priceText_1853e8a5 -> priceText__1853e8a5
price = price_box.text
print (price)

It would be better to also handle this exception, in case the class names change in the future.
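
One possible way to do that, as a rough sketch rather than the definitive approach: check the result of find() for None before touching .text. The helper name find_text and the fallback strings are hypothetical, and the inline SAMPLE_HTML again stands in for the downloaded page so the example runs offline.

```python
from bs4 import BeautifulSoup


def find_text(soup, tag, class_name, default=None):
    """Return the stripped text of the first matching tag, or default if none matches.

    find_text is a hypothetical helper for illustration, not part of BeautifulSoup.
    """
    element = soup.find(tag, attrs={'class': class_name})
    if element is None:
        # The class name changed or the tag is missing: avoid the AttributeError
        return default
    return element.text.strip()


# Assumption: a stand-in page that contains the name but no price element
SAMPLE_HTML = '<h1 class="companyName__99a4824b">S&amp;P 500 Index</h1>'
soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')

name = find_text(soup, 'h1', 'companyName__99a4824b', default='(name not found)')
price = find_text(soup, 'div', 'priceText__1853e8a5', default='(price not found)')

print(name)   # S&P 500 Index
print(price)  # (price not found)
```

Wrapping the .text access in try/except AttributeError would work as well; the explicit None check just makes the failure case easier to spot.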
