Python 3 BeautifulSoup module "NoneType" error

Question

I'm new to the BeautifulSoup module and I have a problem. My code is simple. First of all, the site I'm trying to scrape is https://www.bloomberg.com/quote/SPX:IND, and I am trying to scrape the price (the large number that starts with 2).

My code:

import urllib
from bs4 import BeautifulSoup


quote_page = 'https://www.bloomberg.com/quote/SPX:IND'

page = urllib.request.urlopen(quote_page)

soup = BeautifulSoup(page, 'html.parser')

price_box = soup.find('div', attr = {'class': 'price'})
price = price_box.text

print(price)

The error I get:

price = price_box.text

AttributeError: 'NoneType' object has no attribute 'text'

Answer 1

I have used a CSS selector, which is more robust here, instead of the find methods. Since there is only one div element with the class price, I am guessing this is the right element.

import requests
from bs4 import BeautifulSoup

response = requests.get('https://www.bloomberg.com/quote/SPX:IND')
soup = BeautifulSoup(response.content, 'lxml')
price = soup.select_one('.price').text
print(price)
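
Both find and select_one return None when nothing matches, and calling .text on None is what raises the AttributeError shown in the question. A minimal sketch of a guard follows, run against a made-up inline HTML snippet rather than the live page. The sample markup, the sample price, and the helper name extract_price are assumptions for illustration only, and the real page may be structured differently.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page; the live site may differ.
SAMPLE_HTML = '<html><body><div class="price">2,734.62</div></body></html>'


def extract_price(html):
    """Return the text of the first element with class "price", or None."""
    soup = BeautifulSoup(html, 'html.parser')
    price_box = soup.select_one('.price')
    if price_box is None:
        # No match: the response may be a bot-check page or use other markup.
        return None
    return price_box.get_text(strip=True)


print(extract_price(SAMPLE_HTML))
print(extract_price('<html><body>no price here</body></html>'))
```

If you prefer find, note that the keyword it takes for an attribute dictionary is attrs. Written as attr=, as in the question, the argument is treated as a filter on an attribute literally named attr, so the lookup will not match the div even when it is present.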

Answer 2

Another solution:

from bs4 import BeautifulSoup
from requests import Session

session = Session()
session.headers['user-agent'] = (
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) '
    'AppleWebKit/537.36 (KHTML, like Gecko) Chrome/'
    '66.0.3359.181 Safari/537.36'
)

quote_page = 'https://www.bloomberg.com/quote/SPX:IND'

page = session.get(quote_page)

soup = BeautifulSoup(page.text, 'html.parser')

price_box = soup.find('meta', itemprop="price")

price = float(price_box['content'])

print(price)
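
To spell out what this answer does: it sends a browser-like User-Agent header and reads the price from the content attribute of a meta tag with itemprop="price". The answer does not say why the header is needed. A plausible reason, stated here only as an assumption, is that the site serves a different page to the default client. The sketch below isolates the parsing step against made-up inline HTML. The sample markup, the sample value, and the helper name extract_meta_price are hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page; the live site may differ.
SAMPLE_HTML = (
    '<html><head><meta itemprop="price" content="2734.62"></head>'
    '<body></body></html>'
)


def extract_meta_price(html):
    """Return the price from <meta itemprop="price"> as a float, or None."""
    soup = BeautifulSoup(html, 'html.parser')
    tag = soup.find('meta', itemprop='price')
    if tag is None:
        return None
    content = tag.get('content')
    if content is None:
        return None
    # Assumption: the value may contain thousands separators; drop them.
    return float(content.replace(',', ''))


print(extract_meta_price(SAMPLE_HTML))
```

Returning None instead of indexing the tag directly keeps a missing tag from turning into a TypeError on the next line.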

Answer 1: score 2, accepted, 2018-06-02 22:30:20

Answer 2: score 0, 2018-06-02 22:36:17
