AttributeError when web scraping with Python bs4

Question

I wrote some Python code for web scraping, and it looks fine to me, but when I run it I get "AttributeError: 'NoneType' object has no attribute 'text'". Please take a look and tell me how to fix this kind of error. Thanks.

Here is my code:

import pandas
from bs4 import BeautifulSoup
import requests

headers = {'User-Agent':'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36'}
url = 'https://www.realtor.com/realestateandhomes-search/Orlando_FL/dom-1'

r = requests.get(url, headers=headers)
soup = BeautifulSoup(r.content, 'html.parser')
linklist = []
urls = soup.find_all('div', class_ = 'jsx-4195823209 photo-wrap')
for url in urls:
    for link in url.find_all('a', href=True):
        linklist.append('https://www.realtor.com' + link['href'])
#print(linklist)

testurl = 'https://www.realtor.com/realestateandhomes-detail/127-W-Wallace-St_Orlando_FL_32809_M62756-65861'

r = requests.get(testurl, headers=headers)
soup = BeautifulSoup(r.content, 'html.parser')
address = soup.find('div', class_='jsx-1959108432 address-section').h1.text
print(address)
name = soup.find('a', class_ = 'jsx-725757796 agent-name').text.strip()
print(name)

Answer 1

The problem is probably on one of the following lines:

address = soup.find('div', class_='jsx-1959108432 address-section').h1.text
name = soup.find('a', class_ = 'jsx-725757796 agent-name').text.strip()

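You are accessing attributes on objects that are not guaranteed to be non-None on every run of the script. soup.find() returns None when no element matches, and None has no .h1 or .text attribute.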
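One way to see this is to check the result of find() before chaining onto it. The following is a minimal sketch, not the asker's actual page: the HTML string and the shortened class names are made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a downloaded page.
html = '<div class="address-section"><h1>127 W Wallace St</h1></div>'
soup = BeautifulSoup(html, 'html.parser')

# find() returns None when no element matches, so check before chaining.
section = soup.find('div', class_='address-section')
if section is not None and section.h1 is not None:
    address = section.h1.get_text(strip=True)
else:
    address = None

# This selector matches nothing in the sample HTML, so find() gives None
# and we fall back to None instead of raising AttributeError.
agent = soup.find('a', class_='agent-name')
name = agent.get_text(strip=True) if agent is not None else None

print(address)
print(name)
```

With this pattern a missing element shows up as a None value you can handle, rather than as an exception in the middle of the script.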

You have a few options, such as a try/except block like this:

try:
    address = soup.find('div', class_='jsx-1959108432 address-section').h1.text
except Exception as e:
    print('The following error occurred getting the text from the address: %r' % e)

That exception handler is generic; you can be more specific, like this:

try:
    address = soup.find('div', class_='jsx-1959108432 address-section').h1.text
except AttributeError:
    print('Could not get text from address')

Essentially, you need to validate the following:

What if the request fails?
What if the class name on the page changes or does not exist?
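Both checks can be sketched with two small helpers. The names fetch_soup and text_or_none are hypothetical, introduced here only for illustration, and the timeout of 10 seconds is an arbitrary assumption. The sketch also searches by a single class such as 'agent-name' rather than the full 'jsx-... agent-name' string, on the assumption that the jsx- part is auto-generated and may change between site builds.

```python
import requests
from bs4 import BeautifulSoup


def fetch_soup(url, headers=None, timeout=10):
    """Download a page and parse it, or return None if the request fails."""
    try:
        response = requests.get(url, headers=headers, timeout=timeout)
        response.raise_for_status()  # raises HTTPError on 4xx/5xx responses
    except requests.RequestException as exc:
        # Covers connection errors, timeouts, bad URLs, and HTTP errors.
        print(f'Request failed for {url}: {exc!r}')
        return None
    return BeautifulSoup(response.content, 'html.parser')


def text_or_none(soup, tag, class_name):
    """Return the stripped text of the first match, or None if absent."""
    element = soup.find(tag, class_=class_name)
    return element.get_text(strip=True) if element is not None else None
```

In the question's script this might be used as soup = fetch_soup(testurl, headers=headers), followed by name = text_or_none(soup, 'a', 'agent-name') once soup has been checked against None.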
