Attribute error in web scraping with Python bs4

I have written Python code for web scraping. Everything looks fine, but when I run it I get "AttributeError: 'NoneType' object has no attribute 'text'". Please take a look and advise how to fix this kind of error. Thanks.

Here is my code:

import pandas
from bs4 import BeautifulSoup
import requests

headers = {'User-Agent':'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36'}
url = 'https://www.realtor.com/realestateandhomes-search/Orlando_FL/dom-1'

r = requests.get(url, headers=headers)
soup = BeautifulSoup(r.content, 'html.parser')
linklist = []
urls = soup.find_all('div', class_ = 'jsx-4195823209 photo-wrap')
for url in urls:
    for link in url.find_all('a', href=True):
        linklist.append('https://www.realtor.com' + link['href'])
#print(linklist)

testurl = 'https://www.realtor.com/realestateandhomes-detail/127-W-Wallace-St_Orlando_FL_32809_M62756-65861'

r = requests.get(testurl, headers=headers)
soup = BeautifulSoup(r.content, 'html.parser')
address = soup.find('div', class_='jsx-1959108432 address-section').h1.text
print(address)
name = soup.find('a', class_ = 'jsx-725757796 agent-name').text.strip()
print(name)

The problem is most likely in one of the following lines:

address = soup.find('div', class_='jsx-1959108432 address-section').h1.text
name = soup.find('a', class_ = 'jsx-725757796 agent-name').text.strip()

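You are accessing an attribute of an object that is not guaranteed to be something other than None on every run of the script. soup.find() returns None when no element matches, and None has no .h1 or .text attribute.

One option is to wrap the lookup in a try/except block like this: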
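As a rough illustration of checking for None before touching .text, here is a minimal sketch. The inline HTML, the shortened class names and the helper text_or_none are hypothetical stand-ins, not the real realtor.com markup.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a page where the agent link is missing.
html = '<div class="address-section"><h1>127 W Wallace St</h1></div>'
soup = BeautifulSoup(html, 'html.parser')


def text_or_none(tag):
    """Return the stripped text of a tag, or None if the tag was not found."""
    return tag.get_text(strip=True) if tag is not None else None


# find() returns None when nothing matches, so check before chaining .h1
address_div = soup.find('div', class_='address-section')
address = text_or_none(address_div.h1 if address_div is not None else None)

# No <a class="agent-name"> exists in this HTML, so find() returns None here
name = text_or_none(soup.find('a', class_='agent-name'))

print(address)
print(name)
```

With this pattern a missing element yields None instead of raising AttributeError, and the caller decides what to do with the missing value.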

try:
    address = soup.find('div', class_='jsx-1959108432 address-section').h1.text
except Exception as e:
    print('The following error occurred getting the text from the address: %r' % e)

That exception handler is generic; you can be more specific like this:

try:
    address = soup.find('div', class_='jsx-1959108432 address-section').h1.text
except AttributeError:
    print('Could not get text from address')

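Essentially, you need to add some validation for cases such as:

  • What if the request fails?
  • What if the class name on the page changes or does not exist?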
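Both checks could be sketched roughly as follows. The function names fetch_soup and extract_address are hypothetical, and matching only on the address-section part of the class assumes that the jsx-... prefix is auto-generated and may change between deployments.

```python
import requests
from bs4 import BeautifulSoup


def fetch_soup(url, headers=None, timeout=10):
    """Return the parsed page, or None if the request fails."""
    try:
        r = requests.get(url, headers=headers, timeout=timeout)
        r.raise_for_status()  # raises requests.HTTPError on 4xx/5xx responses
    except requests.RequestException as e:
        print('Request failed: %r' % e)
        return None
    return BeautifulSoup(r.content, 'html.parser')


def extract_address(soup):
    """Return the address text, or None if the expected elements are missing."""
    # Assumption: 'address-section' is the stable part of the class name.
    section = soup.select_one('div.address-section')
    if section is None or section.h1 is None:
        return None
    return section.h1.get_text(strip=True)
```

A caller would first check that fetch_soup(testurl, headers=headers) did not return None, then pass the result to extract_address and handle a None address separately, which makes it clear whether the request or the selector was the part that failed.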

