How can I fix this attribute error with my novice webscraping code?
I'm trying to learn web scraping by scraping the heights and weights of NFL players by position.
Here is my code:
import requests
from bs4 import BeautifulSoup
# Base URL for the NFL player stats page
base_url = 'https://www.pro-football-reference.com/players/'
# List to store player data
player_data = []
# Loop through the years 2014 to 2021
for year in range(2014, 2022):
    # Send a GET request to the URL
    response = requests.get(f'{base_url}{year}/')
    # Parse the HTML of the page
    soup = BeautifulSoup(response.text, 'html.parser')
    # Find all rows in the player stats table
    rows = soup.find('table', {'id': 'players'}).tbody.find_all('tr')[1:]
# Loop through each row
for row in rows:
    # Find the player name cell
    name_cell = row.find('th')
    # Check if the cell is valid (some rows may not have player data)
    if name_cell:
        # Extract the player name and link
        try:
            name = name_cell.a.text
        except AttributeError:
            name = ''
        try:
            position = row.find('td', {'data-stat': 'position'}).text
        except AttributeError:
            position = ''
        try:
            link = name_cell.a['href']
        except AttributeError:
            link = ''
        # Extract the player height and weight
        try:
            height = row.find('td', {'data-stat': 'height'}).text
        except AttributeError:
            height = ''
        try:
            weight = row.find('td', {'data-stat': 'weight'}).text
        except AttributeError:
            weight = ''
        # Add the player data to the list
        player_data.append({
            'name': name,
            'position': position,
            'link': link,
            'height': height,
            'weight': weight
        })
# Print the player data
print(player_data)
c.execute(player_data)
getAll('player_data',c)
querySave(player_data, c, 'NFLHeightWeight')
print("Done!")
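I get the error: AttributeError: 'NoneType' object has no attribute 'tbody'
I have seen this error in other questions, but the solutions there didn't really work for me.
How can I fix this for my specific case? I tried to make sure that what I'm searching for isn't empty.
Thanks!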
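Check that you actually found the table before trying to access its tbody. soup.find() returns None when nothing matches, and None has no tbody attribute, which is exactly what the error says.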
for year in range(2014, 2022):
    # Send a GET request to the URL
    response = requests.get(f'{base_url}{year}/')
    # Parse the HTML of the page
    soup = BeautifulSoup(response.text, 'html.parser')
    table = soup.find('table', {'id': 'players'})
    if table:
        rows = table.tbody.find_all('tr')[1:]
    else:
        print(f"No players table found for year {year}")
        continue
    # rest of loop here
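To see the guard in action without hitting the site, here is a minimal sketch that runs the same logic on inline HTML. The PAGES dictionary, its made-up player row, and the cell_text helper are assumptions for illustration only; they are not taken from the real pages, whose markup may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-ins for downloaded pages: one has the table, one does not.
PAGES = {
    2014: """
      <table id="players"><tbody>
        <tr><th>Player</th><td data-stat="position">Pos</td></tr>
        <tr><th><a href="/players/A/a.htm">Alice Example</a></th>
            <td data-stat="position">QB</td>
            <td data-stat="height">6-2</td>
            <td data-stat="weight">220</td></tr>
      </tbody></table>
    """,
    2015: "<p>Page without a players table</p>",
}


def cell_text(row, stat):
    """Return the text of the td with this data-stat, or '' if it is absent."""
    cell = row.find('td', {'data-stat': stat})
    return cell.text if cell is not None else ''


player_data = []
missing_years = []
for year, html in PAGES.items():
    soup = BeautifulSoup(html, 'html.parser')
    table = soup.find('table', {'id': 'players'})
    # find() returns None when nothing matches, so check before using .tbody
    if table is None or table.tbody is None:
        missing_years.append(year)
        continue
    # Skip the first row, as the original code does
    for row in table.tbody.find_all('tr')[1:]:
        name_cell = row.find('th')
        if name_cell is None:
            continue
        link_tag = name_cell.a  # None when the cell holds no link
        player_data.append({
            'name': link_tag.text if link_tag is not None else '',
            'position': cell_text(row, 'position'),
            'link': link_tag.get('href', '') if link_tag is not None else '',
            'height': cell_text(row, 'height'),
            'weight': cell_text(row, 'weight'),
        })

print(player_data)
print("Years without a players table:", missing_years)
```

Checking for None explicitly also covers the link lookup: in the original code, name_cell.a['href'] on a missing link raises TypeError rather than AttributeError, so the except clause there would not catch it.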
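Also, the for row in rows: loop needs to be indented so that it sits inside the for year in range(2014, 2022): loop. Otherwise it only runs once, after the year loop has finished, and uses the rows from the last year only.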
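The effect of the indentation can be shown with plain Python and no scraping at all. In this small sketch, rows_by_year is a hypothetical stand-in for the rows scraped from each year's page.

```python
# Hypothetical stand-in for the rows scraped from each year's page.
rows_by_year = {2019: ['a', 'b'], 2020: ['c'], 2021: ['d', 'e']}

# Row loop placed after the year loop: rows still holds the last year's value,
# so only that year is processed.
outside = []
for year in rows_by_year:
    rows = rows_by_year[year]
for row in rows:
    outside.append((year, row))

# Row loop nested inside the year loop: every year's rows are processed.
inside = []
for year in rows_by_year:
    rows = rows_by_year[year]
    for row in rows:
        inside.append((year, row))

print(outside)  # [(2021, 'd'), (2021, 'e')]
print(inside)   # [(2019, 'a'), (2019, 'b'), (2020, 'c'), (2021, 'd'), (2021, 'e')]
```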