BeautifulSoup - AttributeError: 'NavigableString' object has no attribute 'find_all'
I am trying to scrape data from this URL and get an error in this part of my scraper (the full code block is further down):
if table.find_all('tr'):
Note that I previously built it without the if/elif/else logic, using only find_all('tr'), and it produced the same error:
Traceback (most recent call last):
  File "statbunker.py", line 217, in <module>
    if table.find_all('tr'):
  File "/opt/miniconda3/envs/ds383/lib/python3.8/site-packages/bs4/element.py", line 921, in __getattr__
    raise AttributeError(
AttributeError: 'NavigableString' object has no attribute 'find_all'
link = 'https://rugby.statbunker.com/competitions/MatchDetails/World-Cup-2019/Japan-VS-Russia?comp_id=606&match_id=39737&date=20-Sep-2019'
response = requests.get(link)
html_loop = response.content
soup_loop = BeautifulSoup(html_loop, 'html.parser')
home_substititions = soup_loop.find('table', {'id': 'homeSubs'})
for table in home_substititions.find('tbody'):
    if table.find_all('tr'):
        for row in table.find_all('tr'):
            substitutionEvent = {}
            substitutionEvent['uuid'] = uuid.uuid1()
            substitutionEvent['playerIn'] = row.find_all('td')[2].text
            substitutionEvent['playerOut'] = row.find_all('td')[4].text
            if int(row.find_all('td')[0].text.split('`')[0]):
                substitutionEvent['subTime'] = game['gameTime'] + timedelta.Timedelta(minutes=int(row.find_all('td')[0].text.split('`')[0]))
            else:
                substitutionEvent['subTime'] = ''
            homeSubstitutionEvents.append(substitutionEvent)
    elif table.find('tr'):
        for row in table.find('tr'):
            substitutionEvent = {}
            substitutionEvent['uuid'] = uuid.uuid1()
            substitutionEvent['playerIn'] = row.find_all('td')[2].text
            substitutionEvent['playerOut'] = row.find_all('td')[4].text
            if int(row.find_all('td')[0].text.split('`')[0]):
                substitutionEvent['subTime'] = game['gameTime'] + timedelta.Timedelta(minutes=int(row.find_all('td')[0].text.split('`')[0]))
            else:
                substitutionEvent['subTime'] = ''
            homeSubstitutionEvents.append(substitutionEvent)
    else:
        continue
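Here, .find() returns only a single Tag, and looping over a Tag walks its children, which can include NavigableString objects. To iterate over the matching elements, you have to use .find_all():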
home_substititions = soup_loop.find_all('table', {'id': 'homeSubs'})
for table in home_substititions:
    # ....
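To make the difference concrete, here is a minimal offline sketch. SAMPLE_HTML is a made-up stand-in for the real page (its layout is an assumption, not the actual Statbunker markup). It shows that looping over the Tag returned by find() yields whitespace text nodes alongside the rows, while find_all() yields only Tag objects.

```python
from bs4 import BeautifulSoup
from bs4.element import NavigableString, Tag

# Hypothetical stand-in for the real page, so the sketch runs offline.
SAMPLE_HTML = """
<table id="homeSubs">
  <tbody>
    <tr><td>45`</td><td></td><td>Player A</td><td></td><td>Player B</td></tr>
    <tr><td>60`</td><td></td><td>Player C</td><td></td><td>Player D</td></tr>
  </tbody>
</table>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
tbody = soup.find("table", {"id": "homeSubs"}).find("tbody")

# Iterating over a Tag walks its direct children, whitespace text included.
children = list(tbody)
text_children = [c for c in children if isinstance(c, NavigableString)]
tag_children = [c for c in children if isinstance(c, Tag)]

# find_all() returns only the matching Tag objects.
rows = tbody.find_all("tr")

print(len(text_children), "text nodes and", len(tag_children), "tags among the children")
print(len(rows), "rows returned by find_all('tr')")
```

The text nodes are the newlines and indentation between the rows. Those are the objects that have no find_all method.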
I made a few small changes. The problem is that you wrote for table in yourstuff.find('sth'), but find() returns only one element, so there is no need to loop:
import uuid

import requests
from bs4 import BeautifulSoup

link = 'https://rugby.statbunker.com/competitions/MatchDetails/World-Cup-2019/Japan-VS-Russia?comp_id=606&match_id=39737&date=20-Sep-2019'
response = requests.get(link)
html_loop = response.content
soup_loop = BeautifulSoup(html_loop, 'html.parser')

# find() returns a single Tag, so there is nothing to loop over here
home_substititions = soup_loop.find('table', {'id': 'homeSubs'})
table = home_substititions.find('tbody')
print(table)

# game and timedelta come from the rest of the asker's script and are not defined here
homeSubstitutionEvents = []

# find_all() returns the <tr> Tags; if there are none, the loop body is simply skipped
for row in table.find_all('tr'):
    cells = row.find_all('td')
    substitutionEvent = {}
    substitutionEvent['uuid'] = uuid.uuid1()
    substitutionEvent['playerIn'] = cells[2].text
    substitutionEvent['playerOut'] = cells[4].text
    minute = int(cells[0].text.split('`')[0])
    if minute:
        substitutionEvent['subTime'] = game['gameTime'] + timedelta.Timedelta(minutes=minute)
    else:
        substitutionEvent['subTime'] = ''
    homeSubstitutionEvents.append(substitutionEvent)
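For readers who want something that runs without the rest of the script, the same idea can be sketched as a small helper. Everything here is an assumption made for illustration: parse_substitutions is a hypothetical function, SAMPLE_HTML is invented markup that only mirrors the column positions used above (0, 2 and 4), the kickoff time is an arbitrary placeholder, and the standard library's datetime.timedelta is swapped in for the timedelta.Timedelta call from the question.

```python
import uuid
from datetime import datetime, timedelta

from bs4 import BeautifulSoup

# Hypothetical markup; only the column positions mirror the question's code.
SAMPLE_HTML = """
<table id="homeSubs">
  <tbody>
    <tr><td>45`</td><td></td><td>Player A</td><td></td><td>Player B</td></tr>
    <tr><td>60`</td><td></td><td>Player C</td><td></td><td>Player D</td></tr>
    <tr><td></td><td></td><td>Player E</td><td></td><td>Player F</td></tr>
  </tbody>
</table>
"""


def parse_substitutions(html, table_id, kickoff):
    """Return one dict per data row of the table with the given id."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", {"id": table_id})
    if table is None:  # find() returns None when nothing matches
        return []

    events = []
    for row in table.find_all("tr"):  # only Tag objects, never bare text
        cells = row.find_all("td")
        if len(cells) < 5:  # skip header or malformed rows
            continue
        minute_text = cells[0].get_text(strip=True).split("`")[0]
        # Leave subTime empty when the minute cell is not a plain number.
        if minute_text.isdigit():
            sub_time = kickoff + timedelta(minutes=int(minute_text))
        else:
            sub_time = ""
        events.append({
            "uuid": uuid.uuid1(),
            "playerIn": cells[2].get_text(strip=True),
            "playerOut": cells[4].get_text(strip=True),
            "subTime": sub_time,
        })
    return events


kickoff = datetime(2019, 9, 20, 12, 0)  # arbitrary placeholder, not the real kickoff
events = parse_substitutions(SAMPLE_HTML, "homeSubs", kickoff)
for event in events:
    print(event["playerIn"], "for", event["playerOut"], "at", event["subTime"])
```

Unlike the original int(...) truthiness check, this sketch treats minute 0 as a valid time and falls back to an empty string only when the cell does not contain digits. Whether that is the right choice depends on the real data.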
The problem is that home_substititions is not a BeautifulSoup object, unlike soup_loop:
print(type(home_substititions))
print(type(soup_loop))
The output will be:
<class 'bs4.element.Tag'>
<class 'bs4.BeautifulSoup'>
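To make your code work, call find and find_all on the soup (or on a Tag, which supports the same methods) rather than on the items produced by iterating over a Tag, since those items can be NavigableString objects.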
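A quick check of the types involved, using a one-line inline table as an assumed stand-in for the real page: as far as the bs4 class hierarchy goes, BeautifulSoup is itself a subclass of Tag, so both objects expose find and find_all.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Tiny inline document (hypothetical) so the check runs without a network call.
soup = BeautifulSoup(
    "<table id='homeSubs'><tbody><tr><td>1</td></tr></tbody></table>",
    "html.parser",
)
table = soup.find("table", {"id": "homeSubs"})

print(type(table).__name__)            # class of the object returned by find()
print(type(soup).__name__)             # class of the parsed document
print(issubclass(BeautifulSoup, Tag))  # the soup inherits Tag's search methods
print(len(table.find_all("tr")))       # find_all() works on the Tag as well
```

So searching from the table Tag is fine. The error only appears once find_all is called on a text node.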