
BeautifulSoup AttributeError: 'NavigableString' object has no attribute 'find_all'

I am trying to scrape data from this URL and I get an error in this part of my scraper (the full code block is below):

if table.find_all('tr'):

Note that I originally built this without the if/elif/else logic, using just find_all('tr'), but it produced the same error:

Traceback (most recent call last):
  File "statbunker.py", line 217, in <module>
    if table.find_all('tr'):        
  File "/opt/miniconda3/envs/ds383/lib/python3.8/site-packages/bs4/element.py", line 921, in __getattr__
    raise AttributeError(
AttributeError: 'NavigableString' object has no attribute 'find_all'

Full code block:

    link = 'https://rugby.statbunker.com/competitions/MatchDetails/World-Cup-2019/Japan-VS-Russia?comp_id=606&match_id=39737&date=20-Sep-2019'
    response = requests.get(link)
    html_loop = response.content
    soup_loop = BeautifulSoup(html_loop, 'html.parser')

    home_substititions = soup_loop.find('table', {'id': 'homeSubs'})
    for table in home_substititions.find('tbody'):
        if table.find_all('tr'):        
            for row in table.find_all('tr'):
                substitutionEvent = {}
                substitutionEvent['uuid'] = uuid.uuid1()
                substitutionEvent['playerIn'] = row.find_all('td')[2].text
                substitutionEvent['playerOut'] = row.find_all('td')[4].text
                if int(row.find_all('td')[0].text.split('`')[0]):
                    substitutionEvent['subTime'] = game['gameTime'] + timedelta.Timedelta(minutes=int(row.find_all('td')[0].text.split('`')[0]))
                else:
                    substitutionEvent['subTime'] = ''
                homeSubstitutionEvents.append(substitutionEvent)
        elif table.find('tr'):
            for row in table.find('tr'):
                substitutionEvent = {}
                substitutionEvent['uuid'] = uuid.uuid1()
                substitutionEvent['playerIn'] = row.find_all('td')[2].text
                substitutionEvent['playerOut'] = row.find_all('td')[4].text
                if int(row.find_all('td')[0].text.split('`')[0]):
                    substitutionEvent['subTime'] = game['gameTime'] + timedelta.Timedelta(minutes=int(row.find_all('td')[0].text.split('`')[0]))
                else:
                    substitutionEvent['subTime'] = ''
                homeSubstitutionEvents.append(substitutionEvent)
        else:
            continue

Here, .find() returns only a single Tag (or NavigableString), not a collection of matches. To iterate over matches, you have to use .find_all():

home_substititions = soup_loop.find_all('table', {'id': 'homeSubs'})
for table in home_substititions:
    # ....
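
To make the difference between the two calls concrete, here is a minimal sketch. The HTML string is a made-up stand-in for the real page (only the table id homeSubs comes from the question), so treat it as an illustration rather than the actual markup.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Hypothetical markup standing in for the real page; only the id is from the question.
html = """
<table id="homeSubs">
  <tbody>
    <tr><td>1</td></tr>
    <tr><td>2</td></tr>
  </tbody>
</table>
"""
soup = BeautifulSoup(html, "html.parser")

# find() returns the first matching Tag (or None when nothing matches).
single = soup.find("table", {"id": "homeSubs"})

# find_all() returns a list-like ResultSet containing every matching Tag.
many = soup.find_all("table", {"id": "homeSubs"})

# Every item in the ResultSet is a Tag, so calling find_all() on it is safe.
row_counts = [len(table.find_all("tr")) for table in many]
print(type(single).__name__, type(many).__name__, row_counts)
```

Because the loop runs over the result of find_all(), each table is a Tag and never a text node.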

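I made a few small changes. The problem is that you wrote for table in yourstuff.find('sth'), but find() returns only one element, so there is no need to loop. Iterating over that single Tag walks its children, and those include NavigableString text nodes, which have no find_all.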

import uuid

import requests
from bs4 import BeautifulSoup

# game, timedelta and homeSubstitutionEvents come from the rest of the asker's script.

link = 'https://rugby.statbunker.com/competitions/MatchDetails/World-Cup-2019/Japan-VS-Russia?comp_id=606&match_id=39737&date=20-Sep-2019'
response = requests.get(link)
html_loop = response.content
soup_loop = BeautifulSoup(html_loop, 'html.parser')

home_substititions = soup_loop.find('table', {'id': 'homeSubs'})
# find() returns a single Tag, so use it directly instead of looping over it.
table = home_substititions.find('tbody')
print(table)
# find_all() returns an empty list when there are no rows, so no if/elif/else is needed.
for row in table.find_all('tr'):
    cells = row.find_all('td')
    substitutionEvent = {}
    substitutionEvent['uuid'] = uuid.uuid1()
    substitutionEvent['playerIn'] = cells[2].text
    substitutionEvent['playerOut'] = cells[4].text
    minute = int(cells[0].text.split('`')[0])
    if minute:
        substitutionEvent['subTime'] = game['gameTime'] + timedelta.Timedelta(minutes=minute)
    else:
        substitutionEvent['subTime'] = ''
    homeSubstitutionEvents.append(substitutionEvent)
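
As a rough illustration of why the original loop failed, the sketch below iterates over a tbody Tag directly and records the type of each child. The markup and player names are invented for the example; only the homeSubs id and the column positions mirror the question.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the player names are placeholders, not real data.
html = """<table id="homeSubs"><tbody>
<tr><td>40`</td><td></td><td>Player A</td><td></td><td>Player B</td></tr>
</tbody></table>"""
soup = BeautifulSoup(html, "html.parser")
tbody = soup.find("table", {"id": "homeSubs"}).find("tbody")

# Iterating a Tag walks its direct children, including whitespace text nodes.
child_types = [type(child).__name__ for child in tbody]
print(child_types)

# Calling find_all() on the Tag itself returns only the matching <tr> Tags.
rows = tbody.find_all("tr")
player_in = rows[0].find_all("td")[2].text
print(len(rows), player_in)
```

The newlines between the tags show up as NavigableString children, and those are the objects that raised the AttributeError in the question.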

The problem is that home_substititions is not a BeautifulSoup object the way soup_loop is:

print(type(home_substititions))
print(type(soup_loop))

The output will be:

<class 'bs4.element.Tag'>
<class 'bs4.BeautifulSoup'>

One way to make your code work is to call find and find_all on the original soup.
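
A possible way to search from the original soup in one step is a CSS selector, sketched below. This swaps in select() rather than the find/find_all calls used elsewhere in the thread, and the markup and player names are assumptions made for the example. Note that Tag objects such as home_substititions also expose find and find_all, so this is one option rather than a requirement.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; only the table id and column layout follow the question.
html = """
<table id="homeSubs">
  <tbody>
    <tr><td>40`</td><td></td><td>Player A</td><td></td><td>Player B</td></tr>
    <tr><td>55`</td><td></td><td>Player C</td><td></td><td>Player D</td></tr>
  </tbody>
</table>
"""
soup = BeautifulSoup(html, "html.parser")

# select() runs on the soup and returns only the <tr> Tags that match the selector.
rows = soup.select("table#homeSubs tbody tr")

# Column 2 holds the incoming player, as in the question's code.
players_in = [row.find_all("td")[2].get_text(strip=True) for row in rows]
print(players_in)
```

If the table is missing from the page, select() returns an empty list, so the loop simply does nothing.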

