
BeautifulSoup - AttributeError: 'NavigableString' object has no attribute 'find_all'

I'm trying to get this script to iterate through an HTML file and print the desired results, but it keeps raising the error below. It works fine with only one "game" in the table, but breaks as soon as there is more than one. I want it to iterate over more than one game/parking ticket, but I can't get past this error.

Traceback (most recent call last):
  File "C:/Users/desktop/Desktop/tabletest.py", line 11, in <module>
    for rows in table.find_all('tr'):
  File "C:\Program Files\Python36\lib\site-packages\bs4\element.py", line 737, in __getattr__
    self.__class__.__name__, attr))
AttributeError: 'NavigableString' object has no attribute 'find_all'

This is my code:

import pandas as pd
from bs4 import BeautifulSoup
import requests
import lxml.html as lh


with open("htmltabletest.html", encoding="utf-8") as f:
    data = f.read()
    soup = BeautifulSoup(data, 'lxml')
    for table in soup.find('table', attrs={'id': 'eventSearchTable'}):
        for rows in table.find_all('tr'):
            cols = table.find_all('td')

            empty = cols[0].get_text()
            eventdate = cols[1].get_text()
            eventname = cols[2].get_text()
            tickslisted = cols[3].get_text()
            pricerange = cols[4].get_text()

            entry = (empty, eventdate, eventname, tickslisted, pricerange)

            print(entry)

This is what's in the HTML file:

<table class="dataTable st-alternateRows" id="eventSearchTable">
<thead>
<tr>
<th id="th-es-rb"><div class="dt-th"> </div></th>
<th id="th-es-ed"><div class="dt-th"><span class="th-divider"> </span>Event date<br/>Time (local)</div></th>
<th id="th-es-en"><div class="dt-th"><span class="th-divider"> </span>Event name<br/>Venue</div></th>
<th id="th-es-ti"><div class="dt-th"><span class="th-divider"> </span>Tickets<br/>listed</div></th>
<th id="th-es-pr"><div class="dt-th es-lastCell"><span class="th-divider"> </span>Price<br/>range</div></th>
</tr>
</thead>
<tbody class="" id="eventSearchTbody"><tr class="even" id="r-se-103577924">
<td class="nowrap"><input class="es-selectedEvent" id="se-103577924-check" name="selectEvent" type="radio"/></td>
<td class="nowrap" id="se-103577924-eventDateTime">Thu, 10/11/2018<br/>8:20 p.m.</td>
<td><div><a class="ellip" href="services/priceanalysis?eventId=103577924&amp;sectionId=0" id="se-103577924-eventName" target="_blank">Philadelphia Eagles at New York Giants</a></div><div id="se-103577924-venue">MetLife Stadium, East Rutherford, NJ</div></td>
<td id="se-103577924-nrTickets">6655</td>
<td class="es-lastCell nowrap" id="se-103577924-priceRange"><span id="se-103577924-minPrice">$134.50</span>  to<br/><span id="se-103577924-maxPrice">$2,222.50</span></td>
</tr><tr class="odd" id="r-se-103577925">
<td class="nowrap"><input class="es-selectedEvent" id="se-103577925-check" name="selectEvent" type="radio"/></td>
<td class="nowrap" id="se-103577925-eventDateTime">Thu, 10/11/2018<br/>8:21 p.m.</td>
<td><div><a class="ellip" href="services/priceanalysis?eventId=103577925&amp;sectionId=0" id="se-103577925-eventName" target="_blank">PARKING PASSES ONLY Philadelphia Eagles at New York Giants</a></div><div id="se-103577925-venue">MetLife Stadium Parking Lots, East Rutherford, NJ</div></td>
<td id="se-103577925-nrTickets">929</td>
<td class="es-lastCell nowrap" id="se-103577925-priceRange"><span id="se-103577925-minPrice">$20.39</span>  to<br/><span id="se-103577925-maxPrice">$3,602.50</span></td>
</tr></tbody>
</table>

The error lies in the way you iterate over the table, more specifically in this line:

for table in soup.find('table', attrs={'id': 'eventSearchTable'}):

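find returns a single Tag, and looping over a Tag iterates over its children. Those children include the whitespace between elements as NavigableString objects, which have no find_all method, hence the error. You should use find_all if you want to iterate over the matching tables. Compare the types returned by the two methods: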

print(type(soup.find('table', attrs={'id': 'eventSearchTable'})))
# <class 'bs4.element.Tag'>
print(type(soup.find_all('table', attrs={'id': 'eventSearchTable'})))
# <class 'bs4.element.ResultSet'>

In the first case you get a single table; in the second, a set of tables (containing only one in your case), each of type bs4.element.Tag.
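To see where the NavigableString in the traceback comes from, it can help to list what a loop over the result of find actually yields. The sketch below is only an illustration: the HTML is a trimmed stand-in for the file in the question, and it uses the built-in "html.parser" instead of lxml so that it runs without extra dependencies.

```python
from bs4 import BeautifulSoup

# Trimmed stand-in for the question's table (an assumption, for brevity).
html = """<table id="eventSearchTable">
<thead><tr><th>Event date</th></tr></thead>
<tbody><tr><td>Thu, 10/11/2018</td></tr></tbody>
</table>"""

# "html.parser" is used here instead of "lxml" only to avoid the extra dependency.
soup = BeautifulSoup(html, "html.parser")
table = soup.find("table", attrs={"id": "eventSearchTable"})

# Iterating over a Tag walks its direct children, text nodes included.
child_types = [type(child).__name__ for child in table]
print(child_types)
```

The newlines between <thead>, <tbody> and the closing tag show up as NavigableString children next to the Tag children, and calling find_all on one of those strings is what raises the AttributeError.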

So you have two options. Either drop the outer for loop and assign the table directly:

table = soup.find('table', attrs={'id': 'eventSearchTable'})

or keep the loop and iterate over the result of find_all:

for table in soup.find_all("table", {"id":"eventSearchTable"}):
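Putting the first option together, a minimal sketch of the whole loop could look like the following. Two things in it go beyond the fix above and are my own reading of the question's code. First, the cells are taken from the current row (row.find_all) rather than from the whole table, because table.find_all('td') would return the first game's cells on every pass. Second, rows without enough <td> cells (the header row) are skipped. The parse_events helper, the inline HTML and the use of "html.parser" are assumptions made for a self-contained run.

```python
from bs4 import BeautifulSoup

# Trimmed stand-in for htmltabletest.html (an assumption for a self-contained run).
html = """
<table id="eventSearchTable">
<thead><tr><th> </th><th>Event date</th><th>Event name</th><th>Tickets</th><th>Price</th></tr></thead>
<tbody>
<tr><td><input type="radio"/></td><td>Thu, 10/11/2018<br/>8:20 p.m.</td>
<td>Philadelphia Eagles at New York Giants</td><td>6655</td><td>$134.50 to $2,222.50</td></tr>
<tr><td><input type="radio"/></td><td>Thu, 10/11/2018<br/>8:21 p.m.</td>
<td>PARKING PASSES ONLY Philadelphia Eagles at New York Giants</td><td>929</td><td>$20.39 to $3,602.50</td></tr>
</tbody>
</table>
"""


def parse_events(html_text):
    """Hypothetical helper: return one tuple of cell texts per data row."""
    soup = BeautifulSoup(html_text, "html.parser")
    table = soup.find("table", attrs={"id": "eventSearchTable"})
    entries = []
    if table is None:
        return entries  # no such table in the document
    for row in table.find_all("tr"):
        cols = row.find_all("td")  # cells of this row only
        if len(cols) < 5:
            continue  # the header row uses <th>, so it has no <td> cells
        # Join text fragments split by <br/> with a space and trim whitespace.
        entries.append(tuple(col.get_text(" ", strip=True) for col in cols[:5]))
    return entries


entries = parse_events(html)
for entry in entries:
    print(entry)
```

With the real file, the same function can be called on the string read from htmltabletest.html.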
