
find_all on span tag in Beautiful Soup yields AttributeError: ResultSet object has no attribute 'get_text'

Warning: this is only my second attempt at Python code so I may be making errors that will cause distress to a professional:

I'd like to get a list of cities using 'addressLocality' from the set of results in soup_r:

import requests
from bs4 import BeautifulSoup
URL = 'https://www.tjhughes.co.uk/map'
page = requests.get(URL, verify=False)
soup_r = BeautifulSoup(page.text, 'html.parser')

This is the type of result I'd like, with just the name of the city (in this case, Bradford):

single_span = soup_r.find('span',itemprop = 'addressLocality').get_text()

I'd like to return the full list of results in the same format as single_span (i.e. by isolating the city name), but the following code gives me the error "AttributeError: ResultSet object has no attribute 'get_text'":

spans_fail = soup_r.find_all('span',itemprop = 'addressLocality').get_text()

The nearest I can get is by dropping the get_text():

spans = soup_r.find_all('span',itemprop = 'addressLocality')

...thus returning the results in one bundle:

[<span itemprop="addressLocality">Bradford</span>, <span itemprop="addressLocality">Birkenhead</span>, <span itemprop="addressLocality">Bootle</span>, <span itemprop="addressLocality">Bury</span>,
...
<span itemprop="addressLocality">Sheffield</span>, <span itemprop="addressLocality">St Helens</span>, <span itemprop="addressLocality">Widnes</span>]

Assuming this is the best I can do, I still get tied in knots when I try to re-arrange the results.

For instance, this just returns Bradford 52 times, which baffles me: there are only 26 cities in the original list, so I don't know how I'm doubling up, let alone how to access the other 25:

cities = []
for check in soup:
    check = soup.find('span',itemprop = 'addressLocality').text
    cities.append(check)

I was looking for an elegantly simple solution, and I appreciate that I might need a workaround, but I can't see how else to approach this and so any input is appreciated.

When you get down to a list of single elements, sometimes you have to do string chopping.

spans = soup_r.find_all('span',itemprop = 'addressLocality')

# [<span itemprop="addressLocality">Bradford</span>, <span 

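cities = []
for span in spans:
    html = str(span)                     # a Tag is not a string, so convert it before slicing
    left_angle = html.find('>') + 1      # index just past the end of the opening tag
    sec_rangle = html.find('<', 1)       # index of the '<' that starts the closing tag
    city = html[left_angle:sec_rangle]   # the text between the two tags
    print(city)
    cities.append(city)
print(cities)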

You can use a list comprehension to obtain your list of cities.

For example:

import requests
from bs4 import BeautifulSoup
URL = 'https://www.tjhughes.co.uk/map'
page = requests.get(URL, verify=False)
soup_r = BeautifulSoup(page.text, 'html.parser')

cities = [span.get_text() for span in soup_r.select('span[itemprop="addressLocality"]')]
print(cities)

Prints:

['Bradford', 'Birkenhead', 'Bootle', 'Bury', 'Chelmsford', 'Chesterfield', 'Glasgow', 'Cumbernauld', 'London', 'Coventry', 'Dundee', 'Durham', 'East Kilbride', 'Glasgow', 'Harlow', 'Hartlepool', 'Liverpool', 'Maidstone', 'Middlesbrough', 'Newcastle upon Tyne', 'Nuneaton', 'Oldham', 'Preston', 'Sheffield', 'St Helens', 'Widnes']
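
As for why the two attempts in the question failed, here is a minimal offline sketch. SAMPLE_HTML is a made-up stand-in for the store page (an assumption, not the real markup), so the numbers differ from the live site. find() returns a single Tag, while find_all() returns a ResultSet, which is a list of Tags with no get_text() of its own. The loop in the question called find() on every pass, so each pass fetched the first match again. Iterating over a soup object walks its direct child nodes rather than the matching spans, which is probably where the count of 52 came from.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the store page, so the example runs offline.
SAMPLE_HTML = """
<div><span itemprop="addressLocality">Bradford</span></div>
<div><span itemprop="addressLocality">Birkenhead</span></div>
<div><span itemprop="addressLocality">Bootle</span></div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# find() returns one Tag (the first match), which has get_text().
first = soup.find("span", itemprop="addressLocality")
first_city = first.get_text()

# find_all() returns a ResultSet, i.e. a list of Tags. The list itself has
# no get_text(), so call it on each element instead.
spans = soup.find_all("span", itemprop="addressLocality")
cities = [span.get_text() for span in spans]

# The loop in the question called soup.find(...) on every pass, so each
# iteration fetched the first match again, whatever the loop variable held.
repeated = [soup.find("span", itemprop="addressLocality").get_text() for _ in spans]

print(first_city)  # Bradford
print(cities)      # ['Bradford', 'Birkenhead', 'Bootle']
print(repeated)    # ['Bradford', 'Bradford', 'Bradford']
```

Looping over the result of find_all() (or select()) and reading each element's text is the fix in both cases.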

