Warning: this is only my second attempt at Python code so I may be making errors that will cause distress to a professional:
I'd like to get a list of cities using 'addressLocality' from the set of results in soup_r:
import requests
from bs4 import BeautifulSoup
URL = 'https://www.tjhughes.co.uk/map'
page = requests.get(URL, verify=False)
soup_r = BeautifulSoup(page.text, 'html.parser')
This is the type of result I'd like, with just the name of the city (in this case, Bradford):
single_span = soup_r.find('span',itemprop = 'addressLocality').get_text()
I'd like to be able to return the full list of results in the same format as single_span (i.e. by isolating the city name), but the following code gives me the error "AttributeError: ResultSet object has no attribute 'get_text'":
spans_fail = soup_r.find_all('span',itemprop = 'addressLocality').get_text()
The nearest I can get is by dropping the get_text():
spans = soup_r.find_all('span',itemprop = 'addressLocality')
...thus returning the results in one bundle:
[<span itemprop="addressLocality">Bradford</span>, <span itemprop="addressLocality">Birkenhead</span>, <span itemprop="addressLocality">Bootle</span>, <span itemprop="addressLocality">Bury</span>,
...
<span itemprop="addressLocality">Sheffield</span>, <span itemprop="addressLocality">St Helens</span>, <span itemprop="addressLocality">Widnes</span>]
Assuming this is the best I can do, I still get tied in knots when I try to re-arrange the results.
For instance, this just returns Bradford 52 times. That baffles me because there are only 26 cities in the original list, so I don't know how I'm doubling up, let alone how to access the other 25:
cities = []
for check in soup:
    check = soup.find('span',itemprop = 'addressLocality').text
    cities.append(check)
I was looking for an elegantly simple solution, and I appreciate that I might need a workaround, but I can't see how else to approach this and so any input is appreciated.
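When you get down to a list of single elements, sometimes you have to do string chopping. Convert each element to a string first, then slice between the first '>' and the next '<':
spans = soup_r.find_all('span',itemprop = 'addressLocality')
# [<span itemprop="addressLocality">Bradford</span>, <span
cities = []
for span in spans:
    markup = str(span)                 # work on the tag's markup as a plain string
    left_angle = markup.find('>') + 1  # index just past the opening tag
    sec_rangle = markup.find('<', 1)   # index of the '<' that starts the closing tag
    city = markup[left_angle:sec_rangle]
    print(city)
    cities.append(city)
print(cities)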
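Because each item in the result of find_all() is an individual tag, it may be simpler to call get_text() on each item rather than slicing strings. The following is a minimal sketch of that idea. The inline HTML is a made-up sample standing in for the live page, so the snippet runs without a network request; the streetAddress span is only there to show that other spans are skipped.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the live page.
html = """
<div><span itemprop="addressLocality">Bradford</span></div>
<div><span itemprop="streetAddress">1 High Street</span></div>
<div><span itemprop="addressLocality">Birkenhead</span></div>
"""

soup_r = BeautifulSoup(html, 'html.parser')
spans = soup_r.find_all('span', itemprop='addressLocality')

cities = []
for span in spans:
    # get_text() exists on each tag, not on the ResultSet that holds them
    cities.append(span.get_text(strip=True))

print(cities)
```

The AttributeError in the question comes from calling get_text() on the whole ResultSet; calling it per element, as here, avoids that.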
You can use a list comprehension to obtain your list of cities.
For example:
import requests
from bs4 import BeautifulSoup
URL = 'https://www.tjhughes.co.uk/map'
page = requests.get(URL, verify=False)
soup_r = BeautifulSoup(page.text, 'html.parser')
cities = [span.get_text() for span in soup_r.select('span[itemprop="addressLocality"]')]
print(cities)
Prints:
['Bradford', 'Birkenhead', 'Bootle', 'Bury', 'Chelmsford', 'Chesterfield', 'Glasgow', 'Cumbernauld', 'London', 'Coventry', 'Dundee', 'Durham', 'East Kilbride', 'Glasgow', 'Harlow', 'Hartlepool', 'Liverpool', 'Maidstone', 'Middlesbrough', 'Newcastle upon Tyne', 'Nuneaton', 'Oldham', 'Preston', 'Sheffield', 'St Helens', 'Widnes']
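As for why the loop in the question kept returning Bradford: iterating over a soup object walks its top-level child nodes, and find() returns the first match every time it is called, so the loop variable never influences the result. The sketch below tries to reproduce that on a small, made-up HTML string (an assumption, not the real page); the repeat count there simply equals the number of top-level nodes, which is probably what happened with the larger count in the question.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup with three top-level nodes.
html = (
    '<p>Stores</p>'
    '<span itemprop="addressLocality">Bradford</span>'
    '<span itemprop="addressLocality">Bury</span>'
)
soup = BeautifulSoup(html, 'html.parser')

# The loop variable is ignored; find() returns the first match on every pass.
repeated = [soup.find('span', itemprop='addressLocality').text for _ in soup]
print(repeated)

# Iterating over the find_all() results visits each matching span once.
fixed = [span.text for span in soup.find_all('span', itemprop='addressLocality')]
print(fixed)
```

The first list holds one 'Bradford' per top-level node, while the second holds each city once.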