Parsing HTML span with Beautiful Soup

I'm trying to figure out how to use Beautiful Soup and am having a hard time.

My HTML page has several elements that look like this:

<a class="propertyName" href="/preferredguest/property/overview/index.html?propertyID=1023"><span>The Westin Peachtree Plaza, Atlanta
</span></a>

<a class="propertyName" href="/preferredguest/property/overview/index.html?propertyID=1144"><span>Sheraton Atlanta Hotel
</span></a>

I'm trying to create an array with the hotel names. Here is my code so far:

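import requests
from bs4 import BeautifulSoup

url = "removed"
response = requests.get(url)
# Name the parser explicitly; omitting it makes bs4 guess one and emit a warning.
soup = BeautifulSoup(response.text, "html.parser")

# Every <a class="propertyName"> element; each wraps a <span> holding the hotel name.
hotels = soup.find_all('a', class_="propertyName")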

But I cannot figure out how to iterate over the hotels array to display the span element.

Your hotel names are inside a span. One way to get them is the .select() method, which takes a CSS selector:

>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup('''<a class="propertyName" href="/preferredguest/property/overview/index.html?propertyID=1023"><span>The Westin Peachtree Plaza, Atlanta
... </span></a>
... 
... <a class="propertyName" href="/preferredguest/property/overview/index.html?propertyID=1144"><span>Sheraton Atlanta Hotel
... </span></a>
... ''', 'lxml')
>>> [element.get_text(strip=True) for element in soup.select('a.propertyName > span')]
['The Westin Peachtree Plaza, Atlanta', 'Sheraton Atlanta Hotel']
>>> 
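For use outside the interactive prompt, the same selector can be wrapped in a small function. This is only a sketch: the function name `names_via_select` and the `HTML` constant are made up for illustration, and it assumes the built-in `html.parser` is acceptable in place of `lxml`, so that nothing beyond bs4 itself needs to be installed.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question.
HTML = """
<a class="propertyName" href="/preferredguest/property/overview/index.html?propertyID=1023"><span>The Westin Peachtree Plaza, Atlanta
</span></a>
<a class="propertyName" href="/preferredguest/property/overview/index.html?propertyID=1144"><span>Sheraton Atlanta Hotel
</span></a>
"""


def names_via_select(html):
    """Return the text of every <span> that is a direct child of a.propertyName."""
    # "html.parser" ships with Python; swap in "lxml" if it is installed.
    soup = BeautifulSoup(html, "html.parser")
    # "a.propertyName > span": ">" matches direct children only.
    spans = soup.select("a.propertyName > span")
    # strip=True drops the trailing newline that sits inside each span.
    return [span.get_text(strip=True) for span in spans]


print(names_via_select(HTML))
```

If the span could be nested deeper inside the link, the descendant selector `a.propertyName span` (without `>`) would likely be the safer choice.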

Or loop over the hotels list from your code:

>>> names = []
>>> for el in hotels:
...     names.append(el.find('span').get_text(strip=True))
... 
>>> names
['The Westin Peachtree Plaza, Atlanta', 'Sheraton Atlanta Hotel']
>>> 
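One thing to watch with the loop: `el.find('span')` returns `None` when a link has no span, and calling `.get_text()` on that raises `AttributeError`. A possible defensive version is sketched below. The function name `extract_hotel_names`, the `sample` markup, and its `href="#"` values are hypothetical and exist only to exercise the guard.

```python
from bs4 import BeautifulSoup


def extract_hotel_names(html):
    """Collect hotel names from <a class="propertyName"> links, skipping links without a <span>."""
    soup = BeautifulSoup(html, "html.parser")
    names = []
    for link in soup.find_all("a", class_="propertyName"):
        span = link.find("span")
        if span is None:
            # No <span> inside this link, so there is no name to read.
            continue
        names.append(span.get_text(strip=True))
    return names


# Hypothetical markup: the second link deliberately has no <span>.
sample = (
    '<a class="propertyName" href="#"><span>Sheraton Atlanta Hotel\n</span></a>'
    '<a class="propertyName" href="#">no span here</a>'
)
print(extract_hotel_names(sample))
```

Whether to skip such links silently or report them depends on how much you trust the page structure; skipping is just the simplest choice for a sketch.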
