Put all dates from HTML tag <span> into a list using BeautifulSoup
Here is my HTML file:
[<small class="breadcrumb x-normal">
<span><i data-icon="clock"></i>Today 10:52</span>
</small>]
[<small class="breadcrumb x-normal">
<span><i data-icon="clock"></i>April 11</span>
</small>]
[<small class="breadcrumb x-normal">
<span><i data-icon="clock"></i>April 5</span>
</small>]
[<small class="breadcrumb x-normal">
<span><i data-icon="clock"></i>February 29</span>
</small>]
How do I put all these dates into a list?
Here is my code:
from bs4 import BeautifulSoup
import lxml

def get_dates(html):
    soup = BeautifulSoup(html, 'lxml')
    dates = None  # placeholder: I don't know what to call on soup here
    print(dates)

get_dates(html.text)
Example
from bs4 import BeautifulSoup
html = '<small class="breadcrumb x-normal"><span><i data-icon="clock"></i>Today 10:52</span></small>' \
'<small class="breadcrumb x-normal"><span><i data-icon="clock"></i>April 11</span></small>' \
'<small class="breadcrumb x-normal"><span><i data-icon="clock"></i>April 5</span></small>' \
'<small class="breadcrumb x-normal"><span><i data-icon="clock"></i>February 29</span></small>'
soup = BeautifulSoup(html, features="lxml")
date_list = []
dates = soup.find_all('small', {'class':'breadcrumb x-normal'})
for date in dates:
    print(date.text)
    date_list.append(date.text)
print(date_list)
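As a possible variant of the same idea, the <span> elements can be selected directly with a CSS selector and returned from a function, which is closer to the get_dates(html) shape in the question. This is only a sketch: the choice of 'html.parser' (to avoid depending on lxml), the shortened sample HTML, and the use of get_text(strip=True) are assumptions, not part of the original answer.

```python
from bs4 import BeautifulSoup


def get_dates(html):
    """Return the text of each <span> directly inside <small class="breadcrumb x-normal">."""
    soup = BeautifulSoup(html, "html.parser")
    # CSS selector: <small> carrying both classes, then its direct <span> child
    spans = soup.select("small.breadcrumb.x-normal > span")
    # strip=True trims surrounding whitespace from each extracted string
    return [span.get_text(strip=True) for span in spans]


# Shortened sample input (assumption: same markup as in the question)
html = (
    '<small class="breadcrumb x-normal"><span><i data-icon="clock"></i>Today 10:52</span></small>'
    '<small class="breadcrumb x-normal"><span><i data-icon="clock"></i>April 11</span></small>'
)

dates = get_dates(html)
print(dates)
```

Returning the list instead of printing inside the function lets the caller decide what to do with the dates.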
from bs4 import BeautifulSoup
html = '<small class="breadcrumb x-normal"><span><i data-icon="clock"></i>Today 10:52</span></small>' \
'<small class="breadcrumb x-normal"><span><i data-icon="clock"></i>April 11</span></small>' \
'<small class="breadcrumb x-normal"><span><i data-icon="clock"></i>April 5</span></small>' \
'<small class="breadcrumb x-normal"><span><i data-icon="clock"></i>February 29</span></small>'
soup = BeautifulSoup(html, 'html.parser')
# Each clock <i> tag is empty, so its next_element is the date text that follows it
data = [item.next_element for item in soup.find_all(
    "i", {'data-icon': 'clock'})]
print(data)
Output:
['Today 10:52', 'April 11', 'April 5', 'February 29']
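The next_element approach works here because every clock <i> tag is empty and is followed immediately by text. A slightly more defensive sketch, assuming some icons might not be followed by any text, could check the sibling node before using it. The extra empty <small> in the sample and the variable names are made up for illustration.

```python
from bs4 import BeautifulSoup, NavigableString

# Sample input; the last entry has an icon with no date text (hypothetical case)
html = (
    '<small class="breadcrumb x-normal"><span><i data-icon="clock"></i>April 5</span></small>'
    '<small class="breadcrumb x-normal"><span><i data-icon="clock"></i>February 29</span></small>'
    '<small class="breadcrumb x-normal"><span><i data-icon="clock"></i></span></small>'
)

soup = BeautifulSoup(html, "html.parser")

dates = []
for icon in soup.find_all("i", {"data-icon": "clock"}):
    node = icon.next_sibling
    # Keep only real, non-blank text nodes; skip None or nested tags
    if isinstance(node, NavigableString) and node.strip():
        # Convert to a plain str so the list does not hold references into the parse tree
        dates.append(str(node).strip())

print(dates)
```

Converting each NavigableString to str is optional, but it keeps the resulting list independent of the soup object.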