How to extract the strong element inside a div tag

Question

I am new to web scraping and I am using Python to scrape data. Can someone help me extract the data from the following:

<div class="dept"><strong>LENGTH:</strong> 15 credits</div>

My output should be LENGTH: 15 credits

Here is my code:

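from urllib.request import urlopen
from bs4 import BeautifulSoup

# Fetch and parse the page (URL given at the end of the question)
url = "http://www.mastersindatascience.org/specialties/business-analytics/"
bsObj = BeautifulSoup(urlopen(url), "html.parser")

# Prints every <strong> label together with the text that follows it
length = bsObj.find_all("strong")
for leng in length:
    print(leng.text, leng.next_sibling)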

Output:

DELIVERY:  Campus
LENGTH:  2 years
OFFERED BY:  Olin Business School

But I only want the LENGTH entries.

Website: http://www.mastersindatascience.org/specialties/business-analytics/

Answer 1

You should tweak your code slightly so that it locates the strong element by its text:

soup.find("strong", text="LENGTH:").next_sibling

Or, for multiple lengths:

for length in soup.find_all("strong", text="LENGTH:"):
    print(length.next_sibling.strip())

Demo:

>>> import requests
>>> from bs4 import BeautifulSoup
>>>
>>> url = "http://www.mastersindatascience.org/specialties/business-analytics/"
>>> response = requests.get(url)
>>> soup = BeautifulSoup(response.content, "html.parser")
>>> for length in soup.find_all("strong", text="LENGTH:"):
...     print(length.next_sibling.strip())
... 
33 credit hours
15 months
48 Credits
...
12 months
1 year
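For anyone who wants to try this without hitting the live site, here is a minimal offline sketch of the same idea. The sample markup and the helper name extract_lengths are assumptions made for illustration, not taken from the real page. It uses string=, which Beautiful Soup 4.4.0 and later document as the preferred name for the text= argument, and a regular expression so that stray whitespace inside the strong tag does not break the match.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical sample markup modelled on the snippet in the question.
html = """
<div class="dept"><strong>DELIVERY:</strong> Campus</div>
<div class="dept"><strong>LENGTH:</strong> 15 credits</div>
<div class="dept"><strong> LENGTH: </strong> 2 years</div>
"""

soup = BeautifulSoup(html, "html.parser")


def extract_lengths(soup):
    """Return the text that follows each <strong>LENGTH:</strong> label."""
    results = []
    # The regex tolerates whitespace around the label text.
    pattern = re.compile(r"^\s*LENGTH:\s*$")
    for strong in soup.find_all("strong", string=pattern):
        sibling = strong.next_sibling
        # next_sibling may be None or another tag; keep only text nodes.
        if isinstance(sibling, str):
            results.append(sibling.strip())
    return results


print(extract_lengths(soup))
```

The isinstance check is there because next_sibling is None when the strong tag is the last child of its parent, in which case calling .strip() on it directly would raise an AttributeError.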

Answer 2

If anyone is still looking for this, here is an example:

age = soup.find('div', class_='item-birthday').find('strong').get_text()

This finds the div first and then gets the text of the strong element inside it.
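Applied to the markup from the question, the same div-first approach might look like the sketch below. It assumes the div keeps the class "dept" shown in the question; the variable names are illustrative only.

```python
from bs4 import BeautifulSoup

# The snippet from the question, used here as offline sample input.
html = '<div class="dept"><strong>LENGTH:</strong> 15 credits</div>'
soup = BeautifulSoup(html, "html.parser")

# Locate the div first, then look for the strong element inside it.
div = soup.find("div", class_="dept")

# Text of the strong element only.
label = div.find("strong").get_text(strip=True)

# Text of the whole div: each text node is stripped, then joined with a space.
full_line = div.get_text(" ", strip=True)

print(label)
print(full_line)
```

Note that find returns None when nothing matches, so on a real page it is safer to check div before chaining further calls on it.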

2 answers

Answer 1
5 2016-08-21 22:46:29

Answer 2
-1 2019-10-22 18:07:03
