How can I extract data from an HTML tag using Python?
I want to extract the translation of a word from an online dictionary. For example, the HTML for 'car':
<ol class="sense_list level_1">
<li class="sense_list_item level_1" value="1"><span class="def">any vehicle on wheels</span></li>
How can I extract "any vehicle on wheels" in Python with BeautifulSoup or any other module?
There are multiple ways to reach the desired element.
Probably the simplest would be to find it by class:
soup.find('span', class_='def').text
or, with a CSS selector:
soup.select('span.def')[0].text
or, additionally checking the parent elements:
soup.select('ol.level_1 > li.level_1 > span.def')[0].text
or:
soup.select('ol.level_1 > li[value="1"] > span.def')[0].text
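For reference, here is a minimal self-contained sketch that runs the lookups above against the snippet from the question. It assumes the `bs4` package is installed and uses Python's built-in `html.parser`. The closing `</ol>` is added here because the question's snippet omits it, and the variable names are only illustrative. `select_one(...)` is used in place of `select(...)[0]`; it returns the first match, or `None` when nothing matches.

```python
from bs4 import BeautifulSoup

# Snippet from the question; the closing </ol> is an assumption added here.
html = """
<ol class="sense_list level_1">
<li class="sense_list_item level_1" value="1"><span class="def">any vehicle on wheels</span></li>
</ol>
"""

# Name the parser explicitly so the result does not depend on what is installed.
soup = BeautifulSoup(html, "html.parser")

# 1. Find the first <span> whose class list contains "def".
by_class = soup.find("span", class_="def").text

# 2. The same lookup as a CSS selector.
by_selector = soup.select_one("span.def").text

# 3. Also require the parent chain ol.level_1 > li.level_1.
by_parents = soup.select_one("ol.level_1 > li.level_1 > span.def").text

# 4. Match the <li> by its value attribute (quoted, since it starts with a digit).
by_attribute = soup.select_one('ol.level_1 > li[value="1"] > span.def').text

results = [by_class, by_selector, by_parents, by_attribute]
print(results)
```

All four lookups should return the same string for this snippet. The stricter selectors are mainly useful when the real page contains other `span.def` elements that you do not want.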
I solved it with BeautifulSoup:
import bs4
soup = bs4.BeautifulSoup(html, 'html.parser')  # html holds the page source; name the parser explicitly
q1 = soup.find('li', class_="sense_list_item level_1", value='1').text  # text of the matching <li>
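The question also asks about other modules. As a rough sketch, the same text can be pulled out with only the standard library's `html.parser`, assuming the definition always sits in a `<span class="def">`. The class name `DefExtractor` and the variable `sample` are hypothetical names chosen for this illustration.

```python
from html.parser import HTMLParser


class DefExtractor(HTMLParser):
    """Collects the text of every <span> whose class list contains "def"."""

    def __init__(self):
        super().__init__()
        self.depth = 0         # > 0 while inside a matching span
        self.definitions = []  # one string per matching span

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth or "def" in classes:
            if not self.depth:
                self.definitions.append("")  # start a new definition
            self.depth += 1                  # track nested spans

    def handle_endtag(self, tag):
        if tag == "span" and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.definitions[-1] += data


# Sample markup based on the question, with the closing </ol> added.
sample = (
    '<ol class="sense_list level_1">'
    '<li class="sense_list_item level_1" value="1">'
    '<span class="def">any vehicle on wheels</span></li></ol>'
)

parser = DefExtractor()
parser.feed(sample)
parser.close()
print(parser.definitions)
```

This avoids a third-party dependency, but it is more code and less forgiving than BeautifulSoup, so it is probably only worth it when installing packages is not an option.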