Remove selected tag in an element with BeautifulSoup
In a page, we have several h1's. In the first h1, I want to remove the tag with class read-time. Here is my attempt at it. However, the tag is not being deleted. Where am I going wrong?
h1s = main.select('h1')
print("BEFORE: main.select('h1')", main.select('h1'))
real_h1 = h1s[0]
if real_h1.select('.read-time') is not None:
    real_h1.select('.read-time').clear()
print("AFTER: main.select('h1')", main.select('h1'))
Log:
BEFORE: main.select('h1') [<h1>Introduction<span class="read-time"><span class="minutes"></span> min read</span></h1>, <h1 id="before-you-begin">Before You Begin</h1>]
AFTER: main.select('h1') [<h1>Introduction<span class="read-time"><span class="minutes"></span> min read</span></h1>, <h1 id="before-you-begin">Before You Begin</h1>]
Use decompose() to delete.
from bs4 import BeautifulSoup

html='''<h1>Introduction<span class="read-time"><span class="minutes"></span> min read</span></h1>, <h1 id="before-you-begin">Before You Begin</h1>]'''
main=BeautifulSoup(html,'html.parser')
h1s = main.select('h1')
print("BEFORE: main.select('h1')", main.select('h1'))
real_h1 = h1s[0]
# select() returns a list, so this check is always true, even with no match
if real_h1.select('.read-time') is not None:
    # decompose() on real_h1 removes the whole first <h1>, not only the span
    real_h1.decompose()
print("AFTER: main.select('h1')", main.select('h1'))
Output:
BEFORE: main.select('h1') [<h1>Introduction<span class="read-time"><span class="minutes"></span> min read</span></h1>, <h1 id="before-you-begin">Before You Begin</h1>]
AFTER: main.select('h1') [<h1 id="before-you-begin">Before You Begin</h1>]
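As a side note on why the original `.clear()` attempt left the page unchanged, here is a minimal sketch, assuming a cut-down version of the HTML from the question. `select()` hands back a list-like result, so `is not None` is true even when nothing matches, and calling `.clear()` on that result only empties the list. The class name `does-not-exist` is made up for illustration.

```python
from bs4 import BeautifulSoup

# Cut-down version of the markup from the question (assumption for illustration)
soup = BeautifulSoup(
    '<h1>Introduction<span class="read-time"> min read</span></h1>',
    "html.parser",
)
h1 = soup.select_one("h1")

matches = h1.select(".read-time")
no_matches = h1.select(".does-not-exist")  # hypothetical class with no matches

# An empty result is still a list, never None
print(isinstance(no_matches, list), len(no_matches))

# This is list.clear(): it empties the result list, not the tags in the tree
matches.clear()
after_clear = str(h1)
print(after_clear)  # the read-time span is still inside the <h1>
```

To change the document itself, call a method on each matched tag rather than on the list that `select()` returns.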
.select() returns a list. Iterate through the list and decompose each element, as KunduK suggested:
h1s = main.select('h1')
print("BEFORE: main.select('h1')", main.select('h1'))
real_h1 = h1s[0]
read_times = real_h1.select(".read-time")
for span in read_times:
    span.decompose()
print("AFTER: main.select('h1')", main.select('h1'))
BEFORE: main.select('h1') [<h1>Introduction<span class="read-time"><span class="minutes"></span> min read</span></h1>, <h1 id="before-you-begin">Before You Begin</h1>]
AFTER: main.select('h1') [<h1>Introduction</h1>, <h1 id="before-you-begin">Before You Begin</h1>]
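If only the first matching span needs to go, `select_one()` may be a convenient alternative, since it returns a single tag or `None` and so makes the `None` check meaningful. The following is a sketch under the assumption that the page looks like the HTML in the log above; the variable names are illustrative.

```python
from bs4 import BeautifulSoup

# Same two headings as in the log above (assumed input)
html = (
    '<h1>Introduction<span class="read-time">'
    '<span class="minutes"></span> min read</span></h1>'
    '<h1 id="before-you-begin">Before You Begin</h1>'
)
main = BeautifulSoup(html, "html.parser")

# select_one() returns the first match, or None when nothing matches
first_h1 = main.select_one("h1")
read_time = first_h1.select_one(".read-time") if first_h1 is not None else None

if read_time is not None:
    # Removes the span and everything nested inside it from the tree
    read_time.decompose()

result = [str(h) for h in main.select("h1")]
print(result)
# ['<h1>Introduction</h1>', '<h1 id="before-you-begin">Before You Begin</h1>']
```

The loop version above is still the one to use when the heading can contain more than one `.read-time` element.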