Remove portion of html (tag) keeping style - python
I would like to remove a portion of an HTML document that contains a specific string before saving it. The tag contains a person's name, and I would like to remove the entire tag so that the document is anonymous.
The HTML is:
<div id="top-card" data-li-template="top_card">...</div>
and all its children.
I explored using BeautifulSoup but could not find a solution.
Is there a way to remove that entire portion of the HTML while keeping the styling intact?
Thanks!
You can use .extract() to remove elements with BeautifulSoup.
Assuming you want to remove the div whose id is "top-card":
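>>> from bs4 import BeautifulSoup
>>> html = """
... <div id="top-card" data-li-template="top_card"><div>test</div></div>
... <div>test</div> <div id="foo">blah</div>"""
>>> # The <html><body> wrapper in the output below comes from the lxml parser
>>> soup = BeautifulSoup(html, "lxml")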
>>> [div.extract() for div in soup("div",id="top-card")]
[<div data-li-template="top_card" id="top-card"><div>test</div></div>]
>>> soup
<html><body>
<div>test</div> <div id="foo">blah</div></body></html>
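The session above returns the removed element. If the removed element is not needed afterwards, .decompose() is an alternative that removes the tag and its children from the tree and destroys them. The following is a minimal sketch, not the original answer's code: the sample document, its <style> rule, and the name "Jane Doe" are made up for illustration, and the stdlib-backed "html.parser" is assumed so that no extra parser has to be installed.

```python
from bs4 import BeautifulSoup

# Hypothetical sample document: a stylesheet plus the div to anonymize.
html = """<html><head><style>.card { color: red; }</style></head>
<body>
<div id="top-card" data-li-template="top_card"><div>Jane Doe</div></div>
<div id="foo">blah</div>
</body></html>"""

soup = BeautifulSoup(html, "html.parser")

# find() returns the first matching tag, or None if nothing matches.
card = soup.find("div", id="top-card")
if card is not None:
    # Remove the tag and all of its children from the tree.
    card.decompose()

# Serialize the remaining document; the <style> block is left untouched.
cleaned = str(soup)
print(cleaned)
```

Only the matched div and its descendants are removed, so stylesheets and the rest of the markup are serialized unchanged. The string in `cleaned` can then be written to a file as usual.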