Remove a portion of HTML (a tag) while keeping the style - Python

I would like to remove a portion of an HTML document that contains a specific string before saving it. The tag contains a person's name, and I would like to remove the entire tag so that the page is anonymous.

The HTML is:

<div id="top-card" data-li-template="top_card">...</div>

and all its children.

I explored using BeautifulSoup but could not find a solution.

Is there a way to remove that entire "portion" of the HTML while keeping the style intact?

Thanks!

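You can use .extract() to remove elements with BeautifulSoup. It detaches the tag and all of its children from the tree and returns the removed tag; the rest of the document, including any style elements, is left untouched.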

Assuming you want to remove the div whose id is "top-card":

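>>> from bs4 import BeautifulSoup
>>> html = """
... <div id="top-card" data-li-template="top_card"><div>test</div></div>
... <div>test</div> <div id="foo">blah</div>"""
>>> # Naming the parser avoids a warning; "lxml" (needs the lxml package) matches the output below.
>>> soup = BeautifulSoup(html, "lxml")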
>>> [div.extract() for div in soup("div",id="top-card")]
[<div data-li-template="top_card" id="top-card"><div>test</div></div>]
>>> soup
<html><body>
<div>test</div> <div id="foo">blah</div></body></html>
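
If the removed element is not needed afterwards, .decompose() is an alternative to .extract(). The following is only a minimal sketch under some assumptions: the helper name remove_element_by_id, the sample page, and the name "Jane Doe" are made up for illustration, and the built-in "html.parser" is used so that no extra parser package is required. It shows that a style block elsewhere in the document survives the removal.

```python
from bs4 import BeautifulSoup


def remove_element_by_id(html: str, element_id: str) -> str:
    """Return the HTML with the element that has the given id (and its children) removed.

    Hypothetical helper, written for illustration only.
    """
    soup = BeautifulSoup(html, "html.parser")
    target = soup.find(id=element_id)
    if target is not None:
        # decompose() removes the tag and everything nested inside it
        target.decompose()
    return str(soup)


# Made-up sample page: a style block plus the div that holds the person's name.
page = """<html><head><style>.name { color: red; }</style></head>
<body><div id="top-card" data-li-template="top_card"><span class="name">Jane Doe</span></div>
<div id="foo">blah</div></body></html>"""

cleaned = remove_element_by_id(page, "top-card")
print(cleaned)
```

The returned string can then be written to a file with an ordinary open(..., "w") call. Only the matched div and its descendants are dropped, so the stylesheet and the other elements stay as they were.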
