Remove part of an HTML document (a tag) while keeping the style - Python

Question

I would like to remove a portion of an HTML document that contains a specific string before saving it. The tag contains a person's name, and I would like to remove the entire tag so that the saved document is anonymous.

The HTML is:

<div id="top-card" data-li-template="top_card">...</div>

and all its children.

I explored using BeautifulSoup but could not find a solution.

Is there a way to remove that entire portion of the HTML while keeping the style intact?

Thanks!

Answer 1

You can use .extract() to remove elements from the parsed document with BeautifulSoup.

Assuming you want to remove the div whose id is "top-card":

>>> from bs4 import BeautifulSoup
>>> html = """
... <div id="top-card" data-li-template="top_card"><div>test</div></div>
... <div>test</div> <div id="foo">blah</div>"""
>>> soup = BeautifulSoup(html)
>>> [div.extract() for div in soup("div",id="top-card")]
[<div data-li-template="top_card" id="top-card"><div>test</div></div>]
>>> soup
<html><body>
<div>test</div> <div id="foo">blah</div></body></html>
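The same idea can be written as a standalone script. The following is only a minimal sketch, assuming the bs4 package is installed and using Python's built-in "html.parser"; the helper name remove_by_id, the sample page, and the placeholder name "Jane Doe" are made up for illustration. It uses .decompose(), which discards the element instead of returning it as .extract() does, and it leaves the <style> block untouched.

```python
from bs4 import BeautifulSoup


def remove_by_id(html: str, element_id: str) -> str:
    """Return the HTML with every element carrying element_id removed.

    remove_by_id is a hypothetical helper name, not part of BeautifulSoup.
    """
    soup = BeautifulSoup(html, "html.parser")
    for tag in soup.find_all(id=element_id):
        # decompose() removes the tag and all of its children from the tree.
        tag.decompose()
    return str(soup)


# Sample page invented for this sketch; "Jane Doe" stands in for the real name.
page = (
    "<html><head><style>div { color: red; }</style></head><body>"
    '<div id="top-card" data-li-template="top_card"><div>Jane Doe</div></div>'
    '<div id="foo">blah</div>'
    "</body></html>"
)

cleaned = remove_by_id(page, "top-card")
print(cleaned)
```

The cleaned string can then be written to a file in the usual way. Only the matched div and its children are dropped, so the stylesheet and the rest of the page should be preserved.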

Solution 1
1 Accepted 2015-07-02 12:03:37