How do I delete an HTML tag that contains only whitespace using BeautifulSoup and Python?

I have been trying to scrape some HTML and extract certain text from it.

The HTML has tags that are empty or that contain only whitespace.

How can I get rid of all those tags from my tree? I am using Beautiful Soup and Python.

You can use the decompose() method to do this. It removes a tag, together with everything inside it, from the tree.

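from bs4 import BeautifulSoup

markup = '<a href="http://example.com/">I linked to <i>example.com</i></a>'
# Name the parser explicitly to avoid the "no parser specified" warning
soup = BeautifulSoup(markup, "html.parser")
a_tag = soup.a

# Remove the <i> tag and its contents from the tree
soup.i.decompose()

print(a_tag)
# <a href="http://example.com/">I linked to </a>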

You will still need to loop over the tags yourself, find the ones whose content is empty or whitespace-only, and call decompose() on each of them to delete it from your tree.
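The loop described above could look roughly like the sketch below. It is one possible approach, not the only one. The sample markup, the remove_blank_tags helper, and the VOID_TAGS set are all made up for illustration, and the sketch assumes you want to keep void elements such as <br> and <img>, which never contain text but are usually meaningful.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, not taken from the question
html = (
    '<div><p>Hello</p><p>   </p><span></span>'
    '<div><b> </b></div><br/><img src="a.png"/></div>'
)

# Assumption: void elements should survive even though they hold no text
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


def remove_blank_tags(soup):
    """Decompose every tag that has no child tags and no non-whitespace text."""
    # Reverse document order visits children before their parents, so a
    # parent that only wrapped blank tags becomes blank and is removed too.
    for tag in reversed(soup.find_all(True)):
        if tag.name in VOID_TAGS:
            continue
        has_child_tags = tag.find(True) is not None
        has_text = bool(tag.get_text(strip=True))
        if not has_child_tags and not has_text:
            tag.decompose()
    return soup


soup = remove_blank_tags(BeautifulSoup(html, "html.parser"))
print(soup)
```

Walking the tags in reverse order is a deliberate choice here: a single pass is enough, because nested wrappers such as <div><b> </b></div> are emptied from the inside out before the outer tag is checked.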


