
How to replace a tag with its value within text

How do I extract

I love Python

from the given HTML

I <img src="image.png" alt="love"> Python

Getting the string and splitting it won't work: the text is controlled by the user and might contain <>.

There are a few different ways to achieve that. One way is to find all img elements and replace each of them with a text node containing the element's alt value:

In [1]: from bs4 import BeautifulSoup

In [2]: data = """<div class="commentthread_comment_text">I <img src="image.png" alt="love"> Python</div>"""

In [3]: soup = BeautifulSoup(data, "html.parser")

In [4]: div = soup.find('div', {'class': 'commentthread_comment_text'})

In [5]: for img in div('img'):
    ...:     # .get() avoids a KeyError when an <img> has no alt attribute
    ...:     img.replace_with(img.get('alt', ''))
    ...:     

In [6]: div.get_text()
Out[6]: 'I love Python'
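
As one of the other possible ways, the same substitution can be sketched with only the standard library's html.parser module. This is not the approach from the answer above, just a minimal alternative under a few assumptions: the names AltTextExtractor and text_with_alt are made up for illustration, an img without an alt attribute contributes an empty string, and literal angle brackets in user text arrive escaped as &lt; and &gt;.

```python
from html.parser import HTMLParser


class AltTextExtractor(HTMLParser):
    """Collect text content, substituting each <img> with its alt text.

    Hypothetical helper class, not part of any library.
    """

    def __init__(self):
        # convert_charrefs=True turns entities such as &lt; into "<"
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            # attrs is a list of (name, value) pairs; value may be None.
            # Assumption: a missing or empty alt contributes no text.
            self.parts.append(dict(attrs).get("alt") or "")

    def handle_data(self, data):
        # Plain text between tags is kept as-is
        self.parts.append(data)


def text_with_alt(html):
    """Return the text of `html` with every <img> replaced by its alt value."""
    parser = AltTextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)


print(text_with_alt('I <img src="image.png" alt="love"> Python'))
# I love Python
```

Unlike the BeautifulSoup version, this does not modify a parse tree and does not select a particular div; it simply walks the whole fragment once, which may be enough when only the flattened text is needed.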
