BeautifulSoup 4 - How to replace an anchor tag with its text

I have HTML content in my document. I need to replace all the anchor tags with their respective texts using BeautifulSoup.

My input is:

html = '''They are also much more fuel-efficient than <a href="http://someurl.com">rockets</a>.'''

Expected output:

"They are also much more fuel-efficient than rockets."

Here is my code:

from bs4 import BeautifulSoup

soup = BeautifulSoup(html, 'html.parser')
for a in soup.find_all('a'):
    ...
    replacement_string = a.string
    # I get all the anchor tags here. I need to perform the replace operation here
    ...
# Should display 'They are also much more fuel-efficient than rockets.'
print(replaced_html_string)

I was able to replace the contents of the anchor tag, but not the whole tag itself.

You don't really need to separate all the tags out to get the text. Just use .text (note that this drops every tag in the document, not only the anchors):

soup = BeautifulSoup(html, 'html.parser')
print(soup.text)

gives:

'They are also much more fuel-efficient than rockets.'

Or, doing it your way:

res = str(soup)
for i in soup.find_all('a'):
    res = res.replace(str(i),i.text)
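
If the rest of the markup has to survive, one alternative to string replacement is to edit the parse tree itself. The following is a minimal sketch, not part of the original answer: it swaps each anchor for its text with Tag.replace_with() and then serializes the soup. The helper name replace_anchors_with_text is made up for illustration.

```python
from bs4 import BeautifulSoup


def replace_anchors_with_text(html):
    """Hypothetical helper: return html with every <a> replaced by its text."""
    soup = BeautifulSoup(html, 'html.parser')
    for a in soup.find_all('a'):
        # Swap the whole <a> element for a plain string holding its text.
        a.replace_with(a.get_text())
    return str(soup)


html = '''They are also much more fuel-efficient than <a href="http://someurl.com">rockets</a>.'''
replaced_html_string = replace_anchors_with_text(html)
print(replaced_html_string)
```

If an anchor contains nested markup that should be kept (for example a <b> inside the link), calling a.unwrap() instead of a.replace_with(a.get_text()) removes only the <a> tag and leaves its children in place.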
