BeautifulSoup uncomment a comment by id
I want to use BeautifulSoup to change the HTML below by uncommenting the commented-out tag that has a given id.
<div class="foo">
cat dog sheep goat
<!--<p id="p1">test</p>-->
<p id="p2">
test
</p>
</div>
This is my expected result:
<div class="foo">
cat dog sheep goat
<p id="p1">test</p>
<p id="p2">
test
</p>
</div>
This is my Python code. I am using BeautifulSoup, but I don't know how to complete this function.
from bs4 import BeautifulSoup, Comment
data = """<div class="foo">
cat dog sheep goat
<!--<p id="p1">test</p>-->
<p id="p2">
test
</p>
</div>"""
soup = BeautifulSoup(data, 'html.parser')
for comment in soup(text=lambda text: isinstance(text, Comment)):
    if 'id="p1"' in comment.string:
        # I don't know how to complete it here.
        # This is my incorrect solution.
        # It outputs the escaped text "&lt;p id="p1"&gt;test&lt;/p&gt;",
        # not the tag "<p id="p1">test</p>".
        comment.replace_with(comment.string.replace("<!--", "").replace("-->", ""))
        break
Any help would be appreciated.
You can pass a new soup to .replace_with() instead of a string:
from bs4 import BeautifulSoup,Comment
data = """<div class="foo">
cat dog sheep goat
<!--<p id="p1">test</p>-->
<p id="p2">
test
</p>
</div>"""
soup = BeautifulSoup(data, 'html.parser')
print('Original soup:')
print('-' * 80)
print(soup)
print()
for comment in soup(text=lambda text: isinstance(text, Comment)):
    if 'id="p1"' in comment.string:
        # Parse the comment body as HTML so it is inserted as a real tag
        tag = BeautifulSoup(comment, 'html.parser')
        comment.replace_with(tag)
        break
print('New soup:')
print('-' * 80)
print(soup)
print()
Prints:
Original soup:
--------------------------------------------------------------------------------
<div class="foo">
cat dog sheep goat
<!--<p id="p1">test</p>-->
<p id="p2">
test
</p>
</div>
New soup:
--------------------------------------------------------------------------------
<div class="foo">
cat dog sheep goat
<p id="p1">test</p>
<p id="p2">
test
</p>
</div>
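The same idea can be wrapped in a small helper. The following is only a sketch: the function name `uncomment_by_id` is hypothetical, and it assumes you want to replace just the first matching comment. Instead of checking for the substring `id="p1"`, it parses each comment and looks up the id as a real attribute, so single quotes or a different attribute order should not matter.

```python
from bs4 import BeautifulSoup, Comment


def uncomment_by_id(soup, tag_id):
    """Replace the first comment containing an element with the given id.

    Hypothetical helper, not part of bs4. Returns True if a comment
    was replaced, False otherwise.
    """
    for comment in soup.find_all(string=lambda s: isinstance(s, Comment)):
        # Parse the comment body as HTML so the id is matched on a real
        # attribute rather than on raw text.
        fragment = BeautifulSoup(comment, "html.parser")
        if fragment.find(id=tag_id) is not None:
            comment.replace_with(fragment)
            return True
    return False


html = """<div class="foo">
cat dog sheep goat
<!--<p id='p1'>test</p>-->
<p id="p2">test</p>
</div>"""

soup = BeautifulSoup(html, "html.parser")
replaced = uncomment_by_id(soup, "p1")
print(replaced)
print(soup)
```

Returning a boolean lets the caller tell "nothing matched" apart from a successful replacement without comparing the output.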
Have you considered just using a regular expression instead of bs4?
Maybe this can get you started.
>>> import re
>>> re.search("<!--((.*)p1(.*))-->", '<!--<p id="p1">test</p>-->')
<re.Match object; span=(0, 26), match='<!--<p id="p1">test</p>-->'>
>>> re.search("<!--((.*)p1(.*))-->", '<!--<p id="p1">test</p>-->').group(1)
'<p id="p1">test</p>'
>>> regex = re.compile("<!--((.*)p1(.*))-->")
>>> regex.sub('<p id="p1">test</p>', '<!--<p id="p1">test</p>-->')
'<p id="p1">test</p>'
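The session above hard-codes the replacement text. A slightly more general sketch reuses the captured group, so whatever was inside the comment is kept as-is. The function name `uncomment_by_id_regex` is hypothetical, and the pattern assumes the id is written as `id="..."` or `id='...'` inside the comment. Regular expressions are fragile on arbitrary HTML, so treat this as a starting point rather than a robust solution.

```python
import re

html = """<div class="foo">
cat dog sheep goat
<!--<p id="p1">test</p>-->
<p id="p2">
test
</p>
</div>"""


def uncomment_by_id_regex(html, tag_id):
    """Strip the comment markers around the first comment mentioning the id.

    Hypothetical helper for illustration only.
    """
    # (?:(?!-->).)*? matches any run of characters that does not contain
    # "-->", so a match cannot spill across two separate comments.
    pattern = re.compile(
        r"<!--((?:(?!-->).)*?\bid=[\"']"
        + re.escape(tag_id)
        + r"[\"'](?:(?!-->).)*?)-->",
        re.DOTALL,
    )
    # A function replacement avoids backslash handling in the captured text.
    return pattern.sub(lambda m: m.group(1), html, count=1)


result = uncomment_by_id_regex(html, "p1")
print(result)
```

Requiring the closing quote right after the id means a search for `p1` does not match `p10`.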