Given the HTML below:
given = """<html>
<body>
Free Text: Above
<ul>
<li> data 1 </li>
<li>
<ul>
<li>
<ol start = "321">
<li> sub-sub list 1
<ol>
<li> sub sub sub list </li>
</ol>
</li>
<li> sub-sub list 2 </li>
</ol>
</li>
<li> sub list 2 </li>
<li> sub list 3 </li>
</ul>
</li>
<li> <p> list type paragraph </p> data 3 </li>
</ul>
Free Text: Middle
<ul>
<li> Second UL list </li>
<li> Second List part 2 </li>
</ul>
Free Text : Below
</body>
</html>"""
My question:
How can I change every child <li> tag that has ANY <li> ancestor into <SOME>?
(Please don't ask why I would want this, or point out that it won't render. I have my reasons.)
In a nutshell, I want the HTML above to end up looking like this:
result = """<html>
<body>
Free Text: Above
<ul>
<li> data 1 </li>
<li>
<ul>
<SOME>
<ol start = "321">
<SOME> sub-sub list 1
<ol>
<SOME> sub sub sub list </SOME>
</ol>
</SOME>
<SOME> sub-sub list 2 </SOME>
</ol>
</SOME>
<SOME> sub list 2 </SOME>
<SOME> sub list 3 </SOME>
</ul>
</li>
<li> <p> list type paragraph </p>data 3 </li>
</ul>
Free Text: Middle
<ul>
<li> Second UL list </li>
<li> Second List part 2 </li>
</ul>
Free Text: Below
</body>
</html>"""
I tried the following (with and without tag.decompose()):
soup = BeautifulSoup(given, 'html.parser')
for tag in soup.find_all(['li']):
    if tag.find_parents("li"):
        new_tag = soup.new_tag("SOME")
        new_tag.string = tag.text
        tag.replace_with(new_tag)
result = str(soup)
but it doesn't seem to work at depth > 1, i.e. for inner tags such as "sub-sub list" and deeper.
Instead of using .replace_with(), you can simply rename each tag through its .name attribute, which keeps the nested structure intact:
for tag in soup.select('li li'):
    tag.name = 'SOME'
from bs4 import BeautifulSoup
html = '''<html>
<body>
Free Text: Above
<ul>
<li> data 1 </li>
<li>
<ul>
<li>
<ol start = "321">
<li> sub-sub list 1
<ol>
<li> sub sub sub list </li>
</ol>
</li>
<li> sub-sub list 2 </li>
</ol>
</li>
<li> sub list 2 </li>
<li> sub list 3 </li>
</ul>
</li>
<li> <p> list type paragraph </p> data 3 </li>
</ul>
Free Text: Middle
<ul>
<li> Second UL list </li>
<li> Second List part 2 </li>
</ul>
Free Text : Below
</body>
</html>'''
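# Name the parser explicitly so the result does not depend on which parsers are installed
soup = BeautifulSoup(html, 'html.parser')
# 'li li' matches every <li> that sits anywhere below another <li>
for tag in soup.select('li li'):
    tag.name = 'SOME'
print(soup)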
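To check the renaming approach on something smaller, here is a minimal sketch that wraps it in a helper and runs it on a three-level list. The function name rename_nested_li and the sample markup are made up for illustration, and 'html.parser' is assumed as the parser. It also hints at why the .replace_with() attempt loses the inner levels: assigning tag.text to a new tag flattens all nested markup into a plain string, whereas renaming leaves children and attributes in place.

```python
from bs4 import BeautifulSoup


def rename_nested_li(html, new_name="SOME"):
    """Rename every <li> that has an <li> ancestor (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    # 'li li' is a descendant selector, so it matches an <li> at any depth
    # below another <li>, not only direct children.
    for tag in soup.select("li li"):
        # Renaming in place keeps the tag's children and attributes.
        tag.name = new_name
    return str(soup)


# Three levels of nesting: only the outermost <li> tags should survive.
sample = "<ul><li>a<ul><li>b<ol><li>c</li></ol></li></ul></li><li>d</li></ul>"
converted = rename_nested_li(sample)
print(converted)
```

Top-level items ("a" and "d" here) stay as <li>, while both the second-level and third-level items become <SOME>.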