I am using BeautifulSoup to modify an HTML file. I want to add a line break after every bullet point in the div below. I have already tried text.replace with '\n', but that only works when printing to the terminal, since HTML only renders a new line for a <br> tag. Is there a way to insert a line break at the end of every bullet point?
HTML code:
<div class="recipe"> ■ Boil water to high heat ■ Put eggs in water ■ Put on lid ■ Wait 8 - 12 minutes ■ Take out eggs ■ Serve</div>
When I view it on a webpage it currently looks like this:
■ Boil water to high heat ■ Put eggs in water ■ Put on lid ■ Wait 8 - 12 minutes ■ Take out eggs ■ Serve
I would like it to look like this:
■ Boil water to high heat
■ Put eggs in water
■ Put on lid
■ Wait 8 - 12 minutes
■ Take out eggs
■ Serve
Code I used to add a new line (it only works with print). Without print, it just replaces every '■' with '\n■' and no line break appears in the HTML file.
for div in soup.find_all("div", {'class':'recipe'}):
    print(div.text.replace('■','\n■'))
Try:
from bs4 import BeautifulSoup
html_doc = """
<div class="recipe">
■ Boil water to high heat ■ Put eggs in water ■ Put on lid ■ Wait 8 - 12 minutes ■ Take out eggs ■ Serve
</div>"""
soup = BeautifulSoup(html_doc, "html.parser")
recipe = soup.find(class_="recipe")
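# Split on the bullet, drop empty pieces, and rejoin with <br /> between items.
items = [s.strip() for s in recipe.get_text(strip=True).split("■") if s.strip()]
t = BeautifulSoup("<br />".join("■ " + s for s in items), "html.parser")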
recipe.string.replace_with(t)
print(soup.prettify())
This will create a <br /> between the ■ items, so each item renders on its own line in the browser.
HTML:
<div class="recipe">
■ Boil water to high heat
<br/>
■ Put eggs in water
<br/>
■ Put on lid
<br/>
■ Wait 8 - 12 minutes
<br/>
■ Take out eggs
<br/>
■ Serve
</div>
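If the page has several recipe divs (the question loops over them with find_all), the same idea can be sketched by building the nodes with soup.new_tag("br") instead of re-parsing a joined string. This is only a minimal sketch: the break_bullets helper and the BULLET constant are hypothetical names, and it assumes each div holds plain text, since get_text() would flatten any nested tags.

```python
from bs4 import BeautifulSoup

BULLET = "■"  # assumed separator, taken from the question's HTML


def break_bullets(html: str) -> str:
    """Return html with a <br> between the bullet items of every div.recipe."""
    soup = BeautifulSoup(html, "html.parser")
    for div in soup.find_all("div", class_="recipe"):
        # Split the div's text on the bullet and drop empty fragments.
        items = [part.strip() for part in div.get_text().split(BULLET)]
        items = [item for item in items if item]
        div.clear()  # remove the original text node
        for index, item in enumerate(items):
            if index:  # a <br> goes before every item except the first
                div.append(soup.new_tag("br"))
            div.append(f"{BULLET} {item}")
    return str(soup)


html_doc = '<div class="recipe"> ■ Boil water ■ Put eggs in water ■ Serve</div>'
result = break_bullets(html_doc)
print(result)
```

Markup outside the recipe divs is left as it is, so the returned string can be written back to the file unchanged apart from the added breaks.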
EDIT: To save the soup to an HTML file:
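# Pass the encoding explicitly: "■" is not ASCII and the platform default may not handle it.
with open("page.html", "w", encoding="utf-8") as f_out:
    f_out.write(str(soup))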