Beautifulsoup - add <br> tag in div text?

I'm trying to use BeautifulSoup to make changes to an HTML file. I want to add a line break after every bullet point in the div below. I have already tried text.replace (using '\n'), but that only works in the terminal, because HTML only starts a new line at a <br> tag. Is there a way to insert a line break at the end of every bullet point?

HTML code:

<div class="recipe"> ■ Boil water to high heat ■ Put eggs in water ■ Put on lid ■ Wait 8 - 12 minutes ■ Take out eggs ■ Serve</div>

When I view it on a webpage it currently looks like this:
■ Boil water to high heat ■ Put eggs in water ■ Put on lid ■ Wait 8 - 12 minutes ■ Take out eggs ■ Serve

I would like it to look like this:
■ Boil water to high heat
■ Put eggs in water
■ Put on lid
■ Wait 8 - 12 minutes
■ Take out eggs
■ Serve

Code I used to add a new line (only works with the print function). Without print, it just replaces every '■' with '\n■', and no line break appears when the HTML file is rendered.

for div in soup.find_all("div", {'class':'recipe'}): 
    print(div.text.replace('■','\n■'))

Try:

from bs4 import BeautifulSoup

html_doc = """
<div class="recipe">
    ■ Boil water to high heat ■ Put eggs in water ■ Put on lid ■ Wait 8 - 12 minutes ■ Take out eggs ■ Serve
</div>"""

soup = BeautifulSoup(html_doc, "html.parser")
recipe = soup.find(class_="recipe")

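# Split on the bullet and drop empty pieces. This avoids str.strip("<br />"),
# which strips a set of characters and could eat a trailing "b" or "r".
parts = [p for p in recipe.get_text(strip=True).split("■") if p.strip()]

# Parse the joined text so each <br /> becomes a real tag, not escaped text.
t = BeautifulSoup("<br />".join("■ " + p for p in parts), "html.parser")

# recipe.string is set only because the div holds a single text node.
recipe.string.replace_with(t)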

print(soup.prettify())
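As an alternative sketch (not part of the answer above), the `<br/>` elements can be created with `soup.new_tag` instead of re-parsing a joined string, and the same treatment can be applied to every matching div, as the `find_all` loop in the question does. The helper name `add_line_breaks` and its `css_class` and `bullet` parameters are made up for illustration.

```python
from bs4 import BeautifulSoup


def add_line_breaks(soup, css_class="recipe", bullet="■"):
    """Put a <br/> between bullet items in every div with the given class."""
    for div in soup.find_all("div", class_=css_class):
        # Split the div's text on the bullet and drop empty pieces.
        items = [part.strip() for part in div.get_text().split(bullet)]
        items = [item for item in items if item]

        div.clear()  # remove the existing children of the div
        for index, item in enumerate(items):
            if index > 0:
                div.append(soup.new_tag("br"))  # a real <br/> element
            div.append(f"{bullet} {item}")  # plain text for the item
    return soup


html_doc = '<div class="recipe"> ■ Boil water ■ Put eggs in water ■ Serve</div>'
soup = add_line_breaks(BeautifulSoup(html_doc, "html.parser"))
print(soup)
```

Because this works from `get_text()`, any tags nested inside the div (links, bold text) would be flattened to plain text, so it suits divs that contain only text, like the one in the question.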

This will create a <br /> between the items, so a browser renders each bullet point on its own line.

HTML:

<div class="recipe">
 ■  Boil water to high heat
 <br/>
 ■  Put eggs in water
 <br/>
 ■  Put on lid
 <br/>
 ■  Wait 8 - 12 minutes
 <br/>
 ■  Take out eggs
 <br/>
 ■  Serve
</div>
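The text is parsed into a new BeautifulSoup object before it is inserted because a plain string that contains markup is escaped on output. A small sketch of the difference, using a shortened two-item div as assumed sample input:

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup('<div class="recipe">■ Put on lid ■ Serve</div>', "html.parser")
div = soup.find(class_="recipe")

# Assigning a plain string: the angle brackets are escaped on output,
# so the browser shows the literal text "<br/>" and no line break.
div.string = div.get_text().replace(" ■", "<br/>■")
escaped = str(soup)
print(escaped)

# Parsing the same text first produces a real <br/> element.
fragment = BeautifulSoup(div.get_text(), "html.parser")
div.string.replace_with(fragment)
parsed = str(soup)
print(parsed)
```

The first print shows the entity-escaped form (`&lt;br/&gt;`), the second shows an actual tag.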

EDIT: To save the soup to an HTML file:

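# Write as UTF-8 explicitly: "■" cannot be encoded by some platform default encodings.
with open("page.html", "w", encoding="utf-8") as f_out:
    f_out.write(str(soup))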
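For an existing file, the whole round trip (read, modify, write back) could look roughly like the sketch below. The temporary directory and the sample file contents are assumptions added so the example runs on its own; a real page would already exist on disk.

```python
import tempfile
from pathlib import Path

from bs4 import BeautifulSoup

# Demo input in a temporary directory (assumption for a self-contained run).
path = Path(tempfile.mkdtemp()) / "page.html"
path.write_text('<div class="recipe"> ■ Put on lid ■ Serve</div>', encoding="utf-8")

soup = BeautifulSoup(path.read_text(encoding="utf-8"), "html.parser")
for div in soup.find_all("div", class_="recipe"):
    if div.string is None:  # skip divs that are empty or contain nested tags
        continue
    items = [p.strip() for p in div.string.split("■") if p.strip()]
    fragment = BeautifulSoup("<br/>".join(f"■ {i}" for i in items), "html.parser")
    div.string.replace_with(fragment)

# Write the modified document back with an explicit encoding.
path.write_text(str(soup), encoding="utf-8")
saved = path.read_text(encoding="utf-8")
print(saved)
```

Reading and writing with the same explicit encoding keeps the "■" character intact regardless of the platform default.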
