Replace "\n" in an element's string with a <br> tag in BeautifulSoup

I am creating a new tag and assigning it a string that contains a newline:

from bs4 import BeautifulSoup

soup = BeautifulSoup("", "html.parser")

myTag = soup.new_tag("div")
myTag.string = "My text \n with a new line"

soup.insert(0, myTag)

and the result is

<div>My text 
 with a new line</div>

as expected. However, the newline has to become a <br> tag in order to be rendered as a line break in the browser.

How can I achieve this?

I think it might be better to set the CSS white-space property to pre-wrap on that div:

pre-wrap -- Whitespace is preserved by the browser. Text will wrap when necessary, and on line breaks.

An example:

<div style="white-space:pre-wrap"> Some \n text here </div>

And the code to do that in BeautifulSoup:

myTag = soup.new_tag("div", style="white-space:pre-wrap")
myTag.string = "My text \n with a new line"
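For completeness, here is a minimal, self-contained sketch of the pre-wrap approach that also serializes the result. It assumes the built-in "html.parser" backend; the variable names `my_tag` and `html` are only illustrative.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup("", "html.parser")

# Extra keyword arguments to new_tag() become attributes on the tag.
my_tag = soup.new_tag("div", style="white-space:pre-wrap")
my_tag.string = "My text \n with a new line"
soup.append(my_tag)

# The newline stays in the text node; the CSS rule makes the browser honor it.
html = str(soup)
print(html)
```

No <br> tag is produced here. The line break is left in the text and the rendering is delegated to CSS.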

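It seems that replacing the \n with a literal "<br>" inside the string is not trivial, since BeautifulSoup escapes HTML special characters in strings by default, so the "<br>" would come out as text rather than as a tag. An alternative is to split the input string and build up the tag structure with text and <br> tags on your own: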

from bs4 import BeautifulSoup

def replace_newline_with_br(s, soup):
    # Split on newlines and rebuild the text with a <br> tag between the pieces.
    lines = s.split('\n')
    div = soup.new_tag('div')
    div.append(lines[0])
    for line in lines[1:]:
        div.append(soup.new_tag('br'))
        div.append(line)
    soup.append(div)

mytext = "My text with a few \n newlines \n"
mytext2 = "Some other text \n with a few more \n newlines \n here"

# Name the parser explicitly to avoid the "no parser specified" warning.
soup = BeautifulSoup("", "html.parser")
replace_newline_with_br(mytext, soup)
replace_newline_with_br(mytext2, soup)
print(soup.prettify())

Prints:

<div>
 My text with a few
 <br/>
 newlines
 <br/>
</div>
<div>
 Some other text
 <br/>
 with a few more
 <br/>
 newlines
 <br/>
 here
</div>
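To make the escaping problem concrete, the following sketch first tries a plain string replacement and then uses the split-and-append idea. The helper name `newlines_to_br` and its `tag_name` parameter are made up for illustration; it is a variant of the function above that returns the new tag instead of appending it to the soup, and it skips empty pieces.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup("", "html.parser")

# Naive attempt: the "<br>" is stored as text, so it is escaped on output.
naive = soup.new_tag("div")
naive.string = "My text \n with a new line".replace("\n", "<br>")
naive_html = str(naive)
print(naive_html)  # the angle brackets are serialized as &lt; and &gt;

def newlines_to_br(text, soup, tag_name="div"):
    """Hypothetical helper: build a tag in which each newline becomes a <br> tag."""
    tag = soup.new_tag(tag_name)
    for i, line in enumerate(text.split("\n")):
        if i > 0:
            tag.append(soup.new_tag("br"))  # a real Tag object, not text
        if line:
            tag.append(line)  # plain strings are appended as text nodes
    return tag

fixed = newlines_to_br("My text \n with a new line", soup)
fixed_html = str(fixed)
print(fixed_html)
```

Returning the tag leaves it to the caller to decide where it goes in the tree, for example with `soup.append(fixed)`.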

