Find line breaks in HTML using Python

I'm trying to find all line breaks (br), as well as newlines inside paragraphs (p), in the HTML of a website. I have this code:

breaks = re.findall('br<>\n', html)
print len(breaks)

but it's not working. Any help?

I'm not entirely sure what you want, since you don't show example inputs and outputs.

But if you are looking to split after a <br> tag OR a newline, you could try this:

import re

# Match several variations of the '<br>' tag, or a newline
# (assumes `html` already holds the page source as a string)
breaks = re.findall('<br>|<br/>|<br />|\n', html)
print(len(breaks))

Does that help?
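
If you only want newlines that fall inside a <p>, a regex gets awkward, and a parser may be easier. Here is a rough sketch using the standard library's html.parser, assuming Python 3. The class name BreakCounter, its attributes, and the sample string are made up for illustration. It also assumes every <p> has an explicit closing </p>, which real pages don't always guarantee.

```python
from html.parser import HTMLParser


class BreakCounter(HTMLParser):
    """Counts <br> tags anywhere, and newlines that occur inside <p> elements."""

    def __init__(self):
        super().__init__()
        self.p_depth = 0        # greater than 0 while inside a <p>
        self.br_count = 0       # number of <br> tags seen (any spelling or case)
        self.newline_count = 0  # number of '\n' characters seen inside <p>

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag names, so <BR> and <br /> both arrive as 'br'
        if tag == 'br':
            self.br_count += 1
        elif tag == 'p':
            self.p_depth += 1

    def handle_endtag(self, tag):
        if tag == 'p' and self.p_depth > 0:
            self.p_depth -= 1

    def handle_data(self, data):
        # Only count newlines in text that sits inside a paragraph
        if self.p_depth > 0:
            self.newline_count += data.count('\n')


# Hypothetical sample input, not taken from the question
sample = '<p>first line<br>second line<BR />\nthird line</p>\n<div>outside</div>'

counter = BreakCounter()
counter.feed(sample)
counter.close()
print(counter.br_count, counter.newline_count)  # prints: 2 1
```

The newline between </p> and <div> is not counted because it is outside the paragraph. Add the two counters together if you want a single total.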
