I have an HTML snippet that looks like this:
<pre>zdfsfsf<br/>adfadfadf
adfadfasdfadfad adfadf adf
Mill Valley, CA 94941
122-2323-24124
Email: adfadfadf<br/><i>sfsfsfsf</i></pre>
<br/>
I want to strip all tags and just have the text.
Content should look like this:
zdfsfsf adfadfadf
adfadfasdfadfad adfadf adf
Mill Valley, CA 94941
122-2323-24124
Email: adfadfadf sfsfsfsf
I'm looking for something like this:
cells = row.find_all('td')
for c in cells:
    c.STRIP_HTML_TAGS()  # ?????? <-- WHAT IS THIS FUNCTION?
You're looking for get_text():
>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup("""<pre>zdfsfsf<br/>adfadfadf
... adfadfasdfadfad adfadf adf
... Mill Valley, CA 94941
... 122-2323-24124
... Email: adfadfadf<br/><i>sfsfsfsf</i></pre>
... <br/>""", "html.parser")
>>> print(soup.get_text())
zdfsfsfadfadfadf
adfadfasdfadfad adfadf adf
Mill Valley, CA 94941
122-2323-24124
Email: adfadfadfsfsfsfsf
>>>
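Note that plain get_text() joins the text on either side of a tag with nothing in between, which is why the first line prints as "zdfsfsfadfadfadf" rather than "zdfsfsf adfadfadf". A minimal sketch of one way to get the spacing asked for in the question, assuming the standard-library "html.parser" backend and using the separator and strip arguments of get_text(); the small table at the end is a made-up example, not from the question:

```python
from bs4 import BeautifulSoup

html = """<pre>zdfsfsf<br/>adfadfadf
adfadfasdfadfad adfadf adf
Mill Valley, CA 94941
122-2323-24124
Email: adfadfadf<br/><i>sfsfsfsf</i></pre>
<br/>"""

# Naming the parser explicitly avoids the "no parser specified" warning.
soup = BeautifulSoup(html, "html.parser")

# separator=" " inserts a space between adjacent text fragments, so the
# text before and after a tag such as <br/> is not run together.
# strip=True trims each fragment and drops whitespace-only fragments.
text = soup.get_text(separator=" ", strip=True)
print(text)

# The same call works per cell. This table is a hypothetical stand-in
# for the `row` variable in the question.
table = BeautifulSoup(
    "<table><tr><td>a<br/>b</td><td><i>c</i></td></tr></table>",
    "html.parser",
)
row = table.find("tr")
cell_texts = [c.get_text(separator=" ", strip=True) for c in row.find_all("td")]
print(cell_texts)
```

Newlines inside a text fragment are left alone, so the line breaks within the <pre> block survive; only the boundaries between fragments get the separator.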