I would like to get all of the displayed text on an HTML page up until a certain tag is hit. For example, I would like to get all of the displayed text on a page up until a tag with the id "end_content" is hit.
Is there a way to do this with BeautifulSoup? This would be similar to the soup.get_text() method, except it would just stop fetching text after it hits a tag with the id "end_content".
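One way is to select the end tag and collect every text node that comes before it with find_all_previous(string=True). That method walks backwards through the document, so the result is reversed to restore the original order: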
from bs4 import BeautifulSoup

html = (
'<h1>HEY!</h1>'
'<div>'
'How are'
'<h2>you?</h2>'
'<div id="end_content">END</div>'
'</div>'
'Some other text'
)
soup = BeautifulSoup(html, 'lxml')
>>> soup.select_one('#end_content').find_all_previous(string=True)[::-1]
['HEY!', 'How are', 'you?']
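If a single string similar to soup.get_text() is wanted rather than a list, the same idea can be wrapped in a small helper. This is only a sketch: the name text_before is made up for illustration, it uses the built-in 'html.parser' instead of 'lxml' so that no extra parser is required, and it assumes that falling back to the full page text is acceptable when the id is not found. HTML comments are skipped, because string=True would otherwise return them as well.

```python
from bs4 import BeautifulSoup, Comment


def text_before(html, end_id, parser="html.parser"):
    """Return the text that appears before the tag with id=end_id.

    Hypothetical helper, not part of BeautifulSoup itself.
    """
    soup = BeautifulSoup(html, parser)
    end_tag = soup.find(id=end_id)
    if end_tag is None:
        # Assumption: with no end marker, return all of the text.
        return soup.get_text(" ", strip=True)

    # find_all_previous yields nodes in reverse document order.
    strings = end_tag.find_all_previous(string=True)[::-1]
    parts = [
        s.strip()
        for s in strings
        if not isinstance(s, Comment) and s.strip()
    ]
    return " ".join(parts)


sample = (
    '<h1>HEY!</h1>'
    '<div>'
    'How are'
    '<h2>you?</h2>'
    '<div id="end_content">END</div>'
    '</div>'
    'Some other text'
)
print(text_before(sample, "end_content"))  # HEY! How are you?
```

Note that text inside script or style tags that precede the end tag may still be included, depending on the BeautifulSoup version, so it may need extra filtering on real pages.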