Python get tag with certain text

I have a string containing HTML blocks, like:

a = '<div>Test moree test <div> London is ... <p>mooo</p></div></div>'

I need to get the block that contains certain text, for example:

super_func("London", a) ==> '<div> London is ... <p>mooo</p></div>'
super_func('mooo', a) ==> '<p>mooo</p>'

You can use the following XPath query to find an element that contains certain text, regardless of the element's name and its location within the HTML document:

//*[contains(text(),'certain text')]
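One caveat with this query: when an element has several text nodes, contains(text(), ...) tests only the first of them, because XPath 1.0 converts a node-set to the string value of its first node. A minimal sketch with lxml, using a made-up HTML snippet (not from the question), compares it with a predicate that checks every direct text node:

```python
from lxml import html

# Hypothetical sample: "London" sits in the second text node of the <div>,
# after the <b> child element.
doc = html.fromstring('<div>Hello <b>big</b> London</div>')

# contains(text(), ...) only looks at the first text node ("Hello ").
first_only = doc.xpath('//*[contains(text(), "London")]')

# text()[contains(., ...)] tests each direct text node of the element.
any_text = doc.xpath('//*[text()[contains(., "London")]]')

print(len(first_only))              # 0
print([el.tag for el in any_text])  # ['div']
```

If the keyword can appear after a child element, the second form is likely the safer choice.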

Here is a working example of super_func using the lxml.html library:

from lxml import html

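def super_func(keyword, htmldoc):
    # Match any element whose first text node contains the keyword
    query = '//*[contains(text(),"{0}")]'
    result = htmldoc.xpath(query.format(keyword))
    if len(result) > 0:
        # encoding='unicode' returns str rather than bytes on Python 3;
        # with_tail=False omits any text that follows the closing tag
        return html.tostring(result[0], encoding='unicode', with_tail=False)
    else:
        return ''

a = '<div>Test moree test <div> London is ... <p>mooo</p></div></div>'
doc = html.fromstring(a)
text = 'London'
print(super_func(text, doc))
text = 'mooo'
print(super_func(text, doc))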

Output:

<div> London is ... <p>mooo</p></div>
<p>mooo</p>
