
lxml: Converting XML to HTML through XSLT and getting HtmlElements

I have data that arrives as an XML file, along with an XSLT stylesheet that transforms the XML to HTML. I can use lxml to perform the conversion, but I want to alter some of the HTML tags after the transformation. How do I convert the resulting etree into HtmlElements so that I can use HTML-specific methods such as .cssselect()?

>>> import lxml.etree
>>> import lxml.html
>>>
>>> # Bytes input: on Python 3, lxml rejects a str that carries an encoding declaration.
>>> xmlstring = b'''\
... <?xml version='1.0' encoding='ASCII'?>
... <root><a class="here">link1</a><a class="there">link2</a></root>
... '''
>>> root = lxml.etree.fromstring(xmlstring)
>>> root.cssselect('a.here')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'lxml.etree._Element' object has no attribute 'cssselect'

Serialize the tree with lxml.etree.tostring(root), then re-parse the result with lxml.html.fromstring(...). The re-parsed tree is made of HtmlElements:

>>> root = lxml.html.fromstring(lxml.etree.tostring(root))
>>> root.cssselect('a.here')
[<Element a at 0x2989308>]
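The same round trip can be applied to the output of an XSLT transform, which is the case the question asks about. The following is a minimal sketch, not a definitive implementation: the XML input, the stylesheet, and the helper name `transform_to_html` are hypothetical stand-ins for the real files. It uses `find_class()`, which needs no extra package; `.cssselect('a.here')` works the same way when the `cssselect` package is installed.

```python
import lxml.etree
import lxml.html

# Hypothetical input and stylesheet, standing in for the real files.
XML = b"<root><item href='/a'>link1</item><item href='/b'>link2</item></root>"
XSLT = b"""\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  <xsl:template match="/root">
    <html><body>
      <xsl:for-each select="item">
        <a class="here" href="{@href}"><xsl:value-of select="."/></a>
      </xsl:for-each>
    </body></html>
  </xsl:template>
</xsl:stylesheet>
"""


def transform_to_html(xml_bytes, xslt_bytes):
    """Apply the stylesheet, then re-parse the result with lxml.html."""
    transform = lxml.etree.XSLT(lxml.etree.fromstring(xslt_bytes))
    result = transform(lxml.etree.fromstring(xml_bytes))
    # Serialize the result tree and parse it again so the nodes are HtmlElements.
    return lxml.html.fromstring(lxml.etree.tostring(result))


doc = transform_to_html(XML, XSLT)

# find_class() is an HtmlElement method; plain etree elements do not have it.
links = doc.find_class("here")
for link in links:
    link.set("target", "_blank")  # alter the tags after the transformation

print(lxml.html.tostring(doc).decode())
```

The re-parse costs one extra serialization pass, which is usually negligible for page-sized documents.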

To get XML output back from the HtmlElement tree:

>>> print(lxml.etree.tostring(root, xml_declaration=True).decode())
<?xml version='1.0' encoding='ASCII'?>
<root><a class="here">link1</a><a class="there">link2</a></root>
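If HTML output is wanted instead, `lxml.html.tostring()` serializes the same tree with HTML rules. The sketch below contrasts the two serializers; the sample markup, including the added `<br>`, is an assumption chosen to make the difference visible.

```python
import lxml.etree
import lxml.html

# Hypothetical sample markup with a void element (<br>) to show the difference.
root = lxml.html.fromstring(b'<root><a class="here">link1</a><br></root>')

# XML serialization: empty elements are written as self-closing tags.
as_xml = lxml.etree.tostring(root, xml_declaration=True).decode()

# HTML serialization: void elements are written without a closing slash.
as_html = lxml.html.tostring(root).decode()

print(as_xml)
print(as_html)
```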

