Getting the node attribute of an XML file with lxml parsing

Question

I can't get my head around why this is not working properly:

from lxml import etree

data = '''<?xml version="1.0" encoding="UTF-8"?>\n<div type="docs" xml:base="/kime-api/prod/api/emi/2" xml:lang="ja" xml:id="39532e30"> <div n="0001" type="doc" xml:id="_5738d00002"></div></div>'''

parser = etree.XMLParser(resolve_entities=False, strip_cdata=False, recover=True, ns_clean=True)

# I tried with and without the following line
#data = data.replace('<?xml version="1.0" encoding="UTF-8"?>','')

# fromstring() returns the root element, i.e. the outer <div>
XML_tree = etree.fromstring(data.encode(), parser=parser)
lang = XML_tree.xpath('.//div[@xml:lang]')
print(lang)

lang is an empty list, even though exactly one element in the XML has xml:lang="ja".

What am I doing wrong please?

Answer 1

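XML_tree represents the root element (the outer <div>, which carries the xml:lang attribute). The expression .//div[@xml:lang] searches only the descendants of that element, and the inner <div> has no xml:lang attribute, so nothing matches.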

If you want to get the language, use the following:

lang = XML_tree.xpath('@xml:lang')
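
The difference between searching the descendants and testing the context node itself can be checked directly. The following is a minimal sketch, assuming a trimmed-down version of the document from the question (the xml:base and xml:id attributes are dropped for brevity) and the default lxml parser; the variable names are illustrative only.

```python
from lxml import etree

# Trimmed-down sample (assumption: same shape as the XML in the question).
data = b'<div type="docs" xml:lang="ja"><div n="0001" type="doc"></div></div>'

# fromstring() returns the root element, here the outer <div>.
root = etree.fromstring(data)

# './/div' looks only below the context node (the root). The inner <div>
# has no xml:lang attribute, so nothing matches.
descendants_only = root.xpath('.//div[@xml:lang]')

# 'descendant-or-self::div' also tests the context node itself.
including_root = root.xpath('descendant-or-self::div[@xml:lang]')

# '@xml:lang' reads the attribute value straight from the context node.
lang_values = root.xpath('@xml:lang')

print(descendants_only)                           # []
print([el.get('type') for el in including_root])  # ['docs']
print(lang_values)                                # ['ja']
```

Use descendant-or-self when the attribute might sit on either the root or a nested element; use '@xml:lang' when the element is already in hand.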

Answer 2

You can simply call xpath('@xml:lang') on the root element:

XML_tree = etree.fromstring(data.encode() , parser=parser)
lang = XML_tree.xpath('@xml:lang')
print(lang)

Output:

['ja']
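
If only the attribute value is needed, XPath can be skipped altogether. The sketch below assumes the same trimmed-down sample document and reads the attribute with Element.get(), using the namespace URI that the reserved xml prefix is bound to; XML_NS, lang and missing are illustrative names.

```python
from lxml import etree

# The reserved "xml" prefix is always bound to this namespace URI.
XML_NS = 'http://www.w3.org/XML/1998/namespace'

# Trimmed-down sample (assumption: same shape as the XML in the question).
root = etree.fromstring(
    b'<div type="docs" xml:lang="ja"><div n="0001" type="doc"></div></div>'
)

# lxml addresses namespaced attributes as '{namespace-uri}localname'.
lang = root.get('{%s}lang' % XML_NS)

# The inner <div> has no xml:lang, so the supplied default is returned.
missing = root[0].get('{%s}lang' % XML_NS, 'unknown')

print(lang)     # ja
print(missing)  # unknown
```

Unlike xpath('@xml:lang'), which returns a list, get() returns a single string (or the default), which saves an index lookup when exactly one value is expected.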
