Parsing XML with ETREE: finding 'xl' element properties
I have the (abbreviated) XML file below (I also changed the element names a bit to obscure the application).
<?xml version="1.0" encoding="UTF-8" ?>
<Workplace Type="PP-1"
Version="0.2"
xmlns:xl="http://www.w3.org/1999/xlink">
<Template xl:actuate="perFile"
xl:href="../templates/opt/CMPRfile"
xl:show="none"
xl:title="CPP1"
xl:type="verycomplicated"/>
<ProjectID xl:actuate="withtrain"
xl:href="filename.ppp"
xl:show="none"
xl:type="evenmorecomplicated"/>
</Workplace>
I want to parse the XML file with ETREE and find the values for the 'xl:' elements. How exactly do I do that? They do not seem to be attributes or text. Is this some kind of special property? I tried to find the value for 'href', for example, using code like that below.
I tried to look up what the 'xl' labels are, but had no luck. What is also curious is that when I print the attributes for the 'Workplace' node, I get 'Type' and 'Version', but not 'xmlns'. So I suspect that this is some kind of special attribute? This is my first time doing serious XML parsing, so I am probably missing something here.
I tried this:
xml_namespace = "{http://www.w3.org/1999/xlink}"
tree = ET.parse(project_file_name)
xml_root_element = tree.getroot()
projectid_element = xml_root_element.find(xml_namespace +
"ProjectId")
# Doesn't work
value = projectid_element.text
value = projectid_element.attrib["href"]
value = projectid_element.attrib["xl:href"]
print("Value: " + value)
And I was expecting the value 'filename.ppp'.
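For reference, here is a minimal sketch of one way to read that value with the standard library, assuming the `xl:` items are ordinary attributes whose names live in the XLink namespace. ElementTree stores such attribute keys as `{namespace-uri}localname`, and it consumes the `xmlns:xl` declaration itself, which would explain why it does not show up among the attributes of `Workplace`. The inline `XML_TEXT` and the `XLINK` constant are stand-ins for illustration, not part of the original code.

```python
import xml.etree.ElementTree as ET

# Abbreviated, well-formed copy of the XML from the question (illustrative only).
XML_TEXT = """<Workplace Type="PP-1" Version="0.2"
           xmlns:xl="http://www.w3.org/1999/xlink">
  <ProjectID xl:actuate="withtrain"
             xl:href="filename.ppp"
             xl:show="none"
             xl:type="evenmorecomplicated"/>
</Workplace>"""

# Namespace URI wrapped in braces, as ElementTree uses it in qualified names.
XLINK = "{http://www.w3.org/1999/xlink}"

root = ET.fromstring(XML_TEXT)

# The element name carries no prefix, so it is looked up without a namespace.
project_id = root.find("ProjectID")

# The attribute does carry the xl: prefix, so its key is "{uri}href".
href = project_id.get(XLINK + "href")
print("Value: " + href)

# The xmlns:xl declaration is not reported as an attribute of the root.
root_attribute_names = sorted(root.attrib)
print(root_attribute_names)
```

Note that the lookup uses `ProjectID` exactly as spelled in the XML, since element names are case-sensitive.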
Edit 20221221_1557:
I did look at the article mentioned by FordPerfect, but I still do not seem to be able to extract the values. I have this code:
tree, ns = parse_and_get_ns(project_file_name)
xml_root_element = tree.getroot()
print("xml_root_element type: " + str(type(xml_root_element)))
print("Namespaces found: ")
print(ns)
elements = xml_root_element.iterfind("xl:href", ns)
print("elements type: " + str(type(elements)))
for ele in elements:
    print("elements ele object type: " + str(type(ele)))
and I get this as output:
xml_root_element type: <class 'xml.etree.ElementTree.Element'>
Namespaces found:
{'xl': '{http://www.w3.org/1999/xlink}'}
So you can see that the root element I am iterating over is indeed an element, and that the final outcome does not contain any objects. However, I expected at least one to be there.
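A likely reason the loop body never runs: `iterfind` matches child elements, and `xl:href` is an attribute name rather than an element name. Separately, the `namespaces` mapping is expected to map a prefix to the bare URI, without the surrounding braces shown in the output above. The sketch below sidesteps both points by walking every element and asking each one for the qualified attribute; `XML_TEXT` and `HREF_KEY` are hypothetical names used only for this illustration.

```python
import xml.etree.ElementTree as ET

# Abbreviated, well-formed copy of the XML from the question (illustrative only).
XML_TEXT = """<Workplace Type="PP-1" Version="0.2"
           xmlns:xl="http://www.w3.org/1999/xlink">
  <Template xl:href="../templates/opt/CMPRfile" xl:title="CPP1"/>
  <ProjectID xl:href="filename.ppp"/>
</Workplace>"""

# Fully qualified attribute name in ElementTree's "{uri}local" form.
HREF_KEY = "{http://www.w3.org/1999/xlink}href"

root = ET.fromstring(XML_TEXT)

# Collect the href of every element that has one, keyed by element tag.
hrefs = {}
for elem in root.iter():  # iter() visits the root and all descendants
    value = elem.get(HREF_KEY)
    if value is not None:
        hrefs[elem.tag] = value

print(hrefs)
```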
After some fiddling I found out that these elements are keys. You can get them by using the method keys() on an element, like so:
xml_element = root.find(
xml_namespace + "Model_Segment"
) # One of the elements in the XML example
keys = xml_element.keys()
for key in keys:
    print("key: " + key)
Output:
key: {http://www.w3.org/1999/xlink}href
key: {http://www.w3.org/1999/xlink}label
key: {http://www.w3.org/1999/xlink}role
key: {http://www.w3.org/1999/xlink}title
key: {http://www.w3.org/1999/xlink}type
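These keys are the attribute names in their namespace-qualified form, so the values can be read alongside them with `items()`. The following is a small sketch of that, using the `Template` element from the abbreviated XML in the question; the helper `local_name` is a hypothetical convenience for dropping the `{uri}` part and is not something ElementTree provides.

```python
import xml.etree.ElementTree as ET

# Abbreviated, well-formed copy of the XML from the question (illustrative only).
XML_TEXT = """<Workplace xmlns:xl="http://www.w3.org/1999/xlink">
  <Template xl:actuate="perFile" xl:href="../templates/opt/CMPRfile"
            xl:show="none" xl:title="CPP1" xl:type="verycomplicated"/>
</Workplace>"""

root = ET.fromstring(XML_TEXT)
template = root.find("Template")


def local_name(qualified):
    """Drop a leading '{uri}' from a qualified name, if present (hypothetical helper)."""
    return qualified.rsplit("}", 1)[-1]


# items() yields (qualified-key, value) pairs for every attribute.
values = {local_name(key): value for key, value in template.items()}

for name, value in values.items():
    print(name + ": " + value)
```

Dropping the namespace this way is only safe when no two attributes share a local name across different namespaces.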
I will also mark the answer from FordPerfect as the accepted answer, since it contains very useful information within the context of this question.
Edit: Forget what I wrote!
Have a look at this answer:
https://stackoverflow.com/a/14853417/10576322