Processing data from XML tags in Python

Question

I am trying to extract data from an XML document using Python.

The tool I'm currently trying, which seems like a stable choice, is lxml.

The issue I'm having is that the tutorials and questions I have come across all assume the XML document is formatted as follows:

<note> 
   <to>Tove</to> 
   <from>Jani</from> 
   <heading>Reminder</heading> 
   <body>Don't forget me this weekend!</body> 
</note>

That is, with the values as text inside the XML tags.

However, the document I am trying to extract from has its values inside attributes of the tags, like so:

<note> 
   <to id="16" name="Tove"/>
   <from id="341" name="Jani"/>
   <heading id="1" name="Reminder"/> 
   <body id="2" name="Don't forget me this weekend!"/> 
</note>

This is how I have tried to do it with lxml:

import lxml.etree

xml_file = lxml.etree.parse("test.xml")

notes = xml_file.xpath("//note")

for note in notes:
    note_id = note.find("id").text
    print(note_id)

This just returns "None".

I have now found that .text is what gets the data from inside the XML tags. However, I simply can't find out how to get the data from the elements shown above.

Could anyone point me in the right direction?

Answer 1

To access the attributes, use the element's attrib property:

import lxml.etree

xml_file = lxml.etree.parse("test.xml")

notes = xml_file.xpath("//note")

for note in notes:
    # Each child element exposes its attributes as a dict-like object
    print([dict(x.attrib) for x in note])
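
If only a single attribute is needed, the element's get() method may be more convenient than reading the whole attrib mapping. Below is a minimal sketch, assuming the corrected sample document from the question is inlined as a string instead of being read from test.xml; the variable names (SAMPLE, to_name, names, ids) are made up for illustration. The fallback import is only there so that the sketch also runs where lxml is not installed, since the standard library's ElementTree offers the same find/get calls.

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: the stdlib ElementTree is an acceptable stand-in for this sketch
    import xml.etree.ElementTree as etree

# Sample data copied from the question (inlined here instead of reading test.xml)
SAMPLE = """<note>
   <to id="16" name="Tove"/>
   <from id="341" name="Jani"/>
   <heading id="1" name="Reminder"/>
   <body id="2" name="Don't forget me this weekend!"/>
</note>"""

root = etree.fromstring(SAMPLE)  # root is the <note> element

# get() returns the value of one attribute, or None if it is absent
to_name = root.find("to").get("name")

# Map each child tag to its "name" attribute
names = {child.tag: child.get("name") for child in root}

# Collect the "id" attribute of every child, in document order
ids = [child.get("id") for child in root]

print(to_name)
print(names)
print(ids)
```

Note that find("id") in the question looks for a child element named id, which does not exist here; id is an attribute, so it has to be read from the element itself with get("id") or attrib["id"].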

More reading: http://lxml.de/tutorial.html#elements-carry-attributes-as-a-dict
