Python XML Parse with xml attributes

I have a file with many rows that contain XML, and I'm trying to write a Python script that goes through those rows and counts how many instances of a particular node attribute show up. For instance, my tree looks like:

<foo>
  <bar>
    <type name="controller">A</type>
    <type name="channel">12</type>
  </bar>
</foo>

I want to get the text of the element with name="controller". In the XML above, I need to receive "A" and not "controller".

I used xml.etree.ElementTree, but it shows me the value of the name attribute, which is "controller".

With xml.etree.ElementTree, use the text property of an Element to get the text inside the element.

Example:

import xml.etree.ElementTree as ET

x = ET.fromstring('<a>This is the text</a>')
print(x.text)  # This is the text

Assuming your file is input.xml, you can use the following piece of code:

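import xml.etree.ElementTree as ET

tree = ET.parse('input.xml')

# <bar> elements that are direct children of the root <foo>
for bar in tree.findall('bar'):
    # <type> elements directly under each <bar>
    for elem in bar.findall('type'):
        # .get() returns None instead of raising KeyError if the attribute is missing
        if elem.get('name') == 'controller':
            print(elem.text)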

ElementTree supports a limited subset of XPath (XPath is a language for specifying nodes in an XML file). We can use it to find all of your desired nodes, and then use the text attribute to get their content.

import xml.etree.ElementTree as ET

tree = ET.parse("filename.xml")

for x in tree.findall(".//type[@name='controller']"):
    print(x.text)

This will loop over all type elements whose name attribute is controller. In XPath, .// means all descendants of the current node, and the name type restricts the match to those whose tag is type. The bracket is a predicate expression, which keeps only the nodes satisfying a condition. @name means the name attribute. Thus this expression selects all type nodes (no matter how deep) with a name attribute equal to controller.
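To try the expression without a file on disk, here is a minimal sketch that parses the sample XML from the question out of a string. The variable names (XML, controller_texts) are only illustrative.

```python
import xml.etree.ElementTree as ET

# Sample document copied from the question.
XML = """<foo>
  <bar>
    <type name="controller">A</type>
    <type name="channel">12</type>
  </bar>
</foo>"""

root = ET.fromstring(XML)

# .//type               -> every <type> element at any depth below the root
# [@name='controller']  -> keep only those whose name attribute is "controller"
controller_texts = [elem.text for elem in root.findall(".//type[@name='controller']")]

print(controller_texts)  # ['A']
```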

In this example, I have just printed the text of each node. You can do whatever you want in the body of that loop.
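Since the question is about counting matches across many rows, the loop body could just as well update a counter. The following is a rough sketch, assuming each row of the file is a complete XML document on its own; the rows list is made-up data standing in for the lines of the real file.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical rows: each string stands in for one line of the input file.
rows = [
    '<foo><bar><type name="controller">A</type><type name="channel">12</type></bar></foo>',
    '<foo><bar><type name="controller">B</type><type name="channel">7</type></bar></foo>',
    '<foo><bar><type name="controller">A</type></bar></foo>',
]

counts = Counter()
for row in rows:
    root = ET.fromstring(row)  # parse one row as its own XML document
    for elem in root.findall(".//type[@name='controller']"):
        counts[elem.text] += 1  # tally by the element's text, e.g. "A"

print(counts)
```

With a real file, the same loop could iterate over the open file object instead of the rows list, provided every line really is well-formed XML.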

If you want all nodes with that attribute, not just the type nodes, replace the argument to the findall function with:

.//*[@name='controller']

The * matches any element node.
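As a small illustration of the wildcard, the sketch below adds a device element to the sample XML. That element is not part of the question; it is there only to show a non-type element carrying the same attribute.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: <device> is an invented element, added only
# to demonstrate that * matches tags other than <type>.
XML = """<foo>
  <bar>
    <type name="controller">A</type>
    <device name="controller">B</device>
    <type name="channel">12</type>
  </bar>
</foo>"""

root = ET.fromstring(XML)

# Any element, at any depth, whose name attribute equals "controller".
matches = [(elem.tag, elem.text) for elem in root.findall(".//*[@name='controller']")]

print(matches)  # [('type', 'A'), ('device', 'B')]
```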
