
Find all occurrences of inner text in an XML file with etree

I am new to Python and to tree structures in general, and have run into some problems.

I have a dataset structured as follows:

<?xml version="1.0" encoding="UTF-8"?>
<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <node id="node1">
    <data key="label">node1</data>
    <data key="degree">6</data>
  </node>
  <node id="node2">
    <data key="label">node2</data>
    <data key="degree">32</data>
  </node>
  <node id="node3">
    <data key="label">node3</data>
    <data key="degree">25</data>
  </node>
</graphml>

I want to reach and print the inner text of all the <data key="label"> elements using etree. In other words, I want the following result:

"node1"
"node2"
"node3"

I have tried itertext() ( https://docs.python.org/2/library/xml.etree.elementtree.html#xml.etree.ElementTree.Element.itertext ) with no luck, as well as several faulty XPath expressions.

I'm sure there is a simple solution to this; I hope you can help!

You probably forgot the namespace. Try something like this:

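import xml.etree.ElementTree as ET

# countrydata is assumed to hold the XML document as a string
root = ET.fromstring(countrydata)

# Map a prefix of your choice to the document's default namespace
ns = {'graphml': 'http://graphml.graphdrawing.org/xmlns'}

# Prints the id attribute of every <node>, which here equals its label
for element in root.findall(".//graphml:node[@id]", ns):
    print(element.attrib['id'])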
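The snippet above prints each node's id attribute, which in this dataset happens to match the label. If the text of the <data key="label"> elements themselves is wanted, the same namespace mapping can be combined with an attribute predicate. The following is only a sketch under a few assumptions: the sample document is inlined without its XML declaration, and the names GRAPHML, NS and label_texts are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Sample document from the question (XML declaration omitted for brevity)
GRAPHML = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <node id="node1">
    <data key="label">node1</data>
    <data key="degree">6</data>
  </node>
  <node id="node2">
    <data key="label">node2</data>
    <data key="degree">32</data>
  </node>
  <node id="node3">
    <data key="label">node3</data>
    <data key="degree">25</data>
  </node>
</graphml>"""

# The prefix "g" is arbitrary; only the URI has to match the document
NS = {"g": "http://graphml.graphdrawing.org/xmlns"}


def label_texts(xml_text):
    """Return the text of every <data key="label"> element (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.findall(".//g:data[@key='label']", NS)]


for label in label_texts(GRAPHML):
    print(label)
```

The prefix used in the path does not need to match anything in the file, because the default namespace in the document has no prefix at all; ElementTree only compares the URI.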

This does the job on Python 2.7:

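import xml.etree.ElementTree as ET

# data is assumed to hold the XML document as a string
root = ET.fromstring(data)

# "*" matches any tag, so no namespace mapping is needed here
elts = root.findall('.//*[@key="label"]')
for e in elts:
    print(e.text)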
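The wildcard works because "*" matches elements regardless of their namespace, but it would also match any other element that carries key="label". If only <data> elements should be considered, one alternative is to spell out the namespace-qualified tag name in ElementTree's {uri}tag form and filter on the attribute. This is a rough sketch rather than the answerer's method; SAMPLE, NS_URI and labels_via_iter are hypothetical names, and the sample is a shortened copy of the question's data.

```python
import xml.etree.ElementTree as ET

NS_URI = "http://graphml.graphdrawing.org/xmlns"

# Shortened copy of the question's document, used only for illustration
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <node id="node1">
    <data key="label">node1</data>
    <data key="degree">6</data>
  </node>
  <node id="node2">
    <data key="label">node2</data>
    <data key="degree">32</data>
  </node>
</graphml>"""


def labels_via_iter(xml_text):
    """Collect label texts by iterating over namespace-qualified <data> tags."""
    root = ET.fromstring(xml_text)
    data_tag = "{%s}data" % NS_URI  # ElementTree stores tags as {uri}localname
    return [el.text for el in root.iter(data_tag) if el.get("key") == "label"]


print(labels_via_iter(SAMPLE))
```

Printing root.tag on a parsed GraphML document shows the same {uri}tag form, which is a quick way to check whether a missing namespace is the reason a search returns nothing.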
