Removing XML tags with lxml

Question

I want to strip the XML tags from the file "new.xml" and lay out the data according to my print statements.

I have tried:

    from lxml import etree

    tree = etree.parse("C:\\Users\\name\\Desktop\\new.xml")
    root = tree.getroot()      
    for text in root.iter():
      print text.text

The XML is:

<connection>
<rhel>

<runscript>y</runscript>
<username>useranme</username>
<password>passw</password>
<store>None</store>
<port>2</port>
<host>192.168.73.56</host>
<logdirectory>logs</logdirectory>
</rhel>

</connection>

I get the following output:

yes
username
passw
None
2
192.168.73.56
logs

But I would like it printed as:

is it a new connection: yes
username: username
password: passw
value: none
connections: 2
host: 192.168.73.56
log dir : logs

Answer 1

You need to parse the file according to its XML structure. To do that, loop over the children and look at each child's tag name and text.

from lxml import etree

tree = etree.parse("test.xml")
root = tree.getroot()

connections = []
for node in root.findall('rhel'):  # every 'rhel' node that is a child of the root 'connection' node
    connections.append({info.tag: info.text for info in node})  # build a dict with (tag, text) as (key, value) pairs

print connections

for conn in connections:
    print '='*20
    print """is it a new connection: {runscript}
username: {username}
password: {password}
value: {store}
connections: {port}
host: {host}
log dir : {logdirectory}""".format(**conn)
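
As a variation on the code above, here is a minimal, self-contained sketch that parses the question's XML from a string and prints one labelled line per field. The `LABELS` table and the `describe` helper are hypothetical names introduced only for this illustration, and the fallback to the standard library's `xml.etree.ElementTree` is an assumption made so the sketch also runs where lxml is not installed.

```python
from __future__ import print_function

try:
    from lxml import etree  # the library used in this thread
except ImportError:
    # Assumption: the stdlib module offers the same API for the calls used here.
    import xml.etree.ElementTree as etree

# Sample document copied from the question (parsed from a string, not a file).
XML = """<connection>
<rhel>
<runscript>y</runscript>
<username>useranme</username>
<password>passw</password>
<store>None</store>
<port>2</port>
<host>192.168.73.56</host>
<logdirectory>logs</logdirectory>
</rhel>
</connection>"""

# Hypothetical (tag, label) table; a list keeps the output order explicit.
LABELS = [
    ("runscript", "is it a new connection"),
    ("username", "username"),
    ("password", "password"),
    ("store", "value"),
    ("port", "connections"),
    ("host", "host"),
    ("logdirectory", "log dir"),
]


def describe(node):
    """Return one 'label: text' line per known child tag of node."""
    values = dict((child.tag, child.text) for child in node)
    return ["%s: %s" % (label, values.get(tag)) for tag, label in LABELS]


root = etree.fromstring(XML)
for rhel in root.findall("rhel"):
    print("\n".join(describe(rhel)))
```

Keeping the labels in a table rather than in the format string means the output order no longer depends on the order of the tags in the file.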

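You could also try repr(root); you would get something like what you are printing now. It is not recommended, though, for several reasons:

There is no guarantee that the output order stays the same as it is now.
It does not reflect the structure of the XML file.
There are many blank lines, and that is to be expected.
That is not the way to parse XML :)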
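
The blank lines mentioned above can be seen in a small sketch. It uses a shortened copy of the question's document, and the stdlib fallback import is an assumption made so it runs without lxml. Iterating over every element also visits the container elements, whose text is only the whitespace between tags.

```python
from __future__ import print_function

try:
    from lxml import etree
except ImportError:
    # Assumption: stdlib fallback with the same API for these calls.
    import xml.etree.ElementTree as etree

# Shortened version of the question's document.
XML = """<connection>
<rhel>
<runscript>y</runscript>
<port>2</port>
</rhel>
</connection>"""

root = etree.fromstring(XML)

# iter() yields the root and <rhel> too; their .text is only the whitespace
# between tags, which is what shows up as blank lines when printed.
all_texts = [(el.tag, el.text) for el in root.iter()]

# Keep only the elements whose text has real content.
leaf_texts = [(tag, text) for tag, text in all_texts if text and text.strip()]

for tag, text in leaf_texts:
    print("%s: %s" % (tag, text))
```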

Hope this helps.

Update:

On Python < 2.7, you can use connections.append(dict((info.tag, info.text) for info in node)) in place of the dict-comprehension line. I guess that notation was not supported before 2.7.

Or, as a last resort, you can do this:

c = {}
for info in node:
    c[info.tag] = info.text
connections.append(c)
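
To check that the three ways of building the dictionary agree, here is a small sketch on a hypothetical two-child `rhel` node; the stdlib fallback import is an assumption so that it runs without lxml.

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: stdlib fallback with the same API for these calls.
    import xml.etree.ElementTree as etree

# Hypothetical trimmed-down node, built from a string for the example.
node = etree.fromstring("<rhel><port>2</port><host>192.168.73.56</host></rhel>")

# 1. Dict comprehension (needs Python 2.7 or later).
by_comprehension = {info.tag: info.text for info in node}

# 2. dict() over a generator of (tag, text) pairs.
by_generator = dict((info.tag, info.text) for info in node)

# 3. Explicit loop.
by_loop = {}
for info in node:
    by_loop[info.tag] = info.text

assert by_comprehension == by_generator == by_loop
```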

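Also, str.format() with named fields was added in Python 2.6, so the call above should work there. On anything older, or if it gives you trouble, replace it with old-style string formatting: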

    print """is it a new connection: %(runscript)s
username: %(username)s
password: %(password)s
value: %(store)s
connections: %(port)s
host: %(host)s
log dir : %(logdirectory)s""" % conn
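
The two formatting styles can be compared side by side. This is a minimal sketch on a hypothetical, trimmed-down `conn` dictionary rather than the full record used above.

```python
# Hypothetical trimmed-down record, standing in for one entry of `connections`.
conn = {"host": "192.168.73.56", "port": "2"}

# New-style formatting: named fields filled from keyword arguments.
new_style = "connections: {port}\nhost: {host}".format(**conn)

# Old-style formatting: named fields filled from a mapping.
old_style = "connections: %(port)s\nhost: %(host)s" % conn

assert new_style == old_style
```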

Answer 1 posted: 2012-12-31 07:50:26
