Is there an easy way to manipulate XML documents in Python?

I have done a little research on the matter, but haven't been able to come up with anything useful. What I need is not just to parse and read, but to actually manipulate XML documents in Python, similar to the way JavaScript is able to manipulate HTML documents.

Allow me to give an example. Say I have the following XML document:

<library>
    <book id="123">
        <title>Intro to XML</title>
        <author>John Smith</author>
        <year>1996</year>
    </book>
    <book id="456">
        <title>XML 101</title>
        <author>Bill Jones</author>
        <year>2000</year>
    </book>
    <book id="789">
        <title>This Book is Unrelated to XML</title>
        <author>Justin Tyme</author>
        <year>2006</year>
    </book>
</library>

I need a way both to retrieve an element, either using XPath or with a "pythonic" method, as outlined here, and to manipulate the document, for example:

>>> xml.getElement('id=123').title = "Intro to XML v2"
>>> xml.getElement('id=123').year = "1998"

If anyone is aware of such a tool in Python, please let me know. Thanks!

If you want to avoid installing lxml, you can use xml.etree.ElementTree from the standard library.

Here is Acorn's answer ported to xml.etree:

import xml.etree.ElementTree as et  # was: import lxml.etree as et

xmltext = """
<root>
    <fruit>apple</fruit>
    <fruit>pear</fruit>
    <fruit>mango</fruit>
    <fruit>kiwi</fruit>
</root>
"""

tree = et.fromstring(xmltext)

for fruit in tree.findall('fruit'): # was: tree.xpath('//fruit')
    fruit.text = 'rotten %s' % (fruit.text,)

print(et.tostring(tree, encoding="unicode"))  # removed argument: pretty_print

Note: I would have posted this as a comment on Acorn's answer if I could have done so clearly. If you like this answer, give the upvote to Acorn.

lxml lets you select elements using XPath and also manipulate those elements.
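
Applied to the library document from the question, the same standard-library approach might look like the sketch below. It assumes the attribute values are quoted (id="123") so the document is well-formed, and set_book_field is a hypothetical helper written for this illustration, not part of xml.etree. The lookup relies on the limited XPath subset that ElementTree's find() supports, including [@attrib='value'] predicates.

```python
import xml.etree.ElementTree as ET

# The question's document, shortened, with attribute values quoted
# so that it parses as well-formed XML.
LIBRARY_XML = """
<library>
    <book id="123">
        <title>Intro to XML</title>
        <author>John Smith</author>
        <year>1996</year>
    </book>
    <book id="456">
        <title>XML 101</title>
        <author>Bill Jones</author>
        <year>2000</year>
    </book>
</library>
"""


def set_book_field(root, book_id, field, value):
    """Hypothetical helper: set the text of one child of the book with this id."""
    book = root.find(f"book[@id='{book_id}']")
    if book is None:
        raise KeyError(f"no book with id {book_id!r}")
    child = book.find(field)
    if child is None:
        # Create the child element if the book does not have it yet.
        child = ET.SubElement(book, field)
    child.text = value


root = ET.fromstring(LIBRARY_XML)
set_book_field(root, "123", "title", "Intro to XML v2")
set_book_field(root, "123", "year", "1998")

# encoding="unicode" makes tostring() return str instead of bytes.
updated_xml = ET.tostring(root, encoding="unicode")
print(updated_xml)
```

The edits happen in place on the parsed tree, so serializing the root afterwards yields the whole document with only the targeted book changed.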

import lxml.etree as et

xmltext = """
<root>
    <fruit>apple</fruit>
    <fruit>pear</fruit>
    <fruit>mango</fruit>
    <fruit>kiwi</fruit>
</root>
"""

tree = et.fromstring(xmltext)

for fruit in tree.xpath('//fruit'):
    fruit.text = 'rotten %s' % (fruit.text,)

print(et.tostring(tree, pretty_print=True, encoding="unicode"))

Result:

<root>
    <fruit>rotten apple</fruit>
    <fruit>rotten pear</fruit>
    <fruit>rotten mango</fruit>
    <fruit>rotten kiwi</fruit>
</root>
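
Changing text is only one kind of manipulation. Elements can also be added and removed in place, which is closer to the DOM-style editing the question describes. The following is a minimal sketch using the standard library's xml.etree.ElementTree so that it runs without extra installs; lxml.etree offers the same SubElement, set, and remove calls. The extra banana element, its ripe attribute, and the choice to drop kiwi are illustrative assumptions only.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<root><fruit>apple</fruit><fruit>pear</fruit><fruit>kiwi</fruit></root>"
)

# Append a new child element to <root> and give it text and an attribute.
banana = ET.SubElement(root, "fruit")
banana.text = "banana"
banana.set("ripe", "yes")

# findall() returns a list, so it is safe to remove children while looping.
for fruit in root.findall("fruit"):
    if fruit.text == "kiwi":
        root.remove(fruit)

fruit_names = [fruit.text for fruit in root.findall("fruit")]
print(fruit_names)
print(ET.tostring(root, encoding="unicode"))
```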
