Python: Read and write namespaced XML using ElementTree

This XML file is named example.xml:

<?xml version="1.0"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">

  <modelVersion>4.0.0</modelVersion>
  <groupId>com.foobar.flubber</groupId>
  <artifactId>uberportalconf</artifactId>
  <version>13-SNAPSHOT</version>
  <packaging>pom</packaging>
  <name>Environment for UberPortalConf</name>
  <description>This is the description</description>    
  <properties>
      <birduberportal.version>11</birduberportal.version>
      <promotiondevice.version>9</promotiondevice.version>
      <foobarportal.version>6</foobarportal.version>
      <eventuberdevice.version>2</eventuberdevice.version>
  </properties>
  <!-- A lot more here, but as it is irrelevant for the problem I have removed it -->
</project>

If I load the example.xml file above using ElementTree and print the root node:

>>> from xml.etree import ElementTree
>>> tree = ElementTree.parse('example.xml')
>>> print(tree.getroot())
<Element '{http://maven.apache.org/POM/4.0.0}project' at 0x26ee0f0>

I see that the element's tag also contains the namespace http://maven.apache.org/POM/4.0.0.

How do I:

  1. Get the foobarportal.version text, increase it by one, and write the XML file back, keeping the namespace the document had when loaded and without changing the overall XML layout.
  2. Get it to load with any namespace, not just http://maven.apache.org/POM/4.0.0. I still don't want to strip the namespace, as I want the XML to stay the same except for the change to foobarportal.version described in 1 above.

My current approach is not XML-aware, but it fulfills 1 and 2 above:

  1. Grep for <foobarportal.version>(.*)</foobarportal.version>
  2. Take the contents of the match group and increase it by one.
  3. Write it back.
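
The three steps above can be sketched roughly as follows. This is only an illustration of the text-based approach, not the exact script in use: the helper name bump_version_text and the inline sample string are made up for the example, and the pattern assumes the version is a plain integer (hence \d+ rather than .*).

```python
import re

# Capture the opening tag, the number, and the closing tag separately so that
# only the number changes and the surrounding text is left untouched.
VERSION_RE = re.compile(
    r"(<foobarportal\.version>)(\d+)(</foobarportal\.version>)"
)


def bump_version_text(xml_text):
    """Return xml_text with the first foobarportal.version number increased by one."""
    def _increment(match):
        return match.group(1) + str(int(match.group(2)) + 1) + match.group(3)

    return VERSION_RE.sub(_increment, xml_text, count=1)


# Hypothetical inline sample standing in for the contents of example.xml.
sample = "<properties>\n  <foobarportal.version>6</foobarportal.version>\n</properties>"
print(bump_version_text(sample))  # the middle line now contains 7 instead of 6
```

Because the text is never parsed, the namespace declarations and layout survive unchanged, which is exactly why this approach satisfies both requirements while knowing nothing about XML.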

It would be nice to have an XML-aware solution, as it would be more robust. ElementTree's XML namespace handling makes this more complicated.
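
To make the complication concrete, here is a minimal sketch, assuming Python 3 and a small inline document (the doc string below is a stand-in, not example.xml). Without any namespace registration, ElementTree stores the namespace inside each tag and, on output, replaces the default namespace declaration with a generated prefix such as ns0.

```python
from xml.etree import ElementTree

POM_NS = "http://maven.apache.org/POM/4.0.0"

# Stand-in document that uses a default namespace, like example.xml does.
doc = '<project xmlns="%s"><name>demo</name></project>' % POM_NS

root = ElementTree.fromstring(doc)
print(root.tag)  # {http://maven.apache.org/POM/4.0.0}project

# Serializing without registering the namespace first: the elements come back
# with a generated prefix (ns0:project, ns0:name) instead of xmlns="...".
default_output = ElementTree.tostring(root, encoding="unicode")
print(default_output)
```

The document is still equivalent XML, but it no longer looks like the file that was loaded.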

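If your question is simply "how do I search by namespaced element name", then the answer is that lxml understands the {namespace} syntax (as does xml.etree.ElementTree), so you can do: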

tree.getroot().find('{http://maven.apache.org/POM/4.0.0}properties')
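
For the full round trip asked about in the question, a minimal standard-library sketch might look like the following. The helper name bump_property is hypothetical, and the sketch makes several assumptions: Python 3.8 or later (for insert_comments), an integer value in the target element, and a document whose elements all share the root element's namespace. The namespace is read from the root tag rather than hard-coded, and it is registered with an empty prefix so that it is written back as xmlns="..." instead of ns0.

```python
from xml.etree import ElementTree


def bump_property(path, prop_name="foobarportal.version"):
    """Increase the integer text of properties/<prop_name> in the file by one.

    Hypothetical helper for illustration. Returns the new text.
    """
    # Keep XML comments when parsing (insert_comments needs Python 3.8+).
    parser = ElementTree.XMLParser(
        target=ElementTree.TreeBuilder(insert_comments=True)
    )
    tree = ElementTree.parse(path, parser=parser)
    root = tree.getroot()

    # Tags look like '{namespace}local'; take the namespace from the root tag
    # so the code works for any namespace, or for none at all.
    namespace = root.tag[1:].split("}", 1)[0] if root.tag.startswith("{") else ""
    prefix = "{%s}" % namespace if namespace else ""
    if namespace:
        # Serialize this namespace as the default one rather than as ns0.
        # Note that this registration is global to the ElementTree module.
        ElementTree.register_namespace("", namespace)

    node = root.find("%sproperties/%s%s" % (prefix, prefix, prop_name))
    if node is None:
        raise LookupError("no properties/%s element in %s" % (prop_name, path))

    node.text = str(int(node.text) + 1)
    tree.write(path, encoding="utf-8", xml_declaration=True)
    return node.text
```

Calling bump_property('example.xml') on the file from the question should change foobarportal.version from 6 to 7. The output is not guaranteed to be byte-for-byte identical to the input: ElementTree rewrites the XML declaration and may normalize details such as attribute quoting, so the regex approach remains the only one here that leaves every other byte untouched.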
