简体   繁体   中英

What is the modern way to to do XML transforms in Python?

I need to do bi-direction transforms between some significantly complex XML and flat-file formats in Python. I'm out of date and don't know how people are solving this problem in the far off future year of 2011.

I've got back up to date with the various Python XML libraries, but it's been 8 years since my last time in XSLT hell and I was amazed after googling around that that's still common.

So how do you go about doing complex XML data transforms?

I'd like to do this in Python because the documents are not direct mappings and there is some processing and calculation required. But I'd still like to pass as much off to a rule engine as possible.

Edit : To be clear I'm interested in techniques more so than specific libraries or tools, but please to post those too. I'm trying hard to avoid the word pattern here, but surely this is a common problem.

Edit 2 : I still don't think there were any good answers on general techniques, but the original problem I had was solved using the Bots EDI framework for doing document translations. It is quite heavily focussed on EDI but can be used for generic translation. It was a heavy-weight solution though.

Although it's only useful for writing XML, XMLwitch is freakin' amazing. For doing non-XML to XML transformations, I highly recommend it!

For python, here is a comprehensive list of available XML libs/modules:

http://wiki.python.org/moin/PythonXml

If you are looking for something simpler than XSLT, XMLStarlet is a set of command line tools which may be of interest for you:

http://xmlstar.sourceforge.net/

As any command line tool, this is not especially made for Python, but can be easily integrated into a python script.

虽然它也只是一个库 - 或者说是一个框架, 但是inxs是一种通过Python进行基于规则的转换来避免XSLT地狱的方法。

Hi Dimitre I use lxml to manipulate xml in python.

I have posted some techniques for managing large amounts of xml schemas, namespaces etc.. Automatic XSD validation

One suggestion is to try to use the full xpath when possible.

For example if i had a complex type:

<Person>
 <name/>
 <age/>
</person>

i could run into the problem if i dont use /Person/name and later that complex type changes to:

<person>
 <name/>
 <age/>
 <child>
   <son>
     <name/>
     <age/>
   </son>
</child>
</person>

The reason being now 'name' exists in multiple places.

Also be aware of schemas that allow for many "persons" in this example. you may need to provide a "key" with your xpath to determine which person you are referencing. you could have 5 or 6 Persons in your xml the xpaths would be identical but the names uniqe, the name would then be your key to use to reference each particular person.

I would also suggest writing your own wrapper functions around lxml that suit your needs. What i did was I created an xmlUtil.py file that contained xml generic functions that i needed. I then created a myXML.py file that had assumptions about my specific xml and behavior. the xmlUtil.py functions only accept the xml content ( this is in case i decide to use something instead of lxml it will be easy to change).

Hope some of this helps. wish i could be more helpful but question is very open ended.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM