简体   繁体   中英

Import XML namespace in python

I'm a total noob in coding, I study IT, and have a school project in which I must convert a .txt file in a XML file. I have managed to create a tree, and subelements, but a must put some XML namespace in the code. Because the XML file in the end must been opened in a program that gives you a table of the informations, and something more. But without the scheme from the XML namespace it won't open anything. Can someone help me in how to put a .xsd in my code?

This is the scheme: http://www.pufbih.ba/images/stories/epp_docs/PaketniUvozObrazaca_V1_0.xsd

Example of XML file a must create: http://www.pufbih.ba/images/stories/epp_docs/4200575050089_1022.xml

And in the first row a have the scheme that I must input: "urn:PaketniUvozObrazaca_V1_0.xsd"

This is the code a created so far:

import xml.etree.ElementTree as xml

def GenerateXML(GIP1022):
root=xml.Element("PaketniUvozObrazaca")
p1=xml.Element("PodaciOPoslodavcu")
root.append(p1)

jib=xml.SubElement(p1,"JIBPoslodavca")
jib.text="4254160150005"
pos=xml.SubElement(p1,"NazivPoslodavca")
pos.text="MOJATVRTKA d.o.o. ORAŠJE"
zah=xml.SubElement(p1,"BrojZahtjeva")
zah.text="8"
datz=xml.SubElement(p1,"DatumPodnosenja")
datz.text="2021-01-01"

tree=xml.ElementTree(root)
with open(GIP1022,"wb") as files:
    tree.write(files)

if __name__=="__main__":
GenerateXML("primjer.xml")

The official documentation is not super explicit as to how one works with namespaces in ElementTree, but the core of it is that ElementTree takes a very fundamental(ist) approach: instead of manipulating namespace prefixes / aliases, elementtree uses Clark's Notation .

So eg

<bar xmlns="foo">

or

<x:bar xmlns:x="foo">

(the element bar in the foo namespace) would be written

{foo}bar
>>> tostring(Element('{foo}bar'), encoding='unicode')
'<ns0:bar xmlns:ns0="foo" />'

alternatively (and sometimes more conveniently for authoring and manipulating) you can use QName objects which can either take a Clark's notation tag name, or separately take a namespace and a tag name:

>>> tostring(Element(QName('foo', 'bar')), encoding='unicode')
'<ns0:bar xmlns:ns0="foo" />'

So while ElementTree doesn't have a namespace object per-se you can create namespaced object like this, probably via a helper partially applying QName:

>>> root = Element(ns("PaketniUvozObrazaca"))
>>> SubElement(root, ns("PodaciOPoslodavcu"))
<Element <QName '{urn:PaketniUvozObrazaca_V1_0.xsd}PodaciOPoslodavcu'> at 0x7f502481bdb0>
>>> tostring(root, encoding='unicode')
'<ns0:PaketniUvozObrazaca xmlns:ns0="urn:PaketniUvozObrazaca_V1_0.xsd"><ns0:PodaciOPoslodavcu /></ns0:PaketniUvozObrazaca>'

Now there are a few important considerations here:

First, as you can see the prefix when serialising is arbitrary, this is in keeping with ElementTree's fundamentalist approach to XML (the prefix should not matter), but it has since grown a "register_namespace" global function which allows registering specific prefixes:

>>> register_namespace('xxx', 'urn:PaketniUvozObrazaca_V1_0.xsd')
>>> tostring(root, encoding='unicode')
'<xxx:PaketniUvozObrazaca xmlns:xxx="urn:PaketniUvozObrazaca_V1_0.xsd"><xxx:PodaciOPoslodavcu /></xxx:PaketniUvozObrazaca>'

you can also pass a single default_namespace to (some) serialization function to specify the, well, default namespace:

>>> tostring(root, encoding='unicode', default_namespace='urn:PaketniUvozObrazaca_V1_0.xsd')
'<PaketniUvozObrazaca xmlns="urn:PaketniUvozObrazaca_V1_0.xsd"><PodaciOPoslodavcu /></PaketniUvozObrazaca>'

A second, possibly larger, issue is that ElementTree does not support validation .

The Python standard library does not provide support for any validating parser or tree builder, whether DTD, rng, xml schema, anything. Not by default, and not optionally.

lxml is probably the main alternative supporting validation (of multiple types of schema), its core API follows ElementTree but extends it in multiple ways and directions (including much more precise namespace prefix support, and prefix round-tripping). But even then the validation is (AFAIK) mostly explicit, at least when generating / serializing documents.

The tree.write() method takes a default_namespace argument.

What happens if you change that line to the following?

tree.write(files, default_namespace="urn:PaketniUvozObrazaca_V1_0.xsd")

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM