
Import XML namespace in Python

I'm a complete beginner at coding. I study IT and have a school project in which I must convert a .txt file into an XML file. I have managed to create a tree and subelements, but I must put an XML namespace in the code, because the resulting XML file has to be opened in a program that shows a table of the information, among other things. Without the schema from the XML namespace it won't open anything. Can someone help me with how to put an .xsd in my code?

This is the schema: http://www.pufbih.ba/images/stories/epp_docs/PaketniUvozObrazaca_V1_0.xsd

Example of the XML file I must create: http://www.pufbih.ba/images/stories/epp_docs/4200575050089_1022.xml

And in the first row I have the schema that I must input: "urn:PaketniUvozObrazaca_V1_0.xsd"

This is the code I have created so far:

import xml.etree.ElementTree as xml

def GenerateXML(GIP1022):
    root=xml.Element("PaketniUvozObrazaca")
    p1=xml.Element("PodaciOPoslodavcu")
    root.append(p1)

    jib=xml.SubElement(p1,"JIBPoslodavca")
    jib.text="4254160150005"
    pos=xml.SubElement(p1,"NazivPoslodavca")
    pos.text="MOJATVRTKA d.o.o. ORAŠJE"
    zah=xml.SubElement(p1,"BrojZahtjeva")
    zah.text="8"
    datz=xml.SubElement(p1,"DatumPodnosenja")
    datz.text="2021-01-01"

    tree=xml.ElementTree(root)
    with open(GIP1022,"wb") as files:
        tree.write(files)

if __name__=="__main__":
    GenerateXML("primjer.xml")

The official documentation is not very explicit about how one works with namespaces in ElementTree, but the core of it is that ElementTree takes a very fundamental(ist) approach: instead of manipulating namespace prefixes / aliases, ElementTree uses Clark's notation.

So e.g.

<bar xmlns="foo">

or

<x:bar xmlns:x="foo">

(the element bar in the foo namespace) would be written

{foo}bar

>>> tostring(Element('{foo}bar'), encoding='unicode')
'<ns0:bar xmlns:ns0="foo" />'

Alternatively (and sometimes more conveniently for authoring and manipulating) you can use QName objects, which can either take a tag name in Clark's notation, or take a namespace and a tag name separately:

>>> tostring(Element(QName('foo', 'bar')), encoding='unicode')
'<ns0:bar xmlns:ns0="foo" />'
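To make the equivalence concrete, here is a minimal sketch comparing the three spellings side by side. The namespace URI "foo" is the placeholder from the example above, and the variable names are only illustrative:

```python
from xml.etree.ElementTree import Element, QName, tostring

NS = "foo"  # placeholder namespace URI, as in the example above

clark = Element("{%s}bar" % NS)     # Clark's notation as a plain string
split = Element(QName(NS, "bar"))   # namespace and local name given separately
whole = Element(QName("{foo}bar"))  # QName wrapping a Clark's notation string

# QName stores the expanded name in its .text attribute
expanded = QName(NS, "bar").text

# All three elements serialise to the same markup
serialized = [tostring(e, encoding="unicode") for e in (clark, split, whole)]
```

Whichever form is used, ElementTree only ever sees the expanded `{uri}local` name; the prefix is invented at serialisation time.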

So while ElementTree doesn't have a namespace object per se, you can create namespaced elements like this, probably via a helper partially applying QName:

>>> from functools import partial
>>> ns = partial(QName, 'urn:PaketniUvozObrazaca_V1_0.xsd')
>>> root = Element(ns("PaketniUvozObrazaca"))
>>> SubElement(root, ns("PodaciOPoslodavcu"))
<Element <QName '{urn:PaketniUvozObrazaca_V1_0.xsd}PodaciOPoslodavcu'> at 0x7f502481bdb0>
>>> tostring(root, encoding='unicode')
'<ns0:PaketniUvozObrazaca xmlns:ns0="urn:PaketniUvozObrazaca_V1_0.xsd"><ns0:PodaciOPoslodavcu /></ns0:PaketniUvozObrazaca>'

Now there are a few important considerations here:

First, as you can see, the prefix chosen when serialising is arbitrary. This is in keeping with ElementTree's fundamentalist approach to XML (the prefix should not matter), but it has since grown a "register_namespace" global function which allows registering specific prefixes:

>>> register_namespace('xxx', 'urn:PaketniUvozObrazaca_V1_0.xsd')
>>> tostring(root, encoding='unicode')
'<xxx:PaketniUvozObrazaca xmlns:xxx="urn:PaketniUvozObrazaca_V1_0.xsd"><xxx:PodaciOPoslodavcu /></xxx:PaketniUvozObrazaca>'

You can also pass a single default_namespace to (some) serialization functions to specify the, well, default namespace:

>>> tostring(root, encoding='unicode', default_namespace='urn:PaketniUvozObrazaca_V1_0.xsd')
'<PaketniUvozObrazaca xmlns="urn:PaketniUvozObrazaca_V1_0.xsd"><PodaciOPoslodavcu /></PaketniUvozObrazaca>'

A second, possibly larger, issue is that ElementTree does not support validation.

The Python standard library does not provide support for any validating parser or tree builder, whether DTD, RELAX NG, XML Schema, or anything else. Not by default, and not optionally.

lxml is probably the main alternative supporting validation (against multiple types of schema). Its core API follows ElementTree but extends it in multiple ways and directions (including much more precise namespace prefix support, and prefix round-tripping). But even then the validation is (AFAIK) mostly explicit, at least when generating / serializing documents.
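As a rough illustration of what "explicit" validation looks like, here is a small sketch using lxml's XMLSchema class. It assumes lxml is installed (it is a third-party package), and the miniature schema, the "urn:example" namespace, and the is_valid helper are all made up for the example; they stand in for the real .xsd file.

```python
try:
    from lxml import etree  # third-party package: pip install lxml
except ImportError:
    etree = None

# Hypothetical miniature schema, standing in for the real .xsd file.
XSD = b"""<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="urn:example"
           elementFormDefault="qualified">
  <xs:element name="root">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="child" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""


def is_valid(xml_bytes, xsd_bytes=XSD):
    """Return True if xml_bytes validates against the schema in xsd_bytes."""
    if etree is None:
        raise RuntimeError("lxml is not installed")
    schema = etree.XMLSchema(etree.XML(xsd_bytes))
    # Validation only happens because we ask for it here
    return schema.validate(etree.XML(xml_bytes))
```

For instance, `is_valid(b'<root xmlns="urn:example"><child>x</child></root>')` should hold, while a `root` element without its `child` should not. Nothing is checked while the tree is being built or written; the schema is only consulted when `validate` is called.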

The tree.write() method takes a default_namespace argument.

What happens if you change that line to the following?

tree.write(files, default_namespace="urn:PaketniUvozObrazaca_V1_0.xsd")
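One caveat: with the code from the question as it stands, that call raises ValueError ("cannot use non-qualified names with default_namespace option"), because ElementTree requires every tag to be namespace-qualified once default_namespace is set. A minimal sketch of the question's function with qualified tags follows. The ns helper and the generate_xml name are hypothetical, and it assumes the target program expects the namespace as the default one (xmlns="...") on the root element:

```python
import xml.etree.ElementTree as ET

NS = "urn:PaketniUvozObrazaca_V1_0.xsd"


def ns(tag):
    """Return tag in Clark's notation, qualified with NS (hypothetical helper)."""
    return "{%s}%s" % (NS, tag)


def generate_xml(path):
    # Every element is created in the namespace, not just the root
    root = ET.Element(ns("PaketniUvozObrazaca"))
    p1 = ET.SubElement(root, ns("PodaciOPoslodavcu"))
    ET.SubElement(p1, ns("JIBPoslodavca")).text = "4254160150005"
    ET.SubElement(p1, ns("NazivPoslodavca")).text = "MOJATVRTKA d.o.o. ORAŠJE"
    ET.SubElement(p1, ns("BrojZahtjeva")).text = "8"
    ET.SubElement(p1, ns("DatumPodnosenja")).text = "2021-01-01"

    tree = ET.ElementTree(root)
    # default_namespace makes the serialiser emit xmlns="..." without a prefix
    tree.write(path, encoding="utf-8", xml_declaration=True,
               default_namespace=NS)
    return tree


if __name__ == "__main__":
    generate_xml("primjer.xml")
```

The written file then starts with an unprefixed `PaketniUvozObrazaca` root element carrying `xmlns="urn:PaketniUvozObrazaca_V1_0.xsd"`. This only declares the namespace; it does not check the document against the schema.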
