
How to write an XML file without header in Python?

When using Python's stock XML tools such as xml.dom.minidom for XML writing, a file always starts off like this:

<?xml version="1.0"?>

[...]

While this is perfectly legal XML code, and it's even recommended to use the header, I'd like to get rid of it, as one of the programs I'm working with has problems with it.

I can't seem to find the appropriate option in xml.dom.minidom, so I wondered if there are other packages which do allow omitting the header.

Cheers,

Nico

Unfortunately minidom does not give you the option to omit the XML Declaration.

But you can always serialise the document content yourself by calling toxml() on the document's root element instead of the document. Then you won't get an XML Declaration:

xml= document.documentElement.toxml('utf-8')

...but then you also wouldn't get anything else outside the root element, such as the DOCTYPE, or any comments or processing instructions. If you need them, serialise each child of the document object one by one:

xml = b'\n'.join(node.toxml('utf-8') for node in document.childNodes)  # toxml('utf-8') returns bytes on Python 3
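A minimal, self-contained sketch of both variants, assuming Python 3 (where toxml('utf-8') returns bytes). The sample document is invented here purely for illustration:

```python
from xml.dom import minidom

# Hypothetical sample input: a comment precedes the root element.
document = minidom.parseString('<!-- generated --><root><item/></root>')

# Root element only: no declaration, but the leading comment is lost too.
root_only = document.documentElement.toxml('utf-8')

# Every top-level node: the comment survives, and there is still no declaration.
all_nodes = b'\n'.join(node.toxml('utf-8') for node in document.childNodes)

print(root_only.decode('utf-8'))
print(all_nodes.decode('utf-8'))
```

The second form is the one to reach for when anything meaningful sits outside the root element.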

I wondered if there are other packages which do allow to neglect the header.

DOM Level 3 LS defines an xml-declaration config parameter you can use to suppress it. The only Python implementation I know of is pxdom, which is thorough on standards support, but not at all fast.

If you want to use minidom and maintain 'prettiness', how about this as a quick/hacky fix:

xml_without_declaration.py:
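One further option, not named in the answer above, may be the standard library's xml.etree.ElementTree: its tostring() function emits no declaration when called with encoding='unicode'. A minimal sketch, with element names chosen only for illustration:

```python
import xml.etree.ElementTree as ET

# Build a tiny tree: <A><B /></A>
root = ET.Element('A')
ET.SubElement(root, 'B')

# encoding='unicode' returns a str and writes no XML declaration.
text = ET.tostring(root, encoding='unicode')
print(text)
```

Whether this suits a given project depends on how much of the DOM API the existing code relies on; ElementTree has a different object model from minidom.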

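import xml.dom.minidom as minidom

doc = minidom.Document()

# Serialising the still-empty document yields only the XML declaration.
declaration = doc.toxml()

a = doc.createElement("A")
doc.appendChild(a)
b = doc.createElement("B")
a.appendChild(b)

# Slice the declaration off the front of the pretty-printed output.
output = doc.toprettyxml()[len(declaration):]

print(output)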

The header is printed by the Document. If you print the node directly, it won't print the header.

root = doc.childNodes[0]
root.toprettyxml(encoding="utf-8")
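For a runnable comparison, here is a small sketch assuming Python 3 and an invented two-element document. Note that passing encoding="utf-8" makes toprettyxml() return bytes rather than str:

```python
import xml.dom.minidom as minidom

doc = minidom.Document()
a = doc.createElement("A")
doc.appendChild(a)
a.appendChild(doc.createElement("B"))

# Serialising the Document includes the declaration...
with_header = doc.toprettyxml()

# ...serialising its first child (the root element) does not.
root = doc.childNodes[0]
without_header = root.toprettyxml(encoding="utf-8")  # bytes

print(without_header.decode("utf-8"))
```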

Just replace the first line with blank:

import xml.dom.minidom as MD
# xml_string: output of doc.toprettyxml(), where the declaration sits on its own line
xml_string.replace(MD.Document().toxml() + '\n', '')

Purists may not like to hear this, but I have found using an XML parser to generate XML to be overkill. Just generate it directly as strings. This also lets you generate files larger than you can keep in memory, which you can't do with DOM. Reading XML is another story.
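A rough sketch of what direct string generation could look like, assuming Python 3. The function name, element names, and sample records are all made up for illustration. The one thing that should not be skipped is escaping, for which the standard library's xml.sax.saxutils provides escape() and quoteattr():

```python
import io
from xml.sax.saxutils import escape, quoteattr


def write_records(out, records):
    """Stream <record> elements to `out` one at a time (hypothetical helper)."""
    out.write('<records>\n')
    for rec_id, text in records:
        # quoteattr() adds the surrounding quotes and escapes the value;
        # escape() handles &, < and > in character data.
        out.write('  <record id=%s>%s</record>\n'
                  % (quoteattr(str(rec_id)), escape(text)))
    out.write('</records>\n')


buf = io.StringIO()
write_records(buf, [(1, 'fish & chips'), (2, 'a < b')])
print(buf.getvalue())
```

Because each record is written as soon as it is produced, `records` can be a generator and `out` a real file, so the whole document never has to sit in memory.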

If you are set on using minidom, just scan back over the file and remove the first line after writing all the XML you need.
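One way this could look, as a sketch assuming Python 3. The sample document and the temporary output path are invented for the demo. Passing newl='\n' to writexml() is what puts the declaration on a line of its own:

```python
import os
import tempfile
from xml.dom import minidom

doc = minidom.parseString('<A><B/></A>')

# Hypothetical output location, created in a temp directory for the demo.
path = os.path.join(tempfile.mkdtemp(), 'out.xml')

# Write the document; the declaration becomes the first line of the file.
with open(path, 'w', encoding='utf-8') as f:
    doc.writexml(f, newl='\n')

# Read the file back and rewrite it without the declaration line.
with open(path, encoding='utf-8') as f:
    lines = f.readlines()
if lines and lines[0].startswith('<?xml'):
    lines = lines[1:]
with open(path, 'w', encoding='utf-8') as f:
    f.writelines(lines)

with open(path, encoding='utf-8') as f:
    result = f.read()
print(result)
```

This reads the whole file into memory and writes it twice, so it is only reasonable for small outputs.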

You might be able to use a custom file-like object which removes the first tag, e.g.:

class RemoveFirstLine:
    def __init__(self, f):
        self.f = f
        self.xmlTagFound = False

    def __getattr__(self, attr):
        # Delegate everything else (close, flush, ...) to the wrapped file.
        return getattr(self.f, attr)

    def write(self, s):
        if not self.xmlTagFound:
            x = 0 # just to be safe
            for x, c in enumerate(s):
                if c == '>':
                    self.xmlTagFound = True
                    break
            self.f.write(s[x+1:])
        else:
            self.f.write(s)

...
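f = RemoveFirstLine(open('path', 'w', encoding='utf-8'))  # text mode: writexml() writes str on Python 3
document.writexml(f, encoding='UTF-8')  # document: the minidom Document to serialise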

or something similar. This has the advantage that the file doesn't have to be totally rewritten if the XML files are fairly large.

Use string replace

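from xml.dom import minidom
mydoc = minidom.parse('filename.xml')


# newfile: path of the output file
with open(newfile, "w") as fs:
    # Remove the declaration exactly as minidom emits it, including the leading '<'.
    fs.write(mydoc.toxml().replace('<?xml version="1.0" ?>', ''))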

That's it ;)
