How to create an empty DOCTYPE using W3C DOM in Java?

I am trying to read an XML document and output it into a new XML document using the W3C DOM API in Java. To handle DOCTYPEs, I am using the following code (from an input Document doc to a target File target):

TransformerFactory transfac = TransformerFactory.newInstance();
Transformer trans = transfac.newTransformer();
trans.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "no"); // "no" keeps the '<?xml version="1.0"?>' declaration
trans.setOutputProperty(OutputKeys.INDENT, "yes");

// if a doctype was set, it needs to persist
if (doc.getDoctype() != null) {
    DocumentType doctype = doc.getDoctype();
    trans.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, doctype.getSystemId());
    trans.setOutputProperty(OutputKeys.DOCTYPE_PUBLIC, doctype.getPublicId());
}

FileWriter sw = new FileWriter(target);
StreamResult result = new StreamResult(sw);
DOMSource source = new DOMSource(doc);
trans.transform(source, result);

This works fine for XML documents both with and without DOCTYPEs. However, I now get a NullPointerException when trying to transform the following input XML document:

<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE permissions >
<permissions>
  // ...
</permissions>

HTML 5 uses a similar syntax for its DOCTYPEs, and it is valid. But I have no idea how to handle this using the W3C DOM API: trying to set DOCTYPE_SYSTEM to null throws an exception. Can I still use the W3C DOM API to output an empty doctype?

Although this question is two years old, it is a top result in some web search engines, so maybe this is a useful shortcut. See the question "Set HTML5 doctype with XSLT", which refers to http://www.w3.org/html/wg/drafts/html/master/syntax.html#doctype-legacy-string , which says:

For the purposes of HTML generators that cannot output HTML markup with the short DOCTYPE "<!DOCTYPE html>", a DOCTYPE legacy string may be inserted into the DOCTYPE [...]

In other words, <!DOCTYPE html SYSTEM "about:legacy-compat"> or <!DOCTYPE html SYSTEM 'about:legacy-compat'>, case-insensitively except for the part in single or double quotes.

This leads to a line of Java code like this:

trans.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, "about:legacy-compat");
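To show how that line might fit into the question's code, here is a minimal sketch. It assumes the JDK's default parser and transformer; the class name DoctypeFallback and the method serialize are made up for illustration. It only passes non-null identifiers to setOutputProperty and falls back to "about:legacy-compat" when the DOCTYPE has no system identifier. Note that the legacy string is defined for HTML, so using it for other XML vocabularies is an assumption, and the output is no longer a truly empty DOCTYPE.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;
import org.xml.sax.InputSource;

class DoctypeFallback {

    // Hypothetical helper: parses the XML text and serializes it again,
    // keeping a DOCTYPE even when the input DOCTYPE carries no identifiers.
    static String serialize(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        Transformer trans = TransformerFactory.newInstance().newTransformer();
        trans.setOutputProperty(OutputKeys.INDENT, "yes");

        DocumentType doctype = doc.getDoctype();
        if (doctype != null) {
            String systemId = doctype.getSystemId();
            String publicId = doctype.getPublicId();
            // setOutputProperty does not accept null values, so substitute
            // the legacy string when there is no system identifier.
            trans.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM,
                    systemId != null ? systemId : "about:legacy-compat");
            if (publicId != null) {
                trans.setOutputProperty(OutputKeys.DOCTYPE_PUBLIC, publicId);
            }
        }

        StringWriter out = new StringWriter();
        trans.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize("<!DOCTYPE permissions ><permissions/>"));
    }
}
```

One caveat with this approach: a parser that loads external DTDs may later try to resolve that system identifier, so it is worth checking how the consumers of the file behave.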

Try the suggestions here: https://stackoverflow.com/a/6637886/116509 . Basically, it looks like it can't be done with standard Java DOM support.

You can also try StAX:

    XMLStreamWriter xmlStreamWriter =
        XMLOutputFactory.newFactory().createXMLStreamWriter( System.out, doc.getXmlEncoding() );
    Result result = new StAXResult( xmlStreamWriter );
    // ... create dtd String 
    xmlStreamWriter.writeDTD( dtd );
    DOMSource source = new DOMSource( doc );
    trans.transform( source, result );

but it's ugly because the DTD parameter is a String, and you only have a DocumentType object.
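One way to bridge that gap is to build the declaration string from the DocumentType yourself. The following is a rough sketch, not part of any standard API: the class DoctypeStrings and the method toDtdString are hypothetical names, and the quoting is simplified (it assumes the identifiers contain no double quotes).

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.DocumentType;

class DoctypeStrings {

    // Hypothetical helper: renders a DocumentType as a "<!DOCTYPE ...>" string.
    // Simplification: identifiers are always wrapped in double quotes.
    static String toDtdString(DocumentType doctype) {
        StringBuilder sb = new StringBuilder("<!DOCTYPE ").append(doctype.getName());
        String publicId = doctype.getPublicId();
        String systemId = doctype.getSystemId();
        if (publicId != null) {
            sb.append(" PUBLIC \"").append(publicId).append('"');
            // XML expects a system literal after a public identifier;
            // it is only emitted here when one is present.
            if (systemId != null) {
                sb.append(" \"").append(systemId).append('"');
            }
        } else if (systemId != null) {
            sb.append(" SYSTEM \"").append(systemId).append('"');
        }
        String internalSubset = doctype.getInternalSubset();
        if (internalSubset != null && !internalSubset.isEmpty()) {
            sb.append(" [").append(internalSubset).append(']');
        }
        return sb.append('>').toString();
    }

    public static void main(String[] args) throws Exception {
        DOMImplementation impl = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .getDOMImplementation();
        // A DOCTYPE with neither public nor system identifier.
        DocumentType empty = impl.createDocumentType("permissions", null, null);
        System.out.println(toDtdString(empty));
    }
}
```

The resulting string could then be passed as the dtd argument to xmlStreamWriter.writeDTD in the snippet above.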
