
Java, XML, XSLT: Prevent DTD validation

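I use the Java (6) XML API to apply an XSLT transformation to an HTML document from the web. The document is well-formed XHTML and therefore contains a valid DTD declaration ( <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> ). Now a problem occurs: upon transformation, the XSLT processor tries to download the DTD, and the W3C server rejects the request with an HTTP 503 error (due to bandwidth limiting by the W3C).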

How can I prevent the XSLT processor from downloading the DTD? I don't need my input document validated.

The source is:

import java.io.StringReader;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

-

   String xslt = "<?xml version=\"1.0\"?>"+
   "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"+
   "    <xsl:output method=\"text\" />"+          
   "    <xsl:template match=\"//html/body//div[@id='bodyContent']/p[1]\"> "+
   "        <xsl:value-of select=\".\" />"+
   "     </xsl:template>"+
   "     <xsl:template match=\"text()\" />"+
   "</xsl:stylesheet>";

   try {
       Source xmlSource = new StreamSource("http://de.wikipedia.org/wiki/Right_Livelihood_Award");
       Source xsltSource = new StreamSource(new StringReader(xslt));
       TransformerFactory ft = TransformerFactory.newInstance();

       Transformer trans = ft.newTransformer(xsltSource);

       trans.transform(xmlSource, new StreamResult(System.out));
   }
   catch (Exception e) {
       e.printStackTrace();
   }

I have read the related questions here, but they all use a different XML API.

Thanks!

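I recently had this issue while unmarshalling XML with JAXB. The answer was to create a SAXSource from an XMLReader and an InputSource, then pass that to the JAXB Unmarshaller's unmarshal() method. To avoid loading the external DTD, I set a custom EntityResolver on the XMLReader.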

SAXParserFactory spf = SAXParserFactory.newInstance();
SAXParser sp = spf.newSAXParser();
XMLReader xmlr = sp.getXMLReader();
xmlr.setEntityResolver(new EntityResolver() {
    public InputSource resolveEntity(String pid, String sid) throws SAXException {
        // serve the known DTD from a local copy instead of fetching it
        if (sid.equals("your remote dtd url here"))
            return new InputSource(new StringReader("actual contents of remote dtd"));
        // any other external entity is treated as an error
        throw new SAXException("unable to resolve remote entity, sid = " + sid);
    }
});
// myInputSource is the InputSource that wraps the XML to be parsed
SAXSource ss = new SAXSource(xmlr, myInputSource);

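As written, this custom entity resolver throws an exception if it is ever asked to resolve an entity other than the one you want it to resolve. If you would rather have it go ahead and load the remote entity, replace the "throw" line with "return null;", which tells the parser to resolve the entity itself.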
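A resolver that serves known DTDs locally and leaves everything else to the parser might look like the sketch below. It is only an illustration, not the answerer's code: the class name `LocalDtdResolver` and its `register` method are hypothetical. The one thing it relies on from the SAX API is that returning `null` from `resolveEntity` asks the parser to fall back to its default resolution.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

/** Hypothetical helper: serves registered DTDs from memory, defers all others to the parser. */
final class LocalDtdResolver implements EntityResolver {
    private final Map<String, String> dtdBySystemId = new HashMap<String, String>();

    /** Registers the DTD text to return for one system ID. */
    LocalDtdResolver register(String systemId, String dtdText) {
        dtdBySystemId.put(systemId, dtdText);
        return this;
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        String dtdText = dtdBySystemId.get(systemId);
        if (dtdText == null) {
            return null; // null means "use the parser's default resolution"
        }
        InputSource source = new InputSource(new StringReader(dtdText));
        source.setPublicId(publicId);
        source.setSystemId(systemId);
        return source;
    }
}
```

It would be installed with something like `xmlr.setEntityResolver(new LocalDtdResolver().register(dtdUrl, dtdContents));`, where `dtdUrl` and `dtdContents` stand for the values you supply.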

Try setting a feature on the DocumentBuilderFactory:

URL url = new URL(urlString);
InputStream is = url.openStream();
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
DocumentBuilder db;
db = dbf.newDocumentBuilder();
Document result = db.parse(is);
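Since the question is about XSLT, the parsed Document still has to reach the transformer. The sketch below shows one way to do that with a DOMSource. The class name, the sample document and the sample stylesheet are made up for illustration. The feature URI is specific to Xerces (including the parser bundled with the JDK), so other parsers may reject it with a ParserConfigurationException.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class DomWithoutDtdExample {
    // Xerces-specific feature name, as used in the answer above.
    static final String LOAD_EXTERNAL_DTD =
            "http://apache.org/xml/features/nonvalidating/load-external-dtd";

    // Illustrative input: the DOCTYPE points at a DTD that cannot be loaded.
    static final String SAMPLE_XML =
            "<!DOCTYPE html SYSTEM \"file:///no-such-dir/missing.dtd\">"
            + "<html><body><p>Hello</p></body></html>";

    static final String SAMPLE_XSLT =
            "<xsl:stylesheet version=\"1.0\""
            + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
            + "<xsl:output method=\"text\"/>"
            + "<xsl:template match=\"/\"><xsl:value-of select=\"/html/body/p\"/></xsl:template>"
            + "</xsl:stylesheet>";

    /** Parses xml without fetching its external DTD, then applies the stylesheet. */
    static String transform(String xml, String xslt) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true); // XSLT processors expect a namespace-aware DOM
        dbf.setFeature(LOAD_EXTERNAL_DTD, false);
        DocumentBuilder db = dbf.newDocumentBuilder();
        Document doc = db.parse(new InputSource(new StringReader(xml)));

        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform(SAMPLE_XML, SAMPLE_XSLT));
    }
}
```

Building a DOM first costs more memory than streaming, so for large documents a SAXSource may be preferable.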

I just ran into the same problem in XSLT (2), when calling the document function to analyse external XHTML pages.

The previous answers led me to a solution, but it wasn't obvious to me, so here is a complete one:

private void convert(InputStream xsltInputStream, InputStream srcInputStream, OutputStream destOutputStream) throws SAXException, ParserConfigurationException,
        TransformerFactoryConfigurationError, TransformerException, IOException {
    //create a parser with a fake entity resolver to disable DTD download and validation
    XMLReader xmlReader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
    xmlReader.setEntityResolver(new EntityResolver() {
        public InputSource resolveEntity(String pid, String sid) throws SAXException {
            return new InputSource(new ByteArrayInputStream(new byte[] {}));
        }
    });
    //create the transformer
    Source xsltSource = new StreamSource(xsltInputStream);
    Transformer transformer = TransformerFactory.newInstance().newTransformer(xsltSource);
    //create the source for the XML document which uses the reader with fake entity resolver
    Source xmlSource = new SAXSource(xmlReader, new InputSource(srcInputStream));
    transformer.transform(xmlSource, new StreamResult(destOutputStream));
}
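Applied to the code in the question, the same idea can be sketched as a small helper that takes an InputSource instead of an InputStream, so that a URL string can be passed directly. The class name `DtdFreeTransform` and the sample data are hypothetical. Making the parser namespace-aware is an addition on my part, on the assumption that the stylesheet may need to match namespaced elements.

```java
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

final class DtdFreeTransform {
    // Illustrative input: the DOCTYPE points at a DTD that cannot be loaded.
    static final String SAMPLE_XML =
            "<!DOCTYPE html SYSTEM \"file:///no-such-dir/missing.dtd\">"
            + "<html><body><p>Hello</p></body></html>";

    static final String SAMPLE_XSLT =
            "<xsl:stylesheet version=\"1.0\""
            + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
            + "<xsl:output method=\"text\"/>"
            + "<xsl:template match=\"/\"><xsl:value-of select=\"/html/body/p\"/></xsl:template>"
            + "</xsl:stylesheet>";

    /** Applies xslt to xml; every external entity, including the DTD, resolves to empty content. */
    static void transform(Reader xslt, InputSource xml, Writer out) throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        XMLReader reader = spf.newSAXParser().getXMLReader();
        reader.setEntityResolver(new EntityResolver() {
            public InputSource resolveEntity(String publicId, String systemId) {
                return new InputSource(new StringReader("")); // pretend the entity is empty
            }
        });
        Transformer transformer =
                TransformerFactory.newInstance().newTransformer(new StreamSource(xslt));
        transformer.transform(new SAXSource(reader, xml), new StreamResult(out));
    }

    public static void main(String[] args) throws Exception {
        StringWriter out = new StringWriter();
        transform(new StringReader(SAMPLE_XSLT),
                  new InputSource(new StringReader(SAMPLE_XML)), out);
        System.out.println(out);
    }
}
```

For the page in the question, the second argument would be `new InputSource("http://de.wikipedia.org/wiki/Right_Livelihood_Award")`.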

If you use

DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();

you can try disabling DTD validation with the following code:

 dbf.setValidating(false);
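One caveat is worth checking on your own setup: validation is already off by default, and with the JDK's bundled Xerces-based parser a non-validating parse may still try to read the external DTD. The sketch below is a small experiment, not a definitive statement about every parser. The class name and the unreachable DTD location are made up, and the feature URI is the Xerces-specific one from the earlier answer.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

final class ValidatingFlagDemo {
    // Xerces-specific feature that controls whether the external DTD is read at all.
    static final String LOAD_EXTERNAL_DTD =
            "http://apache.org/xml/features/nonvalidating/load-external-dtd";

    // Illustrative document whose DTD cannot be loaded.
    static final String XML =
            "<!DOCTYPE html SYSTEM \"file:///no-such-dir/missing.dtd\"><html/>";

    /** Returns true if XML parses with validation off and the given DTD-loading setting. */
    static boolean parses(boolean loadExternalDtd) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setValidating(false);
            dbf.setFeature(LOAD_EXTERNAL_DTD, loadExternalDtd);
            dbf.newDocumentBuilder().parse(new InputSource(new StringReader(XML)));
            return true;
        } catch (Exception e) {
            return false; // typically an IOException for the unreachable DTD
        }
    }

    public static void main(String[] args) {
        System.out.println("DTD loading on:  parsed = " + parses(true));
        System.out.println("DTD loading off: parsed = " + parses(false));
    }
}
```

If the first call fails on your parser while the second succeeds, setValidating(false) alone is not enough there and the feature (or an EntityResolver) is needed as well.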

You need to use javax.xml.parsers.DocumentBuilderFactory:

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setValidating(false);
DocumentBuilder builder = factory.newDocumentBuilder();
InputSource src = new InputSource("http://de.wikipedia.org/wiki/Right_Livelihood_Award");
// parse the InputSource itself: getByteStream() is null when only a system ID was given
Document xmlDocument = builder.parse(src);
DOMSource source = new DOMSource(xmlDocument);
TransformerFactory tf = TransformerFactory.newInstance();
Transformer transformer = tf.newTransformer(xsltSource);
transformer.transform(source, new StreamResult(System.out));

