SaxonJS: not getting a real HTML DOM tree

I have a complicated stylesheet which, when it finishes executing, should replace not just the body of the current html element but the entire html element with all its children, head and body.

function applyStylesheetUsingSaxon() {
    SaxonJS.setLogLevel(10);
    options = {
      sourceText: getSourceXML(),
      stylesheetLocation: "spl.sef.json",
      destination: "document"
    };
    result = SaxonJS.transform(options);
    document.replaceChildren();
    document.appendChild(result.principalResult.firstElementChild);
}

When I do this, an extension (FairAdBlocker) tries to access document.body, gets null, and then nothing works any more. The produced HTML also looks suspicious.

I thought maybe I could just get the HTML as a string and assign it to innerHTML or similar, with:

  destination: "serialized"
};
result = SaxonJS.transform(options);

The error I get is:

Serializer does not support the requested HTML version: 1.0

So how can I get a decent HTML DOM produced? Note that I can't use replaceBody because I need to replace the head as well.

MORE DETAILS: I'm responding to some clarifying questions in the comments:

Is the DOM up to that approach of ripping all children out of a document and then adding a new root element?

This has worked just fine with the result from the built-in XSLTProcessor in Chrome.

What does that talk about the extension mean? Can you disable it and check whether your approach works without the extension interfering?

That extension issue is a red herring. The fact is that document.body returns null after the replacement operation, and the nodes look different.

Also, what is a minimal but complete sample of the XML input, the XSLT, the wanted HTML result and the "suspicious" HTML you say you get?

Here I show the difference just by looking at the output of the transform. This is the debugging console output using the built-in XSLTProcessor:

[Screenshot: console view of the XSLTProcessor result]

Here is what comes from Saxon-JS:

[Screenshot: console view of the Saxon-JS result]

I guess the issue is that the result is a #document-fragment. Here is a view of the details, looked at as JavaScript objects rather than as HTML. First, what comes from the built-in XSLTProcessor:

[Screenshot: object properties of the XSLTProcessor result]

And here is what comes from Saxon-JS:

[Screenshot: object properties of the Saxon-JS result]

As for the serialization attempt: do you use the HTML output method with e.g. html-version="5.0", or with no explicitly set version or html-version? The error sounds as if you set method="html" version="1.0".

Indeed we have

<xsl:output method="html" version="1.0" encoding="UTF-8" indent="no" doctype-public="-"/>

which I think was there to fiddle with quirks mode or something, because of a former need to be compatible with IE.
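Assuming the serialization error goes away once the stylesheet no longer requests HTML version 1.0, one way to use the serialized string would be to parse it back into a DOM and swap it in for the current root element. The following is only a sketch: `replaceDocumentWithHtml` is a hypothetical helper name (not part of SaxonJS), the parser is passed in as a parameter so that the function can be exercised outside a browser, and it assumes that with destination: "serialized" the string arrives in result.principalResult.

```javascript
// Hypothetical helper (not part of SaxonJS): parse an HTML string and
// swap the resulting <html> element in for the document's current root.
// `doc` is the target document; `parser` is any object with a
// DOMParser-style parseFromString(text, mimeType) method.
function replaceDocumentWithHtml(doc, htmlText, parser) {
  // "text/html" selects the HTML parser, so the result is a full HTML
  // document with html, head and body elements.
  const parsed = parser.parseFromString(htmlText, "text/html");
  // importNode makes a deep copy owned by the target document.
  const newRoot = doc.importNode(parsed.documentElement, true);
  // Replace the root in one step, so the document is never left empty.
  doc.replaceChild(newRoot, doc.documentElement);
  return newRoot;
}

// In a browser, assuming `result` came from SaxonJS.transform with
// destination: "serialized":
//   replaceDocumentWithHtml(document, result.principalResult, new DOMParser());
```

Because the old root is replaced directly and not removed first, there is no moment at which the document has no root element.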

IN SUMMARY: I think the analysis comes down to what we get out of the built-in XSLTProcessor (a document) vs. Saxon-JS (a document fragment). If Saxon-JS were to produce a document and had a destination option to replace the entire document content, that would be great. Even without that, I should still be able to make a workaround.

What I don't understand is why, when I take the root node (<html>) from the built-in XSLTProcessor result, #document.firstElementChild, and append it to the current (and empty) document, the document.body property comes back with the new body. But when I do the same with the Saxon-JS result, #document-fragment.firstElementChild, document.body returns null, despite the two firstElementChild (<html>) root nodes being pretty much the same kind of thing in both cases. (It is hard to tell the difference: neither has a "body" property, and both have two children, <head> and <body>.)
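One possible explanation, which is an assumption and not confirmed anywhere in this thread, is the namespace of the generated elements. Per the HTML specification, document.body only recognizes an html root element and a body child that are in the XHTML namespace http://www.w3.org/1999/xhtml, so two trees that look identical in the inspector can still behave differently. The hypothetical helper `describeRoot` below is a small diagnostic sketch that lists the namespace of the root element and of its element children.

```javascript
// Namespace that html/head/body must carry for document.body to see them.
const XHTML_NS = "http://www.w3.org/1999/xhtml";

// Summarise the properties that decide whether the browser treats a
// node as a real HTML element.
function describeElement(node) {
  return {
    nodeName: node.nodeName,
    localName: node.localName,
    namespaceURI: node.namespaceURI,
    isHtmlNamespace: node.namespaceURI === XHTML_NS
  };
}

// Hypothetical diagnostic: describe the root element and each of its
// element children (normally head and body).
function describeRoot(root) {
  return [root, ...Array.from(root.children)].map(describeElement);
}

// In a browser, for either transformation result:
//   console.table(describeRoot(result.principalResult.firstElementChild));
```

If isHtmlNamespace comes out false for the Saxon-JS tree and true for the XSLTProcessor tree, that would account for document.body being null in one case only.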

Here at https://martin-honnen.github.io/xslt/2022/replaceChildrenTest4.html is an example using Saxon-JS 2.3 to run an HTML DOM to HTML DOM transformation, using your first approach: replaceChildren() first removes the existing document's children, and then appendChild adds the result of the Saxon-JS transformation.

<html lang="en">
  <head>
    <meta charset="UTF-8">
    <title>Test</title>
    <style>
    .sample {
      color: red;
    }
    </style>
    <script src="../../Saxon-JS-2.3/SaxonJS2.js"></script>
    <script>
    function runXslt() {
      const xslt = `<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0" xpath-default-namespace="http://www.w3.org/1999/xhtml">
        <xsl:output method="html"/>
        <xsl:mode on-no-match="shallow-copy"/>
        <xsl:template match="meta[@charset]"/>
        <xsl:template match="style">
          <xsl:copy>
            <xsl:value-of select="if (contains(., 'red')) then replace(., 'red', 'green') else replace(., 'green', 'red')"/>
          </xsl:copy>
       </xsl:template>
       </xsl:stylesheet>`;
       
       var resultFragment = SaxonJS.XPath.evaluate(`transform(map {
         'source-node' : .,
         'stylesheet-text' : $xslt
         })?output`,
         document,
         { params : { xslt : xslt } }
      );
         
      console.log(resultFragment);
      
      document.replaceChildren();
      document.appendChild(resultFragment);
    }
    </script>
  </head>
  <body>
    <h1>Test</h1>
    <p class="sample">This is a test.</p>
    <input type="button" value="test" onclick="runXslt();">
  </body>
</html>

For ease of debugging I have avoided the use of SEF and run the XSLT source code directly through fn:transform, but I would think the same approach works with a precompiled SEF and the SaxonJS.transform API. A test doing that is at https://martin-honnen.github.io/xslt/2022/replaceChildrenTest5.html and works the same.

I suspect that the problem is in the FairAdBlocker extension. The technique you outline works fine for me. My guess is that FairAdBlocker detects the DOM change, probably when you call replaceChildren(), and throws an exception on the null document.body, which causes the JavaScript execution to stop (or something like that).

Here's a (somewhat crudely coded) solution that replaces the contents of the head and body elements without ever removing them. Perhaps the extension will let this pass...

function applyStylesheetUsingSaxon() {
    SaxonJS.setLogLevel(10);
    const options = {
      sourceText: "<doc><title>Spoon!</title><para>Hello.</para></doc>",
      stylesheetLocation: "replace.sef.json",
      destination: "document"
    };
    let result = SaxonJS.transform(options);
    let newhtml = result.principalResult.firstChild;
    // Hack: we just assume that the current and generated pages
    // are rooted at html and contain a single head and a single body
    let oldhtml = document.querySelector("html");
    replaceChildren(oldhtml, newhtml, "head");
    replaceChildren(oldhtml, newhtml, "body");
}

function replaceChildren(oldelem, newelem, name) {
   let src = null;
   for (let pos = 0; pos < newelem.childNodes.length; pos++) {
     if (newelem.childNodes[pos].localName == name) {
       src = newelem.childNodes[pos];
       break;
     }
   }

   let tgt = null;
   for (let pos = 0; pos < oldelem.childNodes.length; pos++) {
     if (oldelem.childNodes[pos].localName == name) {
       tgt = oldelem.childNodes[pos];
       break;
     }
   }
   
   if (src == null || tgt == null) {
     // This should never happen…
     console.log("Failed to find " + name);
     return;
   }

   while (tgt.childNodes.length > 0) {
     tgt.removeChild(tgt.childNodes[0]);
   }
   while (src.childNodes.length > 0) {
     // This append removes the node from newelem
     tgt.appendChild(src.childNodes[0]);
   }
}  
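The same idea can be written more compactly with the standard DOM methods `children` and `replaceChildren(...nodes)`. This is only a sketch of a variant, not a tested drop-in replacement; the names `findChild` and `replaceSection` are hypothetical, and it keeps the assumption that both pages are rooted at html with a single head and a single body.

```javascript
// Find the first element child of `parent` with the given local name,
// or null if there is none.
function findChild(parent, name) {
  return Array.from(parent.children).find(el => el.localName === name) ?? null;
}

// Replace the contents of the `name` element under oldRoot with the
// contents of the `name` element under newRoot. Returns false if either
// element is missing, true otherwise.
function replaceSection(oldRoot, newRoot, name) {
  const src = findChild(newRoot, name);
  const tgt = findChild(oldRoot, name);
  if (src === null || tgt === null) {
    return false;
  }
  // Copy the child list first: moving nodes mutates src.childNodes.
  tgt.replaceChildren(...Array.from(src.childNodes));
  return true;
}

// In a browser, with newhtml taken from the transformation result:
//   replaceSection(document.documentElement, newhtml, "head");
//   replaceSection(document.documentElement, newhtml, "body");
```

As in the longer version, the head and body elements themselves stay in place, so document.body never becomes null while the content is swapped.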

Replacing the entire document isn't our recommended approach: https://www.saxonica.com/saxon-js/documentation2/index.html#!browser/result-documents

Can you say a little more about why you want to do it this way?
