How can I get a DOM element's content only once when selecting an entire tree that contains the same nested elements?

Question

For example: targeting the container element (div#container) by id (getElementById) and collecting its descendants gives me a list containing each element and all of its properties, including the child nodes, which are repeated within each nested item. I then iterate over each item and push it into an array, but I'm left with the same data repeated at each level of the DOM tree:


    0: <div class="container"><div><div><main><footer><div class="container-fluid"><p> © 2018-2020 Copyright:  </p></div></footer></main></div></div></div>
    1: <div><div><main><footer><div class="container-fluid"><p> © 2018-2020 Copyright: </p></div></footer></main></div></div>
    2: <div><main><footer><div class="container-fluid"><p> © 2018-2020 Copyright: </p></div></footer></main></div>
    3: <main><footer><div class="container-fluid"><p> © 2018-2020 Copyright: </p></div></footer></main>
    4: <footer><div class="container-fluid"><p> © 2018-2020 Copyright: </p></div></footer>
    5: <div class="container-fluid"><p> © 2018-2020 Copyright: </p></div>
    6: <p> © 2018-2020 Copyright: </p>

What I'd like to do is grab the actual content only once (e.g. `<p> © 2018-2020 Copyright: </p>`) and associate it with its XPath location, so that I can reassemble the HTML document later. The containing structures above would hold only the element tags and attributes, and the content would be inserted only into the last child of the node, as illustrated below:

/DIV/DIV/DIV/MAIN/FOOTER/ --> `<div class="container-fluid"><p></p></div>`

/DIV/DIV/DIV/MAIN/FOOTER/DIV --> `<p></p>`

/DIV/DIV/DIV/MAIN/FOOTER/DIV/P --> © 2018-2020 Copyright:

Background/Context: The goal is to reduce redundancy in my array so that I can build an efficient payload (eventually stringified to JSON) to send to the Microsoft Translator API without translating the same content nodes more than once. I then reconstruct the translated page by injecting the translated text from the response back into the original DOM locations using XPath and jQuery.
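The de-duplication step described here could be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the function name `buildTranslationPayload` and the `{ xpath, text }` entry shape are hypothetical, and the `[{ Text: ... }]` body follows the request format of Translator v3 as I understand it, so check it against the current API documentation.

```javascript
// Hypothetical helper: group XPath locations by their text so that each
// distinct string is sent for translation only once.
// `entries` is assumed to look like [{ xpath: '/HTML/BODY/P', text: 'Hello' }, ...].
function buildTranslationPayload(entries) {
  const pathsByText = new Map();
  for (const { xpath, text } of entries) {
    const key = text.trim();
    if (!key) continue; // skip whitespace-only content
    if (!pathsByText.has(key)) pathsByText.set(key, []);
    pathsByText.get(key).push(xpath);
  }
  const texts = Array.from(pathsByText.keys());
  return {
    // Assumed request body shape: an array of { Text } objects.
    body: texts.map((t) => ({ Text: t })),
    // targets[i] lists every XPath that should receive the translation of body[i].
    targets: texts.map((t) => pathsByText.get(t)),
  };
}
```

Because `body` and `targets` share the same order, the i-th translation in the response can be written back to every XPath in `targets[i]`. Note that the sketch trims surrounding whitespace, which would need to be restored if it matters for the layout.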

So far I've used jQuery and the TreeWalker Web API (https://developer.mozilla.org/en-US/docs/Web/API/TreeWalker) to get to this point:

JavaScript:



       // Get all element nodes of page
        var content = document.getElementById('container');


        //array for DOM elements
        var b = [];

        function elementNodesUnder(el) {

          var n;

          // Accept elements that have markup and are not in the skip list
          var nodeFilter = function(node) {

            if (node.innerHTML && node.tagName !== 'SCRIPT' &&
                node.tagName !== 'STYLE' && node.tagName !== 'svg' &&
                node.tagName !== 'I' && node.tagName !== 'VIDEO') {

              return NodeFilter.FILTER_ACCEPT;

            } else {

              return NodeFilter.FILTER_SKIP;
            }

          };

          var walk = document.createTreeWalker(

            el,
            NodeFilter.SHOW_ELEMENT,
            nodeFilter,
            false

          );

          while ((n = walk.nextNode())) b.push(n);
          return b;
        }


        elementNodesUnder(content);
        console.log(b);

    //array variables for xpath + innerHTML collections
    var xPathArray = [];
    var innerHTMLdinner = [];


    //loop through element nodes & assign xPath
    $.each(b, function(i, c) {

        if (c.innerHTML) {

          //console.log(i+" "+getElementXPath(c)+" = "+c.innerHTML);

          //push each corresponding item to an array for xpath + innerHTML
          xPathArray.push(getElementXPath(c));
          innerHTMLdinner.push(c.innerHTML);

        }

      });

      //map the xPath and innerHTML arrays together and then stringify
    var xpathNodeMap = xPathArray.map((xPathers, index) => ({xPathArray: xPathers, innerHTML: innerHTMLdinner[index]}));
    var xpathNodeMapJSON = JSON.stringify(xpathNodeMap);
    console.log(xpathNodeMapJSON);


      // given a document element returns the xpath string expression of that element.

      function getElementXPath(elt) {

        var path = '';

        for (; elt && elt.nodeType == 1; elt = elt.parentNode) {

          var idx = getElementIdx(elt);
          var xname = elt.tagName;
          if (idx > 1) xname += '[' + idx + ']';
          path = '/' + xname + path;

        }

        return path;

      }



      function getElementIdx(elt) {

        var count = 1;

        for (var sib = elt.previousSibling; sib; sib = sib.previousSibling) {

          if (sib.nodeType == 1 && sib.tagName == elt.tagName) count++;

        }


        return count;

      }

HTML Example:

<html>

<body>

<div></div>
<div></div>
<div></div>
<div></div>
<div></div>

<div id="container">

    <div class="layout">

        <div class="bodyContainer">

            <main class="wrapper">

                <footer class="full-standard">

                    <div class="container no-print">

                        <div class="row">

                            <img alt="" src="" />

                        </div> <!-- footer > div.row -->

                    </div> <!-- /div.container.no-print -->

                    <div class="footer-copyright">

                        <div class="container-fluid">

                            <p>&copy; 2020 Copyright</p>

                        </div> <!-- /div.container-fluid -->

                    </div> <!-- /div.footer-copyright -->

                </footer> <!-- /footer.full-standard -->

            </main> <!-- /main.wrapper -->

        </div> <!-- /div.bodyContainer-->

    </div> <!--/div.layout -->


</div> <!-- / div#container -->

</body>

</html>

XPath Results Example:

{
    "xPathArray": "/HTML/BODY/DIV[6]/DIV/DIV/MAIN/FOOTER/DIV[2]",

    "innerHTML": "<div class=\"container-fluid\"><p> © 2018-2020 Copyright: </p></div>"
}, 

{
    "xPathArray": "/HTML/BODY/DIV[6]/DIV/DIV/MAIN/FOOTER/DIV[2]/DIV",

    "innerHTML": "<p> © 2018-2020 Copyright: </p>"
}, 

{
    "xPathArray": "/HTML/BODY/DIV[6]/DIV/DIV/MAIN/FOOTER/DIV[2]/DIV/P",

    "innerHTML": " © 2018-2020 Copyright: "
}
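One way to trim a result like the one above after the fact is to keep only the entries whose XPath is not an ancestor of another entry's XPath. The sketch below assumes the entries use the `xPathArray` key shown in the output; `keepDeepestEntries` is a hypothetical name.

```javascript
// Hypothetical helper: drop every entry whose XPath is a proper ancestor
// of another entry's XPath, leaving only the deepest elements.
function keepDeepestEntries(entries) {
  return entries.filter((entry) =>
    // The trailing '/' makes the match stop at a path-segment boundary,
    // so '/A/DIV' is not treated as an ancestor of '/A/DIV[2]'.
    !entries.some((other) => other.xPathArray.startsWith(entry.xPathArray + '/'))
  );
}
```

Applied to the three entries above, only the `/P` entry would remain. Two caveats: the comparison is quadratic in the number of entries, and an element that mixes its own text with child elements would be dropped along with that text.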

Surprisingly, I couldn't find an existing question close to this one, so I apologize if I missed it. Any help pointing me in the right direction would be greatly appreciated. Thanks!

Answer 1

Try assigning a unique id to the element you want, then grab that element by its id and pass its innerText to your handler:

   <p id='unique_id'> Some Text </p>

    document.getElementById('unique_id').innerText

You might need to fiddle with this a bit, but the general idea should work.
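If assigning ids to every element is not practical, a different technique (not the one suggested above) is to collect text nodes rather than elements, so each piece of content is recorded once under the XPath of its parent element. The following is a rough sketch under that assumption: `collectTextNodes`, `xpathOf`, and `SKIPPED_TAGS` are hypothetical names, and the skip list simply mirrors the filter in the question.

```javascript
// Tags to ignore, mirroring the filter used in the question.
const SKIPPED_TAGS = new Set(['SCRIPT', 'STYLE', 'svg', 'I', 'VIDEO']);

// Same idea as getElementXPath in the question: walk up through parentNode,
// adding a positional index when earlier siblings share the tag name.
function xpathOf(el) {
  let path = '';
  for (let node = el; node && node.nodeType === 1; node = node.parentNode) {
    let index = 1;
    for (let sib = node.previousSibling; sib; sib = sib.previousSibling) {
      if (sib.nodeType === 1 && sib.tagName === node.tagName) index++;
    }
    path = '/' + node.tagName + (index > 1 ? '[' + index + ']' : '') + path;
  }
  return path;
}

// Depth-first walk that records only non-empty text nodes (nodeType 3),
// each paired with the XPath of its parent element.
function collectTextNodes(root, out = []) {
  for (const child of Array.from(root.childNodes)) {
    if (child.nodeType === 3) {
      if (child.nodeValue.trim()) {
        out.push({ xpath: xpathOf(root), text: child.nodeValue });
      }
    } else if (child.nodeType === 1 && !SKIPPED_TAGS.has(child.tagName)) {
      collectTextNodes(child, out);
    }
  }
  return out;
}

// In a browser: collectTextNodes(document.getElementById('container'));
```

An element that contains several separate text nodes produces several entries with the same XPath, so a text-node index would be needed to tell them apart when writing translations back.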

Answer 1
0 2020-02-20 19:08:24