
How can I grab the content of a DOM element only once if selecting an entire tree which also contains those same nested elements?

For example: targeting the container element (div#container) by id (getElementById) and walking its descendants gives me a collection containing each element and all of its properties, including the child nodes repeated within each nested item. I then push each item into an array, but I'm left with the same data repeated at every level of the DOM tree.


    0: <div class="container"><div><div><main><footer><div class="container-fluid"><p> © 2018-2020 Copyright:  </p></div></footer></main></div></div></div>
    1: <div><div><main><footer><div class="container-fluid"><p> © 2018-2020 Copyright: </p></div></footer></main></div></div>
    2: <div><main><footer><div class="container-fluid"><p> © 2018-2020 Copyright: </p></div></footer></main></div>
    3: <main><footer><div class="container-fluid"><p> © 2018-2020 Copyright: </p></div></footer></main>
    4: <footer><div class="container-fluid"><p> © 2018-2020 Copyright: </p></div></footer>
    5: <div class="container-fluid"><p> © 2018-2020 Copyright: </p></div>
    6: <p> © 2018-2020 Copyright: </p>

What I'd like to do is grab the actual content only once (e.g. `<p> © 2018-2020 Copyright: </p>`) and associate it with its corresponding XPath location, so that I can re-assemble the HTML document later. The containing structures above it would hold only the element tags and attributes, and the content would be inserted only into the last child of the node, as illustrated below:

/DIV/DIV/DIV/MAIN/FOOTER/ --> `<div class="container-fluid"><p></p></div>`

/DIV/DIV/DIV/MAIN/FOOTER/DIV --> `<p></p>`

/DIV/DIV/DIV/MAIN/FOOTER/DIV/P --> © 2018-2020 Copyright:
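One way to picture the desired result is to record text nodes rather than elements, so each piece of content shows up exactly once next to the path of the element that holds it. The following is only a minimal sketch under assumptions: the names `collectTextOnce`, `xpathOf`, `sameTagIndex` and the `el` helper are hypothetical, the walk is a plain recursion over `childNodes` instead of a TreeWalker, and `SCRIPT`/`STYLE` subtrees are skipped entirely. The `el` helper only builds DOM-shaped plain objects so the sketch can run outside a browser; in a page, a real element could be passed instead.

```javascript
// Hypothetical helper: builds plain objects shaped like DOM nodes
// (nodeType, tagName, childNodes, parentNode) so the sketch runs anywhere.
function el(tagName, ...children) {
  const node = { nodeType: 1, tagName: tagName.toUpperCase(), childNodes: [], parentNode: null };
  for (const child of children) {
    const c = typeof child === 'string' ? { nodeType: 3, nodeValue: child } : child;
    c.parentNode = node;
    node.childNodes.push(c);
  }
  return node;
}

// 1-based position of `node` among sibling elements with the same tag name.
function sameTagIndex(node) {
  if (!node.parentNode) return 1;
  let count = 0;
  for (const sib of node.parentNode.childNodes) {
    if (sib.nodeType === 1 && sib.tagName === node.tagName) count++;
    if (sib === node) break;
  }
  return count;
}

// Same path convention as the question: an index is appended only when > 1.
function xpathOf(element) {
  let path = '';
  for (let n = element; n && n.nodeType === 1; n = n.parentNode) {
    const idx = sameTagIndex(n);
    path = '/' + n.tagName + (idx > 1 ? '[' + idx + ']' : '') + path;
  }
  return path;
}

// Assumption: these subtrees never contain translatable content.
const SKIPPED_TAGS = new Set(['SCRIPT', 'STYLE']);

// Record every non-blank text node once, keyed by its parent element's path.
function collectTextOnce(root, out = []) {
  for (const child of root.childNodes) {
    if (child.nodeType === 3) {
      const text = child.nodeValue.trim();
      if (text) out.push({ xpath: xpathOf(root), text: text });
    } else if (child.nodeType === 1 && !SKIPPED_TAGS.has(child.tagName)) {
      collectTextOnce(child, out);
    }
  }
  return out;
}

// Stand-in tree loosely modelled on the footer in the example.
const tree = el('div', el('footer', el('div', el('p', ' © 2018-2020 Copyright: '))));
const result = collectTextOnce(tree);
console.log(JSON.stringify(result));
```

A limitation of this sketch: an element with mixed content (text interleaved with child elements) yields several entries that share one path, so a real implementation would probably need an extra position marker for each text node.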

Background/Context: The goal is to reduce redundancy in my array object so that I can construct an efficient payload (eventually stringified to JSON) to send to the Microsoft Translator API without translating the same content nodes more than once. I then want to reconstruct the translated page by injecting the translated text from the response back into the original DOM locations using XPath and jQuery.
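To make the payload idea concrete, here is a rough sketch of de-duplicating strings before they are sent and remembering which XPath locations each string belongs to. Everything here is an assumption for illustration: the input shape `{ xpath, text }`, the function names `buildTranslationPayload` and `assignTranslations`, and the `Text` field name, which reflects my understanding of the Translator v3 request body and should be checked against the current documentation. No request is made; upper-casing stands in for a translation response.

```javascript
// entries: [{ xpath, text }] (assumed shape, e.g. from a walk over text nodes).
function buildTranslationPayload(entries) {
  const indexByText = new Map(); // text -> position in `body`
  const body = [];               // one { Text } object per unique string
  const targets = [];            // targets[i] = XPaths that receive translation i
  for (const { xpath, text } of entries) {
    let i = indexByText.get(text);
    if (i === undefined) {
      i = body.length;
      indexByText.set(text, i);
      body.push({ Text: text });
      targets.push([]);
    }
    targets[i].push(xpath);
  }
  return { body, targets };
}

// Pair each translated string with every location it should be written to.
function assignTranslations(targets, translatedTexts) {
  const byXPath = {};
  targets.forEach((xpaths, i) => {
    xpaths.forEach((xpath) => {
      byXPath[xpath] = translatedTexts[i];
    });
  });
  return byXPath;
}

// Hypothetical input: the same string appears at two locations.
const entries = [
  { xpath: '/DIV/P', text: 'Hello' },
  { xpath: '/DIV/P[2]', text: 'Hello' },
  { xpath: '/DIV/SPAN', text: 'Bye' }
];

const payload = buildTranslationPayload(entries);
console.log(JSON.stringify(payload.body)); // two items, not three

// Placeholder only: upper-casing stands in for a real translation response.
const placeholder = payload.body.map((item) => item.Text.toUpperCase());
const byXPath = assignTranslations(payload.targets, placeholder);
console.log(byXPath);
```

Keeping `targets` parallel to `body` means the response can be matched back by array position, so the XPaths themselves never need to be sent to the service.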

So far I've used jQuery and the TreeWalker Web API ( https://developer.mozilla.org/en-US/docs/Web/API/TreeWalker ) to get to this point:

JavaScript:



        // Get the container whose element nodes will be collected
        var content = document.getElementById('container');


        //array for DOM elements
        var b = [];

        function elementNodesUnder(el) {

          var n;

          var nodeFilter = function(node) {

            if (node.innerHTML && node.tagName !== 'SCRIPT' &&
                node.tagName !== 'STYLE' && node.tagName !== 'svg' &&
                node.tagName !== 'I' && node.tagName !== 'VIDEO') {

              return NodeFilter.FILTER_ACCEPT;

            } else {

              return NodeFilter.FILTER_SKIP;
            }

          };

          var walk = document.createTreeWalker(

            el,
            NodeFilter.SHOW_ELEMENT,
            nodeFilter

          );

          while ((n = walk.nextNode())) b.push(n);
          return b;
        }


        elementNodesUnder(content);
        console.log(b);

    //array variables for xpath + innerHTML collections
    var xPathArray = [];
    var innerHTMLdinner = [];


    //loop through the collected element nodes & assign xPath
    $.each(b, function(i, c) {

        if (c.innerHTML) {

          //console.log(i+" "+getElementXPath(c)+" = "+c.innerHTML);

          //push each corresponding item to an array for xpath + innerHTML
          xPathArray.push(getElementXPath(c));
          innerHTMLdinner.push(c.innerHTML);

        }

      });

      //map the xPath and innerHTML arrays together and then stringify
    var xpathNodeMap = xPathArray.map((xPathers, index) => ({xPathArray: xPathers, innerHTML: innerHTMLdinner[index]}));
    var xpathNodeMapJSON = JSON.stringify(xpathNodeMap);
    console.log(xpathNodeMapJSON);


      // given a document element returns the xpath string expression of that element.

      function getElementXPath(elt) {

        var path = '';

        for (; elt && elt.nodeType == 1; elt = elt.parentNode) {

          var idx = getElementIdx(elt);
          var xname = elt.tagName;
          if (idx > 1) xname += '[' + idx + ']';
          path = '/' + xname + path;

        }

        return path;

      }



      function getElementIdx(elt) {

        var count = 1;

        for (var sib = elt.previousSibling; sib; sib = sib.previousSibling) {

          if (sib.nodeType == 1 && sib.tagName == elt.tagName) count++;

        }


        return count;

      }

HTML example:

<html>

<body>

<div></div>
<div></div>
<div></div>
<div></div>
<div></div>

<div id="container">

    <div class="layout">

        <div class="bodyContainer">

            <main class="wrapper">

                <footer class="full-standard">

                    <div class="container no-print">

                        <div class="row">

                            <img alt="" src="" />

                        </div> <!-- footer > div.row -->

                    </div> <!-- /div.container.no-print -->

                    <div class="footer-copyright">

                        <div class="container-fluid">

                            <p>&copy; 2020 Copyright</p>

                        </div> <!-- /div.container-fluid -->

                    </div> <!-- /div.footer-copyright -->

                </footer> <!-- /footer.full-standard -->

            </main> <!-- /main.wrapper -->

        </div> <!-- /div.bodyContainer-->

    </div> <!--/div.layout -->


</div> <!-- / div#container -->

</body>

</html>

XPath results example:

{
    "xPathArray": "/HTML/BODY/DIV[6]/DIV/DIV/MAIN/FOOTER/DIV[2]",

    "innerHTML": "<div class=\"container-fluid\"><p> © 2018-2020 Copyright: </p></div>"
}, 

{
    "xPathArray": "/HTML/BODY/DIV[6]/DIV/DIV/MAIN/FOOTER/DIV[2]/DIV",

    "innerHTML": "<p> © 2018-2020 Copyright: </p>"
}, 

{
    "xPathArray": "/HTML/BODY/DIV[6]/DIV/DIV/MAIN/FOOTER/DIV[2]/DIV/P",

    "innerHTML": " © 2018-2020 Copyright: "
}

Surprisingly, I couldn't find anything close to this question yet, so I apologize if I missed it. Any help pointing me in the right direction would be greatly appreciated. Thanks!

Try assigning the element you want a unique id, then grab that element by the id and pass its innerText to your handler:

    <p id='unique_id'> Some Text </p>

    document.getElementById('unique_id').innerText

You might need to fiddle with this a bit, but the general idea should work.
