
How to extract adjacent elements in the DOM by inner HTML

I would like to extract adjacent DOM elements by searching for their innerHTML text. The elements are not children of a wrapping parent. An example will make it easier to understand:

<p>1.</p>
<h1>This is the first paragraph..</h1>
<button>click</button>

<p>2.</p>
<h3>And this is the second...</h3>
<img src="" alt=""/>

<p>3.</p>
<h5>this is the last paragraph</h5>

I would like to find the first element by looking for the inner text "1." and then extract all of its following siblings until I reach the first element with the inner text "2."

Then I would do the same for 2, 3, and so on. All the elements are siblings. The extraction could, for example, move the elements into an array as plain text.

Is this possible to achieve? Thanks a lot in advance.

If I understand your question correctly, this can be achieved with the .nextSibling property of DOM nodes.

This lets you access the sibling node that follows the node currently being processed (i.e. the first p element in your document). You can use it to iterate through all valid siblings, searching for any whose innerText matches your criteria and adding those to a list of extracted nodes, like so:

var extracted = [];

/* Get the starting node for the search. In this case we start with the first p element */
var p = document.querySelector('p');

/* Iterate through p and each of its following siblings */
do {
  /* If this node has innerText that is a number followed by a dot (e.g. "1."),
     add its text to the list. Text nodes have no innerText and are skipped. */
  if (p.innerText && /^\d+\.$/.test(p.innerText.trim())) {
    extracted.push(p.innerText);
  }

  /* Move to the next sibling */
  p = p.nextSibling;
} while (p); /* Iterate while the sibling is valid */

console.log('Extracted plain text for nodes with number string for innerText:', extracted);

<p>1.</p>
<h1>This is the first paragraph..</h1>
<button>click</button>

<p>2.</p>
<h3>And this is the second...</h3>
<img src="" alt="" />

<p>3.</p>
<h5>this is the last paragraph</h5>
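The loop above walks nextSibling, which also visits the whitespace text nodes between tags. As a rough sketch of how that walk could be turned into "collect everything until the next marker", the hypothetical helpers below (followingElements and collectUntil are names made up for this illustration, not part of the DOM API) skip non-element nodes and stop at a caller-supplied condition:

```javascript
// Yield every element that follows `node` among its siblings, skipping
// text and comment nodes (nodeType 1 is ELEMENT_NODE).
function* followingElements(node) {
  for (let cur = node.nextSibling; cur; cur = cur.nextSibling) {
    if (cur.nodeType === 1) {
      yield cur;
    }
  }
}

// Collect the elements after `start` until `isStop` returns true.
// The element that triggers the stop is not included.
function collectUntil(start, isStop) {
  const collected = [];
  for (const el of followingElements(start)) {
    if (isStop(el)) break;
    collected.push(el);
  }
  return collected;
}

// Possible browser usage, assuming the markup from the question:
// const first = document.querySelector('p');
// const html = collectUntil(first, el => el.tagName === 'P')
//   .map(el => el.outerHTML);
```

Passing the stop condition as a function keeps the walk independent of how a marker is recognised, so the same helper could stop on a tag name, a text pattern, or both.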

You can walk nextElementSibling in a while loop, like this:

var arrP = ['1.', '2.', '3.'];
var allP = document.querySelectorAll('p');

allP.forEach(function (p) {
  /* Only start collecting from p elements whose text is one of the markers */
  if (arrP.includes(p.textContent)) {
    var siblings = [];
    var elem = p.nextElementSibling;

    /* Collect following elements until the next p (or a script) is reached */
    while (elem) {
      if (elem.nodeName === 'P' || elem.nodeName === 'SCRIPT') break;
      siblings.push(elem);
      elem = elem.nextElementSibling;
    }
    console.log(siblings);
  }
});

<p>1.</p>
<h1>This is the first paragraph..</h1>
<button>click</button>

<p>2.</p>
<h3>And this is the second...</h3>
<img src="" alt=""/>

<p>3.</p>
<h5>this is the last but one paragraph</h5>

<p>Not.</p>
<h5>this is the last paragraph</h5>
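Since the question asks for the result as plain text, one possible variation is to make a single pass over the parent's children and store each section's elements as HTML strings under their marker. This is only a sketch: groupByMarker and MARKER are hypothetical names, and it assumes a marker is a p element whose text is digits followed by a dot. Unlike the loop above, a p that is not a marker (such as "Not.") is kept as part of the current section rather than ending it.

```javascript
// Matches marker text such as "1." or "12." (digits followed by a literal dot).
const MARKER = /^\d+\.$/;

// Walk the element children of `parent` once. Each marker element opens a new
// section; every following element is stored as HTML text under that marker.
// Elements that appear before the first marker are ignored.
function groupByMarker(parent) {
  const sections = {};
  let current = null; // key of the section being filled, if any
  for (const el of Array.from(parent.children)) {
    const text = (el.textContent || '').trim();
    if (el.tagName === 'P' && MARKER.test(text)) {
      current = text;
      sections[current] = [];
    } else if (current !== null) {
      sections[current].push(el.outerHTML);
    }
  }
  return sections;
}

// Possible browser usage, assuming the elements sit directly under <body>:
// console.log(groupByMarker(document.body));
```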
