
Remove an Element between Comments from Webpage using JS

I am trying to collect data from this web page:

https://www.biharjobportal.com/bihar-police-constable-bharti/

I managed to remove all the Google Ads from the page with this code. That was easy because they share a class name:

    var theaders = document.getElementsByClassName('adsbygoogle');
    for (var i = theaders.length - 1; i >= 0; i--) {
        theaders[i].parentElement.removeChild(theaders[i]);
    }
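The loop above runs backwards because `getElementsByClassName` returns a live collection that shrinks as elements are removed. A minimal sketch of the same idea using a static `querySelectorAll` result is shown here; the helper name `removeByClassName` is hypothetical, and it assumes the class name needs no CSS escaping.

```javascript
// Hypothetical helper: remove every element under `root` that has `className`.
// Assumes `className` is a plain identifier that needs no CSS escaping.
// Returns the number of elements removed.
function removeByClassName(root, className) {
  // querySelectorAll returns a static list, so removing elements while
  // iterating does not shift the remaining entries.
  const matches = root.querySelectorAll("." + className);
  matches.forEach((el) => el.remove());
  return matches.length;
}
```

In a browser console this could be called as `removeByClassName(document, "adsbygoogle")`.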

However, the page also contains elements that have no ID, class name, or other hook (see the screenshot):


All I know is that the elements I want to remove sit between these comments:

     <!-- WP QUADS Content Ad Plugin v. 2.0.17  -->

    **codes to remove (as in the picture)**

    <!-- WP QUADS Content Ad Plugin v. 2.0.17  -->

I tried to remove all such items with XPath, but nothing happened. This is the code I wrote:

    var badTableEval = document.evaluate(
        "/html/body/div[1]/div/div[1]/main/article/div/div/ul[3]",
        document.documentElement,
        null,
        XPathResult.FIRST_ORDERED_NODE_TYPE,
        null
    );

    if (badTableEval && badTableEval.singleNodeValue) {
        var badTable = badTableEval.singleNodeValue;
        badTable.parentNode.removeChild(badTable);
    }
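An absolute path with `FIRST_ORDERED_NODE_TYPE` can match at most one node, and it matches nothing if the live DOM differs from the path even slightly. As a sketch only, a snapshot result lets every match of an expression be removed in one pass. The helper name `removeXPathMatches` is hypothetical, and the numeric constant stands in for `XPathResult.ORDERED_NODE_SNAPSHOT_TYPE`.

```javascript
// Numeric value of XPathResult.ORDERED_NODE_SNAPSHOT_TYPE.
const ORDERED_NODE_SNAPSHOT_TYPE = 7;

// Hypothetical helper: remove every node matched by `expression`.
// `doc` is expected to be a Document (anything exposing `evaluate`).
// Returns the number of nodes removed.
function removeXPathMatches(doc, expression) {
  const result = doc.evaluate(
    expression, doc, null, ORDERED_NODE_SNAPSHOT_TYPE, null
  );
  let removed = 0;
  // A snapshot is static, so removing nodes does not invalidate it.
  for (let i = 0; i < result.snapshotLength; i++) {
    const node = result.snapshotItem(i);
    if (node.parentNode) {
      node.parentNode.removeChild(node);
      removed++;
    }
  }
  return removed;
}
```

For instance, `removeXPathMatches(document, "//*[contains(@class, 'adsbygoogle')]")` would target the ad elements by class rather than by position.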

How can I remove all such elements from the page? https://www.biharjobportal.com/bihar-police-constable-bharti/

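You can detect comments in the document this way (see the snippet). Now it is up to you to write some clever function that removes the elements between the comments. OK, you asked for it: the snippet includes a way to remove the elements between equal comments.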

    const root = document.querySelector("body");
    const allEls = [...root.childNodes];
    const IS_COMMENT = 8; // same value as Node.COMMENT_NODE

    allEls.forEach((el, i) => {
      if (el.nodeType === IS_COMMENT) {
        // we have a comment. Find the (index of the) next equal comment
        // in [allEls] from this point on
        const subset = allEls.slice(i + 1);
        const hasEqualNextComment = subset.findIndex(elss =>
          elss.nodeType === IS_COMMENT &&
          elss.textContent.trim() === el.textContent.trim());
        // if an equal comment has been found, remove every node between
        // the two comments: subset[0] up to, but not including, the match
        if (hasEqualNextComment > -1) {
          subset.slice(0, hasEqualNextComment)
            .forEach(elss => elss.parentNode && elss.parentNode.removeChild(elss));
        }
      }
    });

    body {
      font: normal 12px/15px verdana, arial;
      margin: 2rem;
    }

    <!-- WP QUADS Content Ad Plugin v. 2.0.17 -->
    <ul>
      <li>item 1</li>
      <li>item 2</li>
      <li>item 3</li>
    </ul>
    <!-- WP QUADS Content Ad Plugin v. 2.0.17 -->
    <!-- other comment -->
    <ul>
      <li>item 4</li>
      <li>item 5</li>
      <li>item 6</li>
    </ul>
    <!-- other comment: the above is kept -->
    <!-- something 2 remove -->
    <div>item 7</div>
    <!--something 2 remove-->
    <div>item 8</div>
    <p>
      <b>The result should show item 4 - item 6, item 8 and the text within this paragraph</b>.
      <br><i>Note</i>: this will only work for top-level comments within the given [root]
      (so, not for comments that are nested within elements).
      <br>Also, you may have to clean multiline comments from line endings for comparison.
    </p>
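The snippet only looks at top-level comments and treats every comment as a possible opener, so on a page where the same marker comment wraps several ad blocks, a closing comment could pair with the next opening one. A minimal sketch of a variant that pairs markers in order (first with second, third with fourth) and also descends into nested elements follows. The function name `removeBetweenMarkers` is hypothetical, and it assumes that the opening and closing comment of a block share the same parent.

```javascript
const COMMENT_NODE = 8; // same value as Node.COMMENT_NODE

// Hypothetical helper: remove every node that sits between two sibling
// comments whose trimmed text equals `marker`. Markers are paired in
// document order within each parent; an unpaired trailing marker is ignored.
// Returns the number of nodes removed.
function removeBetweenMarkers(parent, marker) {
  // Copy the live childNodes list so removals do not disturb iteration.
  const children = Array.from(parent.childNodes);
  const isMarker = (n) =>
    n.nodeType === COMMENT_NODE && n.nodeValue.trim() === marker;

  // Positions of all marker comments among the direct children.
  const markerIdx = [];
  children.forEach((n, i) => { if (isMarker(n)) markerIdx.push(i); });

  // Collect the nodes strictly between each (open, close) pair.
  const doomed = new Set();
  for (let k = 0; k + 1 < markerIdx.length; k += 2) {
    for (let i = markerIdx[k] + 1; i < markerIdx[k + 1]; i++) {
      doomed.add(children[i]);
    }
  }

  let removed = 0;
  for (const node of children) {
    if (doomed.has(node)) {
      parent.removeChild(node);
      removed++;
    } else if (node.childNodes && node.childNodes.length > 0) {
      // Not removed: look for marker pairs nested inside this node.
      removed += removeBetweenMarkers(node, marker);
    }
  }
  return removed;
}
```

For the page in the question, a call such as `removeBetweenMarkers(document.body, "WP QUADS Content Ad Plugin v. 2.0.17")` would be the starting point; the marker comments themselves are left in place.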

