Jsoup - How to detect strictly adjacent elements - check if element has been removed

I need to detect strictly adjacent elements with jsoup. For this I would use the example provided in "How to detect strictly adjacent siblings", but I need a working example for jsoup (Java).

Input

<div id="container">
    <span class="highlighted">Paragraph 1</span>
    <span class="highlighted">Paragraph 2</span>
    This is just loose text.
    <p class="highlighted">Paragraph 3</p>
</div>

What I'm trying to accomplish is to build a single element containing the text of all similar sibling elements.

private String removeSimilarTags(String htmlContent) {
    org.jsoup.nodes.Document doc = Jsoup.parse(htmlContent);

    // Select all spans with class "highlighted"
    Elements highlightedSpanElements = doc.select("span.highlighted");
    for (Element span : highlightedSpanElements) {
        Element beforeEl = span.previousElementSibling();
        // I need another check here to verify whether the element has already been removed
        if (span != null) {
            beforeEl.after("<span class='" + HIGHLIGHT + "'>" + mergeAdjacentSpans(span) + "</span>");
        }
    }
    return doc.outerHtml();
}

private String mergeAdjacentSpans(Element span) {
    Element nextEl = span.nextElementSibling();

    String text = span.text();
    if (nextEl != null && nextEl.tagName().equalsIgnoreCase(SPAN_TAG)
                      && nextEl.classNames().contains(HIGHLIGHT)) {
        // The next element is also a highlighted span, so merge it recursively
        text = text.concat(" " + mergeAdjacentSpans(nextEl));
    }
    span.remove();
    return text;
}

I would also like some insight into how to verify that an element has already been removed. I cannot find a clear answer online.

Thank you guys!

To detect whether elements are strictly adjacent, you should know the difference between Node and Element in jsoup: https://stackoverflow.com/questions/47881838/difference-between-jsoup-element-and-jsoup-node#:~:text=A%20node%20is%20the%20generic,Node . In my case I used Node, because nextSibling() returns whatever comes next, whether that is plain text or an actual element, so it is not limited to tagged elements.
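To make the Node/Element distinction concrete, here is a minimal sketch that reports what kind of node directly follows an element. The class name `SiblingKinds` and both helper methods are hypothetical names chosen for illustration; only `Jsoup.parse`, `select`, `nextSibling` and the `Node`/`Element`/`TextNode` types come from jsoup.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;

public class SiblingKinds {
    /** Names the kind of node that directly follows el: "nothing", "element", "text" or "other". */
    static String nextSiblingKind(Element el) {
        Node next = el.nextSibling(); // next node of any kind, not only tagged elements
        if (next == null) return "nothing";
        if (next instanceof Element) return "element";
        if (next instanceof TextNode) return "text";
        return "other"; // e.g. a comment node
    }

    /** Hypothetical convenience wrapper: parses html and inspects the index-th span.highlighted. */
    static String kindAfterSpan(String html, int index) {
        Element span = Jsoup.parse(html).select("span.highlighted").get(index);
        return nextSiblingKind(span);
    }
}
```

For the question's markup written without whitespace between the tags, the first span is followed by an element and the second by the loose text, whereas `nextElementSibling()` would skip that text and return the `<p>`. With the indented markup, the whitespace between the tags typically shows up as a text node as well, which is why a blank-text check is needed.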

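private boolean isNexSiblingAdjacent(Element span) {
    Node informationAfterNode = span.nextSibling();         // next node of any kind (text, comment, element)
    Element nextTaggedElement = span.nextElementSibling();  // next element, skipping text and comments
    if (informationAfterNode == null || nextTaggedElement == null) {
        return false; // nothing follows the span, so there is nothing to be adjacent to
    }
    return informationAfterNode.outerHtml().trim().isEmpty()
            || informationAfterNode.outerHtml().equalsIgnoreCase(nextTaggedElement.outerHtml());
}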

The first condition verifies that the node contains only whitespace. You can also check whether it starts with <!-- and ends with --> to detect a comment, since neither whitespace nor a comment stops the elements from being adjacent. Last but not least, the second condition checks whether the node's HTML equals that of the next element, which means the next node is the next element itself.
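As an alternative to comparing HTML strings, the same idea can be sketched with node types, and this also covers the "already removed" part of the question: an element detached with `remove()` no longer has a parent, so `parent() == null` can serve as that check. This is only a sketch under assumptions: the class `AdjacentMerger` and its method names are hypothetical, `HIGHLIGHT` is assumed to be "highlighted", and merged spans are assumed to contain plain text only (`text(String)` replaces any nested markup).

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Comment;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;

public class AdjacentMerger {
    static final String HIGHLIGHT = "highlighted"; // assumed value of the question's constant

    /** An element detached with remove() no longer has a parent. */
    static boolean isRemoved(Element el) {
        return el.parent() == null;
    }

    /** Returns the next highlighted span if only blank text or comments separate it from el, else null. */
    static Element strictlyAdjacentSpan(Element el) {
        Node next = el.nextSibling();
        while (next != null) {
            boolean skippable = next instanceof Comment
                    || (next instanceof TextNode && ((TextNode) next).isBlank());
            if (!skippable) break;
            next = next.nextSibling();
        }
        if (next instanceof Element) {
            Element candidate = (Element) next;
            if (candidate.tagName().equals("span") && candidate.hasClass(HIGHLIGHT)) {
                return candidate;
            }
        }
        return null; // loose text, a different element, or nothing at all follows
    }

    /** Folds the text of every strictly adjacent highlighted span into the first span of its run. */
    static String mergeAdjacentSpans(String html) {
        Document doc = Jsoup.parse(html);
        for (Element span : doc.select("span." + HIGHLIGHT)) {
            if (isRemoved(span)) continue; // already merged into an earlier span
            Element next;
            while ((next = strictlyAdjacentSpan(span)) != null) {
                span.text(span.text() + " " + next.text());
                next.remove();
            }
        }
        return doc.body().html();
    }
}
```

Applied to the input from the question, the two spans should collapse into one span containing "Paragraph 1 Paragraph 2", while the loose text and the `<p>` stay where they are. Spans separated by non-blank text are left untouched.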
