Jsoup - How to detect strictly adjacent elements - check if element has been removed
I need to detect strictly adjacent elements with jsoup. I would like to follow the approach described in "How to detect strictly adjacent siblings", but I need a working example for jsoup in Java.
Input
<div id="container">
<span class="highlighted">Paragraph 1</span>
<span class="highlighted">Paragraph 2</span>
This is just loose text.
<p class="highlighted">Paragraph 3</p>
</div>
What I'm trying to accomplish is to build a single element containing the text of all similar sibling elements that are adjacent to each other.
private String removeSimilarTags(String htmlContent) {
    org.jsoup.nodes.Document doc = Jsoup.parse(htmlContent);
    // Select all spans with class "highlighted"
    Elements highlightedSpanElements = doc.select("span.highlighted");
    for (Element span : highlightedSpanElements) {
        Element beforeEl = span.previousElementSibling();
        // span is never null here; I need another check to verify whether the element has already been removed
        if (span != null) {
            beforeEl.after("<span class='" + HIGHLIGHT + "'>" + mergeAdjacentSpans(span) + "</span>");
        }
    }
    return doc.outerHtml();
}
private String mergeAdjacentSpans(Element span) {
    Element nextEl = span.nextElementSibling();
    String text = span.text();
    if (nextEl != null && nextEl.tagName().equalsIgnoreCase(SPAN_TAG)
            && nextEl.classNames().contains(HIGHLIGHT)) {
        // The next element is also a highlighted span, so merge its text recursively
        text = text.concat(" " + mergeAdjacentSpans(nextEl));
    }
    span.remove();
    return text;
}
I would also like some insight into how to verify that an element has already been removed. I cannot find a clear answer online.
Thank you!
To detect whether elements are strictly adjacent, you need to know the difference between Node and Element in Jsoup: https://stackoverflow.com/questions/47881838/difference-between-jsoup-element-and-jsoup-node#:~:text=A%20node%20is%20the%20generic,Node . In my case I used Node, because the next sibling Node is whatever comes directly after the element, whether that is plain text or an actual element, so it is not limited to tagged elements.
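As a rough illustration of that difference, the sketch below inspects what directly follows an element. The class name SiblingDemo and the helper describeNextSibling are made up for this example and are not part of jsoup; the markup is a shortened variant of the input from the question.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;

class SiblingDemo {

    // Hypothetical helper: describes the node that directly follows el.
    static String describeNextSibling(Element el) {
        Node next = el.nextSibling(); // any node: text, comment, or element
        if (next == null) {
            return "nothing";
        }
        if (next instanceof TextNode) {
            return "text: '" + ((TextNode) next).text().trim() + "'";
        }
        return "node: " + next.nodeName();
    }

    public static void main(String[] args) {
        Document doc = Jsoup.parse(
                "<div id=\"container\">"
                + "<span class=\"highlighted\">Paragraph 2</span>"
                + " This is just loose text. "
                + "<p class=\"highlighted\">Paragraph 3</p>"
                + "</div>");
        Element span = doc.selectFirst("span.highlighted");
        // nextSibling() sees the loose text between the two elements
        System.out.println(describeNextSibling(span));
        // nextElementSibling() skips the text and jumps straight to the <p>
        System.out.println(span.nextElementSibling().tagName());
    }
}
```

Here nextElementSibling() alone would report the p element as the neighbour even though loose text sits in between, which is why the Node-level view is needed for a strict adjacency check.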
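private boolean isNextSiblingAdjacent(Element span) {
    Node informationAfterNode = span.nextSibling();
    Element nextTaggedElement = span.nextElementSibling();
    // Without a following node or a following element there is nothing to be adjacent to
    if (informationAfterNode == null || nextTaggedElement == null) {
        return false;
    }
    String html = informationAfterNode.outerHtml();
    // Adjacent if the next node is whitespace only, or if it is the next element itself
    return html.trim().isEmpty() || html.equalsIgnoreCase(nextTaggedElement.outerHtml());
}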
The first condition verifies that the next node contains only whitespace. You can also check whether it starts with <!-- and ends with --> to detect a comment, since neither whitespace nor a comment breaks adjacency. Last but not least, the second condition checks whether the HTML of the next node is the same as the HTML of the next element, which means the element follows directly.
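Putting these ideas together, here is a minimal sketch of one way to merge strictly adjacent highlighted spans. It is not the code from the question or the answer: the class AdjacentSpanMerger and its helper methods are hypothetical names. It assumes that blank text and comments do not break adjacency, and it relies on the fact that an element detached with remove() no longer has a parent, which gives a simple "already removed" check.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Comment;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;

class AdjacentSpanMerger {

    // An element that was detached with remove() has no parent any more.
    static boolean isRemoved(Element el) {
        return el.parent() == null;
    }

    // Returns the next highlighted span if only blank text or comments
    // separate it from el; returns null otherwise.
    static Element nextAdjacentHighlighted(Element el) {
        Node node = el.nextSibling();
        while (node != null) {
            boolean blankText = node instanceof TextNode && ((TextNode) node).isBlank();
            if (!blankText && !(node instanceof Comment)) {
                break; // real content (loose text or an element) reached
            }
            node = node.nextSibling();
        }
        if (node instanceof Element && ((Element) node).is("span.highlighted")) {
            return (Element) node;
        }
        return null;
    }

    // Merges each run of strictly adjacent highlighted spans into the first span of the run.
    static String mergeAdjacentSpans(String html) {
        Document doc = Jsoup.parse(html);
        // select() returns a snapshot list, so removing elements from the DOM while iterating is safe
        for (Element span : doc.select("span.highlighted")) {
            if (isRemoved(span)) {
                continue; // already merged into an earlier span
            }
            Element next;
            while ((next = nextAdjacentHighlighted(span)) != null) {
                span.appendText(" " + next.text());
                next.remove();
            }
        }
        return doc.body().html();
    }
}
```

With the input from the question, the two spans should collapse into a single span with the text "Paragraph 1 Paragraph 2", while the loose text and the p element stay where they are. Keeping the first span and appending to it avoids having to insert new markup relative to a previous sibling that may not exist.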