How to uncomment HTML tags using jsoup

I wonder if it is possible to uncomment HTML tags using jsoup, for instance to change:

<!--<p> foo bar </p>-->

to

<p> foo bar </p>

Yes, it is possible. Here is one way to solve this:

  1. Find all comment nodes
  2. For each comment, extract its data (the text inside the comment)
  3. Insert that data, parsed as HTML, after the current comment node
  4. Delete the comment node

Have a look at this code:

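    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;

    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Comment;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;
    import org.jsoup.nodes.Node;

    public class UncommentComments {
        public static void main(String... args) {
            String htmlIn = "<html><head></head><body>"
                    + "<!--<div> hello there </div>-->"
                    + "<div>not a comment</div>"
                    + "<!-- <h5>another comment</h5> -->"
                    + "</body></html>";
            Document doc = Jsoup.parse(htmlIn);
            // Collect the comments first so the tree is not modified while it is being walked.
            List<Comment> comments = findAllComments(doc);
            for (Comment comment : comments) {
                String data = comment.getData(); // text inside the comment
                comment.after(data);             // parse it as HTML and insert it after the comment
                comment.remove();                // drop the original comment node
            }
            System.out.println(doc.toString());
        }

        // Returns every comment node that is a direct child of any element in the document.
        public static List<Comment> findAllComments(Document doc) {
            List<Comment> comments = new ArrayList<>();
            for (Element element : doc.getAllElements()) {
                for (Node n : element.childNodes()) {
                    if (n instanceof Comment) {
                        comments.add((Comment) n);
                    }
                }
            }
            return Collections.unmodifiableList(comments);
        }
    }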

Given this HTML document:

<html>
  <head></head>
  <body>
    <!--<div> hello there </div>-->
    <div>not a comment</div>
    <!-- <h5>another comment</h5> --> 
  </body>
</html>

the code produces this output:

<html>
  <head></head>
  <body>
    <div>hello there</div>
    <div>not a comment</div> 
    <h5>another comment</h5> 
  </body>
</html>
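
One caveat with this approach: it uncomments every comment, so a plain-text comment such as <!-- just a note --> would end up as visible text in the page. If that matters, a guard can be applied to the comment data before inserting it. The following is only a sketch under assumptions: looksLikeMarkup is a hypothetical helper (not part of jsoup), and the "starts with < and ends with >" heuristic is an illustrative choice, not a robust test for HTML.

```java
public class CommentFilter {

    // Hypothetical helper (not part of jsoup): returns true when the comment
    // data, once surrounding whitespace is trimmed, starts with '<' and ends
    // with '>'. This is a rough heuristic for "the comment wraps markup".
    public static boolean looksLikeMarkup(String data) {
        if (data == null) {
            return false;
        }
        String trimmed = data.trim();
        return trimmed.length() >= 3
                && trimmed.startsWith("<")
                && trimmed.endsWith(">");
    }

    public static void main(String[] args) {
        // The two comments from the example document pass the check.
        System.out.println(looksLikeMarkup("<div> hello there </div>"));    // true
        System.out.println(looksLikeMarkup(" <h5>another comment</h5> "));  // true
        // A plain-text comment does not.
        System.out.println(looksLikeMarkup(" just a note "));               // false
    }
}
```

In the loop above, the check would wrap the insert and remove steps, for example if (CommentFilter.looksLikeMarkup(comment.getData())) { ... }, so that comments failing the check are left untouched.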
