
Jsoup cleaning my html

I'm trying to learn how to use Jsoup to clean HTML.

I want to remove the <body> tag from this example, but the <p> tags must stay:

public class prb {
    public static void main(String[] args) throws Exception {
        String i = "<p>Text 1234 <body>WOW</body> Text 1234</p><p>Text 1234</p>";

        System.out.println(getStringWithoutHtmlTags(i));
    }

    public static String getStringWithoutHtmlTags(String text) {
        Whitelist asd = new Whitelist();
        asd.addTags("<p>", "</p>");
        asd.removeTags("<body>, </body>");

        return Jsoup.clean(text, asd);
    }
}

But it removes all tags. The output is:

Text 1234 WOW Text 1234 Text 1234

What am I doing wrong?

Thank you in advance.

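You wrote the tag names incorrectly. addTags and removeTags expect bare tag names, so the <, > and / characters should not be there, and asd.addTags("<p>", "</p>"); also lists p twice for no reason. With the brackets included, no allowed tag matches p, so every tag is stripped.

As the documentation says, pass only the tag names: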

asd.addTags("p");
asd.removeTags("body");

More details on tags, attributes, and protocols for Whitelist: Jsoup whitelist
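
Putting the fix together, here is a minimal sketch of the whole program. It assumes jsoup is on the classpath and a version that still ships org.jsoup.safety.Whitelist (newer releases rename the class to Safelist). The class and method names (CleanParagraphs, keepParagraphsOnly) are made up for illustration. Because new Whitelist() starts out empty, body is already disallowed, so the removeTags("body") call is left out here.

```java
import org.jsoup.Jsoup;
import org.jsoup.safety.Whitelist;

public class CleanParagraphs {

    // Keeps <p> elements and strips every other tag, leaving the text content in place.
    public static String keepParagraphsOnly(String html) {
        // An empty Whitelist allows nothing; only the tags added here survive cleaning.
        // Tag names are passed bare: "p", not "<p>" or "</p>".
        Whitelist allowed = new Whitelist().addTags("p");
        return Jsoup.clean(html, allowed);
    }

    public static void main(String[] args) {
        String input = "<p>Text 1234 <body>WOW</body> Text 1234</p><p>Text 1234</p>";
        System.out.println(keepParagraphsOnly(input));
    }
}
```

Run against the input from the question, this should print the two paragraphs with their <p> tags intact and the word WOW kept as plain text inside the first one.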

