
Preserving the <br> tags when cleaning with Jsoup

For the input text:

<p>Arbit string <b>of</b><br><br>text. <em>What</em> to <strong>do</strong> with it?

I run the following code:

Whitelist list = Whitelist.simpleText().addTags("br");
// Some other code...
// plaintext is the string shown above
retVal = Jsoup.clean(plaintext, StringUtils.EMPTY, list,
            new Document.OutputSettings().prettyPrint(false));

I get the output:

Arbit string <b>of</b>

text. <em>What</em> to <strong>do</strong> with it?

I don't want Jsoup to convert the <br> tags to line breaks; I want to keep them as-is. How can I do that?

Try this:

Document doc2deal = Jsoup.parse(inputText);
doc2deal.select("br").append("br"); //or append("<br>")

I can't reproduce this. Using Jsoup 1.8.3 and this code:

String html = "<p>Arbit string <b>of</b><br><br>text. <em>What</em> to <strong>do</strong> with it?";
String cleaned = Jsoup.clean(html, 
        "", 
        Whitelist.simpleText().addTags("br"),
        new Document.OutputSettings().prettyPrint(false));
System.out.println(cleaned);

I get the following output:

Arbit string <b>of</b><br><br>text. <em>What</em> to <strong>do</strong> with it?

I guess your problem must be somewhere else.
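One way to narrow down where the tags disappear is to inspect the cleaned string itself before anything else touches it (a log viewer, an HTML view, a later processing step). The following is only a rough sketch using plain Java, no Jsoup calls; `BreakInspector`, `showLineBreaks` and `count` are made-up names for illustration, and the two sample strings are hard-coded stand-ins for whatever `Jsoup.clean` returns in your code.

```java
public class BreakInspector {

    // Hypothetical helper: makes CR/LF characters visible as literal \r and \n.
    static String showLineBreaks(String s) {
        return s.replace("\r", "\\r").replace("\n", "\\n");
    }

    // Hypothetical helper: counts non-overlapping occurrences of needle in haystack.
    static int count(String haystack, String needle) {
        if (needle.isEmpty()) {
            return 0; // avoid looping forever on an empty search string
        }
        int n = 0;
        int i = haystack.indexOf(needle);
        while (i >= 0) {
            n++;
            i = haystack.indexOf(needle, i + needle.length());
        }
        return n;
    }

    public static void main(String[] args) {
        // Stand-ins for the cleaned string: one with the tags kept,
        // one where they have been turned into real line breaks.
        String preserved = "Arbit string <b>of</b><br><br>text.";
        String converted = "Arbit string <b>of</b>\n\ntext.";

        for (String s : new String[] {preserved, converted}) {
            System.out.println(showLineBreaks(s)
                    + " | <br> count: " + count(s, "<br>")
                    + " | newline count: " + count(s, "\n"));
        }
    }
}
```

If the string coming straight out of `Jsoup.clean` still reports two `<br>` tags and no newlines, the conversion is happening in some later step rather than in the cleaner.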
