Jsoup.parse(String) - without adding \n

Question

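I am using Jsoup 1.7.2.

When I use the Jsoup.parse(String) API, I see that the resulting Document object adds line breaks (literal \n characters) to the parsed HTML.

For example, the input string is: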

<html><body><p>aaa</p></body></html>

The Document object has the following content (when calling toString()):

<html>
 <head></head>
 <body>
  <p>aaa</p>
 </body>
</html>

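I am interested in the <body> element. How can I tell Jsoup not to format the output with new lines? I expect the body part to be: <body><p>aaa</p></body>.

On the other hand, when my HTML already contains line breaks, I want them to be left unchanged.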

Answer 1

Try this:

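// Assumes Apache Commons Lang 3 for StringUtils and CharEncoding.
import org.apache.commons.lang3.CharEncoding;
import org.apache.commons.lang3.StringUtils;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Entities.EscapeMode;
import org.jsoup.parser.Parser;

Document newDocument = Jsoup.parse(htmlString, StringUtils.EMPTY, Parser.htmlParser());
newDocument.outputSettings().escapeMode(EscapeMode.base);
/*
 * Use CharEncoding.US_ASCII rather than UTF-8 so that special characters are escaped properly,
 * although their representation changes. For instance, &mdash; will be encoded as &#8212;
 */
newDocument.outputSettings().charset(CharEncoding.US_ASCII);
newDocument.outputSettings().prettyPrint(false); // prevents jsoup from adding line breaks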
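
If you would rather not pull in Commons Lang, the same settings can probably be expressed with plain string literals. The following is only a sketch: the class name AsciiOutputSketch and the helper toAsciiHtml are made up for illustration, and the exact form of the numeric escape (decimal or hexadecimal) may differ between jsoup versions.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Entities;
import org.jsoup.parser.Parser;

public class AsciiOutputSketch {

    // Hypothetical helper: parses the HTML and serializes the body content
    // as ASCII-only markup without pretty-printing.
    static String toAsciiHtml(String html) {
        Document doc = Jsoup.parse(html, "", Parser.htmlParser());
        doc.outputSettings()
           .escapeMode(Entities.EscapeMode.base) // base named entities only
           .charset("US-ASCII")                  // characters outside ASCII get escaped
           .prettyPrint(false);                  // no added line breaks or indentation
        return doc.body().html();
    }

    public static void main(String[] args) {
        // \u2014 is an em dash; it should come out as a numeric character reference.
        System.out.println(toAsciiHtml("<p>a\u2014b</p>"));
    }
}
```

The charset only controls which characters get escaped on output. It is prettyPrint(false) that stops the extra line breaks.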

Answer 2

Try this. It works:

    // html holds the input HTML string
    Document doc = Jsoup.parse(html);
    // Disabling pretty-printing stops jsoup from adding line breaks
    doc.outputSettings().prettyPrint(false);

    System.out.println(doc.html());
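
Since the question asks for the <body> element specifically, here is a small self-contained sketch that returns only that element. The class name BodyWithoutPrettyPrint and the helper bodyHtml are made up for illustration. It relies on elements using the output settings of the document that owns them.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class BodyWithoutPrettyPrint {

    // Hypothetical helper: returns the <body> element's outer HTML
    // with pretty-printing disabled on the owning document.
    static String bodyHtml(String html) {
        Document doc = Jsoup.parse(html);
        doc.outputSettings().prettyPrint(false);
        return doc.body().outerHtml();
    }

    public static void main(String[] args) {
        // prints <body><p>aaa</p></body>
        System.out.println(bodyHtml("<html><body><p>aaa</p></body></html>"));
    }
}
```

Use doc.body().html() instead if you only want the inner content without the surrounding <body> tags.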

Answer 1
4, accepted, 2014-01-08 16:21:43

Answer 2
3, 2014-01-21 08:02:25
