When using a node visitor, how do I get the non-breaking spaces between two nodes?

Question

I am trying to parse the following HTML source code:

<a href="./">Home</a>&nbsp;&nbsp;&nbsp;
<a href="http://gouessej.wordpress.com/tag/tuer/">Blog</a>&nbsp;&nbsp;&nbsp;

I implement the interface org.jsoup.select.NodeVisitor. However, it seems to skip the content between </a> and <a. Disabling pretty printing doesn't solve my problem.

You can run the first JUnit test to reproduce this bug: https://github.com/gouessej/HtmlFlow/blob/patch-1/src/test/java/htmlflow/flowifier/test/TestFlowifier.java It converts the HTML source code of my homepage into Java source code, converts this Java source code back to HTML, and compares the resulting HTML source code to the original source code.

PS: Actually, TextNode.getWholeText() returns \n instead of    \n .

Answer 1

TextNode.getWholeText() returns unescaped text, so the non-breaking spaces are not skipped; they come back as raw characters rather than as &nbsp; entities. I just need to escape the text by calling Entities.escape(TextNode.getWholeText()).
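The fix described above might look like the following minimal sketch. It assumes a jsoup version in which Entities.escape(String) is public, and it uses hypothetical names (NbspVisitorDemo, collectEscapedText) that do not come from the original project. The visitor appends the escaped text of every TextNode, so the non-breaking spaces between the two links come out as &nbsp; entities again.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Entities;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;
import org.jsoup.select.NodeVisitor;

public class NbspVisitorDemo {

    // Hypothetical helper: walks the parsed body and returns the
    // concatenated, re-escaped text of all text nodes.
    static String collectEscapedText(String html) {
        Document doc = Jsoup.parse(html);
        StringBuilder out = new StringBuilder();
        doc.body().traverse(new NodeVisitor() {
            @Override
            public void head(Node node, int depth) {
                if (node instanceof TextNode) {
                    // getWholeText() yields raw characters (U+00A0 for &nbsp;)
                    String raw = ((TextNode) node).getWholeText();
                    // escape with the default output settings, turning U+00A0 back into &nbsp;
                    out.append(Entities.escape(raw));
                }
            }

            @Override
            public void tail(Node node, int depth) {
                // nothing to do when leaving a node
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        String html = "<a href=\"./\">Home</a>&nbsp;&nbsp;&nbsp;\n"
                + "<a href=\"http://gouessej.wordpress.com/tag/tuer/\">Blog</a>&nbsp;&nbsp;&nbsp;\n";
        System.out.println(collectEscapedText(html));
    }
}
```

Only text nodes are handled here; a real converter would also emit the element tags in head and tail.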

Answer 1: 1 accepted, 2019-11-06 21:24:02
