
Why does a dom4j Document object convert the XML EOL \r\n to \n?

I am parsing an XML string with the DOM4J jar (I have tried 1.6.1 and 2.0.2). Below is my sample code:

import java.io.StringReader;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.io.SAXReader;
import org.xml.sax.InputSource;

SAXReader reader = new SAXReader();
// The input uses \r\n line endings and asks for whitespace to be preserved.
InputSource inputSource = new InputSource(new StringReader("<root xml:space='preserve'>\r\n<emp>\r\n<name>raj</name>\r\n</emp>\r\n</root>"));
Document document = null;

try {
    document = reader.read(inputSource);
} catch (DocumentException e1) {
    e1.printStackTrace();
}
String st = document.asXML(); // When I debug, I see the value below in st
//<root xml:space='preserve'>\n<emp>\n<name>raj</name>\n</emp>\n</root>

Why is it converting the XML EOL (end of line) from \r\n to \n?

If I want to preserve the original EOL, "\r\n", is there an option for that?

This is mandated by the XML specification:

"To simplify the tasks of applications, the XML processor must behave as if it normalized all line breaks in external parsed entities (including the document entity) on input, before parsing, by translating both the two-character sequence #xD #xA and any #xD that is not followed by #xA to a single #xA character."
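
Because the rule applies to every conforming XML processor, the same effect can be observed without dom4j. The following is a minimal sketch that uses the JDK's built-in DOM parser instead of the SAXReader from the question; the class name LineBreakNormalizationDemo and the helper firstTextUnderRoot are hypothetical names chosen for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class: shows line-break normalization with the JDK parser.
final class LineBreakNormalizationDemo {

    // Parses the XML and returns the value of the first child node of the
    // root element. For the input used below, that node is the whitespace
    // text sitting between <root ...> and <emp/>.
    static String firstTextUnderRoot(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getFirstChild().getNodeValue();
    }

    public static void main(String[] args) throws Exception {
        String text = firstTextUnderRoot("<root xml:space='preserve'>\r\n<emp/>\r\n</root>");
        // Reports whether a carriage return survived parsing, and how long the text is.
        System.out.println("contains CR: " + text.contains("\r"));
        System.out.println("length: " + text.length());
    }
}
```

The input string carries \r\n after the root start tag, yet the parsed text node holds only the line feed, so the conversion happens in the parser before dom4j (or any other tree API) sees the characters.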

You can set the line separator used when writing XML documents:

OutputFormat#setLineSeparator(String)
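
In case the configured separator is not applied to line breaks stored inside text nodes (worth checking against your dom4j version), a library-independent fallback is to restore \r\n on the serialized string. This is only a sketch, assuming a plain string replacement after serialization is acceptable for your data; CrlfRestorer and toCrlf are hypothetical names, not part of dom4j.

```java
// Hypothetical helper: rewrites every line break in a serialized XML string as CRLF.
final class CrlfRestorer {

    // Existing CRLF pairs and lone CRs are first collapsed to LF so that
    // no line break ends up doubled, then every LF is expanded to CRLF.
    static String toCrlf(String xml) {
        String lfOnly = xml.replace("\r\n", "\n").replace('\r', '\n');
        return lfOnly.replace("\n", "\r\n");
    }

    public static void main(String[] args) {
        String serialized = "<root xml:space=\"preserve\">\n<emp>\n<name>raj</name>\n</emp>\n</root>";
        String restored = toCrlf(serialized);
        // Make the control characters visible when printing.
        System.out.println(restored.replace("\r", "\\r").replace("\n", "\\n"));
    }
}
```

Note that this rewrites every line break in the string, including any inside attribute values or CDATA sections, and that a parser reading the result will normalize the breaks to \n again.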
