Java XMLStreamReader converts &quot; to "
Suppose we have the following XML:
<Test> <Description> &quot;Hi&quot; </Description> </Test>
I load this XML using XMLStreamReader and parse it with the reader object. When I print the characters encountered while parsing, using the reader's getText(), I see that &quot; is printed as ". Although the double quotes did not need to be escaped as &quot; in the first place, I would like to know why the parser automatically does this conversion when the escaping is not required. For instance, &lt;, &gt; and &amp; are preserved, since without them the resulting XML would be invalid. However, this is not the case for &quot; and &apos;. I have to save the description the same way I receive it. Is it possible to do that with the XMLStreamReader API?
"I have to save the description the same way I receive it."
You should not. As far as XML is concerned, &quot; and " are the exact same thing, so it should not matter to you whether you obtain one or the other.
As for why it's happening: it is an XML parser's job to unescape escaped characters so that it presents you with the data they represent. It also unescapes &lt; and so on. However, when the text obtained this way is serialized back into XML, the serializer will again escape characters such as < because XML requires it, but it won't bother escaping " because that is not necessary.
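To make the unescaping concrete, here is a minimal sketch using the standard StAX API. The class name UnescapeDemo and the helper readText are made up for illustration; the sample input is a shortened variant of the XML from the question. It suggests that escaped and unescaped quotes reach the application as the same text.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class UnescapeDemo {
    // Hypothetical helper: concatenates all character data the reader reports.
    static String readText(String xml) {
        StringBuilder text = new StringBuilder();
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    // Predefined entities such as &quot; arrive already resolved.
                    if (reader.next() == XMLStreamConstants.CHARACTERS) {
                        text.append(reader.getText());
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        String escaped = "<Description>&quot;Hi&quot;</Description>";
        String literal = "<Description>\"Hi\"</Description>";
        System.out.println(readText(escaped));                           // prints "Hi"
        System.out.println(readText(escaped).equals(readText(literal))); // prints true
    }
}
```

Once getText() has returned, nothing in the resulting string records whether the source document used &quot; or a literal quote.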
When you parse XML and then serialize it again, there is no concept of "preserving" the escapes as-is. That information is inherently lost in the conversion; the parser is simply not in charge of preserving it. However, if you want " to always be escaped as &quot; in the resulting XML, your XML serializer might have an option for that (you gave no details about what you're using, so I can't tell you definitely whether it can or cannot).
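If the output side happens to be a StAX XMLStreamWriter (an assumption, since the question does not say which serializer is used), one possible way to force &quot; in element text is to emit an entity reference for each double quote. This is only a sketch: the class QuoteEscapingWriter and both helper methods are hypothetical names, and whether writeCharacters itself escapes quotes can vary between StAX implementations.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

class QuoteEscapingWriter {
    // Hypothetical helper: writes text, emitting &quot; for every double quote.
    static void writeTextWithQuotEntities(XMLStreamWriter writer, String text)
            throws XMLStreamException {
        int start = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == '"') {
                if (i > start) {
                    // The writer escapes <, > and & in this chunk as needed.
                    writer.writeCharacters(text.substring(start, i));
                }
                writer.writeEntityRef("quot"); // emits &quot;
                start = i + 1;
            }
        }
        if (start < text.length()) {
            writer.writeCharacters(text.substring(start));
        }
    }

    // Hypothetical helper: serializes one Description element to a string.
    static String serializeDescription(String text) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            writer.writeStartElement("Description");
            writeTextWithQuotEntities(writer, text);
            writer.writeEndElement();
            writer.flush();
            writer.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not write XML", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(serializeDescription("\"Hi\""));
    }
}
```

Note that this escapes every double quote in the text, not only those that were escaped in the input, because that distinction no longer exists after parsing. A conforming parser reads both forms as the same text.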