
Is encoding Cp1252 invalid in an XML file?

An XML file I ran across is failing a well-formedness check, even though it looks well-formed to me (I might be wrong).

I have reduced it to a trivial example:

<?xml version="1.0" encoding="Cp1252"?>
<jnlp/>

The method used to do the check works like this:

public static boolean isWellFormedXml(InputStream inputStream) {
    try {
        XMLInputFactory inputFactory = XMLInputFactory.newInstance();
        inputFactory.setProperty(XMLInputFactory.IS_COALESCING, false);
        inputFactory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        XMLStreamReader reader = inputFactory.createXMLStreamReader(inputStream);
        try {
            // Scan through all the reader tokens to ensure everything is well formed
            while (reader.hasNext()) {
                reader.next();
            }
        } finally {
            reader.close();
        }
    } catch (XMLStreamException e) {
        // Any parse error means the document is not well formed
        return false;
    }

    return true;
}

The error I'm seeing is:

javax.xml.stream.XMLStreamException: ParseError at [row,col]:[1,40]

Message: Invalid encoding name "Cp1252".

The only problem is that I can set a breakpoint at the catch and confirm that this encoding name does resolve. So what's the deal here? Does XML also restrict which encodings you're allowed to use in the prolog?
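For reference, the observation that the name resolves can be reproduced outside the debugger. The following is a minimal sketch, assuming a standard JDK where `Cp1252` is registered as an alias of a charset; the class name `CharsetAliasDemo` and the helper `canonicalName` are hypothetical names used only for illustration.

```java
import java.nio.charset.Charset;

class CharsetAliasDemo {
    /** Returns the canonical name Java reports for a charset name or alias. */
    static String canonicalName(String name) {
        // Charset.forName accepts canonical names and aliases, case-insensitively
        return Charset.forName(name).name();
    }

    public static void main(String[] args) {
        // Both lookups are expected to land on the same charset object
        System.out.println(canonicalName("Cp1252"));
        System.out.println(canonicalName("windows-1252"));
        // Lists the alias names that the JDK registers for this charset
        System.out.println(Charset.forName("windows-1252").aliases());
    }
}
```

If both lookups print the same canonical name, the name is known to Java's charset registry, which is a separate question from whether an XML parser accepts it in the encoding declaration.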

Check the IANA character set registry:

http://www.iana.org/assignments/character-sets/character-sets.xml

I guess the encoding you're looking for could be windows-1252. Cp1252 might be a valid charset name in Java, but in XML you're not supposed to use it (by that name).
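One way to try this suggestion is to run the question's check against the same document with each encoding name. This is a sketch only: the class name `XmlEncodingNameDemo` and the helper `check` are hypothetical, and the outcome for `Cp1252` depends on which StAX implementation `XMLInputFactory.newInstance()` returns (the question reports that it is rejected).

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class XmlEncodingNameDemo {
    // Same check as in the question, reading from the inputStream parameter
    static boolean isWellFormedXml(InputStream inputStream) {
        try {
            XMLInputFactory inputFactory = XMLInputFactory.newInstance();
            inputFactory.setProperty(XMLInputFactory.IS_COALESCING, false);
            inputFactory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
            XMLStreamReader reader = inputFactory.createXMLStreamReader(inputStream);
            try {
                // Walk every event so that any parse error surfaces
                while (reader.hasNext()) {
                    reader.next();
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            return false;
        }
        return true;
    }

    // Hypothetical helper: builds the trivial document with the given encoding name
    static boolean check(String encodingName) {
        String xml = "<?xml version=\"1.0\" encoding=\"" + encodingName + "\"?>\n<jnlp/>\n";
        // The sample is pure ASCII, so these bytes are valid in either encoding
        byte[] bytes = xml.getBytes(StandardCharsets.US_ASCII);
        return isWellFormedXml(new ByteArrayInputStream(bytes));
    }

    public static void main(String[] args) {
        System.out.println("Cp1252:       " + check("Cp1252"));
        System.out.println("windows-1252: " + check("windows-1252"));
    }
}
```

If the IANA name is accepted where the Java alias is not, changing the declaration to `encoding="windows-1252"` should be enough, since both names refer to the same byte-to-character mapping.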
