
Umlaut in Java SAX Parser

I am currently having trouble with German umlaut values in an XML document I received.

It displays / saves the value as "Ã¼" instead of "ü".

The XML Encoding is set to UTF-8 which should be capable of displaying umlauts.

Also I couldn't find any option to set a locale on the SAX parser.

Is there any other way I can make the values save correctly?

btw: I am using Eclipse as my IDE.

Any help is much appreciated!

Thanks in advance!

The XML is encoded in UTF-8, but you are decoding it with ISO-8859-1.
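To see why this mismatch produces exactly that kind of garbage, here is a minimal sketch that encodes "ü" as UTF-8 and then decodes the bytes as ISO-8859-1. The class and method names (MojibakeDemo, misdecode) are made up for illustration, and the characters are written as Unicode escapes so the demo does not depend on the source file's own encoding.

```java
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    // Encode the text as UTF-8, then (wrongly) decode those bytes as ISO-8859-1.
    static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String garbled = misdecode("\u00FC"); // \u00FC is "ü"
        // Print code points rather than glyphs so the console encoding cannot interfere.
        garbled.chars().forEach(c -> System.out.printf("U+%04X%n", c));
    }
}
```

The single character "ü" becomes the two bytes C3 BC in UTF-8, and ISO-8859-1 maps each byte to its own character, so one umlaut turns into two characters (U+00C3 and U+00BC).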

Try to use InputStream and other "binary"-oriented APIs for XML. Avoid using a Reader , or trying to convert from byte[] to a String before parsing XML. You are much more likely to mess up the character encoding than the parser is.
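A rough sketch of what that could look like with the standard JAXP SAX API: the raw bytes go straight to the parser, which reads the encoding from the XML declaration itself. The class name SaxUmlautDemo, the helper parseText and the sample document are assumptions for illustration; in real code the stream would more likely come from a file, for example via Files.newInputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

class SaxUmlautDemo {
    // Parse XML from a byte stream and collect all character data.
    // No Reader and no byte[]-to-String conversion happens before parsing.
    static String parseText(InputStream in) throws Exception {
        StringBuilder text = new StringBuilder();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(in, new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        });
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        // Sample document, physically encoded as UTF-8 to match its declaration.
        byte[] xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><name>M\u00FCller</name>"
                .getBytes(StandardCharsets.UTF_8);
        String value = parseText(new ByteArrayInputStream(xml));
        // Compare against the expected string instead of printing the umlaut,
        // so the result does not depend on the console encoding.
        System.out.println(value.equals("M\u00FCller"));
    }
}
```

If the value is still wrong after parsing this way, the corruption is probably happening later, when the string is written out or displayed.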

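Setting the XML encoding to UTF-8 in the XML declaration is one thing, but the physical encoding of the XML document is another: you can have an XML file whose content starts with <?xml version="1.0" encoding="utf-8"?> while the file itself is still ANSI-encoded (or in some other encoding).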
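One way to check whether a file's bytes really are UTF-8, whatever its declaration claims, is to run them through a strict decoder. This is only a sketch: the class name Utf8Check and the method isValidUtf8 are hypothetical, and ISO-8859-1 stands in here for "ANSI". A file that passes is not guaranteed to have been written as UTF-8, but one that fails definitely was not.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class Utf8Check {
    // Returns true only if the bytes form a well-formed UTF-8 sequence.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String doc = "<?xml version=\"1.0\" encoding=\"utf-8\"?><name>M\u00FCller</name>";
        // Same text, two different physical encodings.
        System.out.println(isValidUtf8(doc.getBytes(StandardCharsets.UTF_8)));
        System.out.println(isValidUtf8(doc.getBytes(StandardCharsets.ISO_8859_1)));
    }
}
```

In the second case the "ü" is stored as the single byte FC, which can never appear in well-formed UTF-8, so the check fails even though the declaration says utf-8.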
