
Invalid byte 2 of 2-byte UTF-8 sequence: XML saved as String variable

I get the following error due to Latin text in my XML.

Invalid byte 2 of 2-byte UTF-8 sequence

My XML is written to a String variable (I don't import a file). I tried to set encoding to "UTF-8", but I might have done it wrong.

Can you help please?

My code:

DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
InputStream inputStream = new ByteArrayInputStream(GET_XML.getBytes());
Document doc = dBuilder.parse(inputStream);
doc.getDocumentElement().normalize();

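You are seeing this error because GET_XML.getBytes() encodes the string with the platform default charset. If that default is ISO-8859-1 (aka Latin-1) or something similar, the non-ASCII characters become bytes that are not valid UTF-8, and the parser assumes UTF-8 unless the XML declaration names another encoding, such as: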

<?xml version='1.0' encoding='ISO-8859-1' standalone='no' ?>
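To see where the mismatch comes from, it can help to look at the bytes directly. The following is only an illustrative sketch: the class name EncodingMismatchDemo, the helper isValidUtf8, and the sample XML are made up for this example, and it assumes the platform default charset behaves like ISO-8859-1. In Latin-1, "Ü" is the single byte 0xDC, which a UTF-8 decoder reads as the first byte of a 2-byte sequence; the following "b" is not a valid second byte, which matches the wording of the error.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class EncodingMismatchDemo {

    // Hypothetical helper: returns true only if the bytes decode as strict UTF-8.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Sample XML (an assumption, not the asker's data); \u00dc is "Ü".
        String xml = "<city>\u00dcbersee</city>";

        // Stand-in for getBytes() on a machine whose default charset is Latin-1.
        byte[] latin1 = xml.getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = xml.getBytes(StandardCharsets.UTF_8);

        System.out.println("Latin-1 bytes valid as UTF-8: " + isValidUtf8(latin1)); // false
        System.out.println("UTF-8 bytes valid as UTF-8:   " + isValidUtf8(utf8));   // true
    }
}
```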

You have two options. Either source the XML with the declaration above, which only helps if the bytes you hand to the parser really are ISO-8859-1, or force UTF-8 during the byte conversion:

new ByteArrayInputStream(GET_XML.getBytes(StandardCharsets.UTF_8));
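Put together, the fix might look like the sketch below. The class and method names (XmlStringParser, parseViaUtf8Bytes, parseViaReader) and the sample XML are assumptions made for illustration, not part of the original code. The first method applies the UTF-8 conversion shown above and assumes the XML either has no declaration or declares UTF-8. The second is an alternative that is not mentioned in the answer: since the XML is already a String, it can be handed to the parser as characters through an InputSource wrapping a StringReader, which skips the byte conversion altogether.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class XmlStringParser {

    // Encode the string explicitly as UTF-8 instead of the platform default.
    // Assumes the XML has no encoding declaration or declares UTF-8.
    static Document parseViaUtf8Bytes(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        doc.getDocumentElement().normalize();
        return doc;
    }

    // Alternative: give the parser characters rather than bytes,
    // so no charset has to be chosen for the conversion.
    static Document parseViaReader(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        doc.getDocumentElement().normalize();
        return doc;
    }

    public static void main(String[] args) throws Exception {
        // Sample XML (an assumption); \u00dc is "Ü".
        String xml = "<city>\u00dcbersee</city>";
        System.out.println(parseViaUtf8Bytes(xml).getDocumentElement().getTextContent());
        System.out.println(parseViaReader(xml).getDocumentElement().getTextContent());
    }
}
```

Both methods should return the same document for this input. The Reader variant may be the simpler choice when the XML never exists as a file or byte stream in the first place.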

