
Download XML file from site to Android, wrong encoding

On a remote HTTP server, I created a test.xml file in Sublime Text 2 and saved it with UTF-8 encoding.

<?xml version='1.0' encoding='Utf-8' ?>
<Shops>
 <Shop name="one"></Shop>
 <Shop name="two" ></Shop>
 <Shop name="three"></Shop>
</Shops>

Then I download it to my device:

String str = "";
URL url = new URL(Server + urls);
URLConnection ucon = url.openConnection();
InputStream is = ucon.getInputStream();
BufferedInputStream bis = new BufferedInputStream(is);
ByteArrayBuffer baf = new ByteArrayBuffer(50);
int current = 0;
while ((current = bis.read()) != -1) {
    baf.append((byte) current);
}
str = new String(baf.toByteArray(), "Utf-8");
DataOutputStream out = null;
out = new DataOutputStream(openFileOutput(filename, Context.MODE_PRIVATE));
out.writeUTF(str);
out.close();

After that, I pulled the file to my MacBook through the DDMS file explorer, opened it in Sublime Text 2, and saw:

008e 3c3f 786d 6c20 7665 7273 696f 6e3d
2731 2e30 2720 656e 636f 6469 6e67 3d27
5574 662d 3827 203f 3e0d 0a3c 5368 6f70
733e 0d0a 203c 5368 6f70 206e 616d 653d
226f 6e65 223e 3c2f 5368 6f70 3e0d 0a20
3c53 686f 7020 6e61 6d65 3d22 7477 6f22
203e 3c2f 5368 6f70 3e0d 0a20 3c53 686f
7020 6e61 6d65 3d22 7468 7265 6522 3e3c
2f53 686f 703e 0d0a 3c2f 5368 6f70 733e

Then I chose "Reopen with Encoding" UTF-8 and saw this (by the way, I can't copy/paste what I saw):

[screenshot of the file reopened as UTF-8; image not preserved]

I'm not really sure why you're converting from and to bytes, but this problem may be a good candidate for StAX.

Check out the reading and writing sections.
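As a rough illustration of the StAX suggestion, the sketch below pulls the `name` attributes out of the question's XML with `javax.xml.stream`. The class and method names (`ShopNamesStax`, `readShopNames`) are made up for this example, and the sample document is an inline copy of test.xml. One caveat, stated as an assumption: to my knowledge `javax.xml.stream` is not part of the Android platform itself, so on a device this would need a StAX library on the classpath, or the platform's `XmlPullParser`, which follows the same pull-parsing idea.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, not taken from the original answer.
class ShopNamesStax {

    // Collects the "name" attribute of every <Shop> element in the stream.
    // The parser reads raw bytes and works out the encoding on its own,
    // so no manual byte-to-String conversion is needed.
    static List<String> readShopNames(InputStream in) {
        List<String> names = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(in);
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "Shop".equals(reader.getLocalName())) {
                    names.add(reader.getAttributeValue(null, "name"));
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Malformed XML", e);
        }
        return names;
    }

    public static void main(String[] args) {
        // Inline copy of the question's test.xml, used only for this demo.
        String xml = "<?xml version='1.0' encoding='UTF-8' ?>\n"
                + "<Shops>\n"
                + " <Shop name=\"one\"></Shop>\n"
                + " <Shop name=\"two\" ></Shop>\n"
                + " <Shop name=\"three\"></Shop>\n"
                + "</Shops>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(readShopNames(in));
    }
}
```

In the question's setting, the stream passed to `readShopNames` would be the one returned by `ucon.getInputStream()`.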

DataOutputStream.writeUTF writes modified UTF-8, not standard UTF-8.
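To make the difference concrete, here is a small sketch that compares the character bytes produced by `writeUTF` with those from `String.getBytes`. The class and helper names are invented for this example, and the helper skips the two leading length bytes that `writeUTF` emits. For plain ASCII text such as the XML in the question the two encodings agree; they diverge for the NUL character and for characters outside the Basic Multilingual Plane.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not taken from the original answer.
class ModifiedUtf8Demo {

    // Encodes text with writeUTF and drops the 2-byte length header,
    // leaving only the modified UTF-8 character data.
    static byte[] modifiedUtf8Payload(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] all = buffer.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length);
    }

    // Encodes text as standard UTF-8.
    static byte[] standardUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // ASCII text, a NUL character, and one supplementary character.
        String[] samples = { "abc", "\0", "\uD83D\uDE00" };
        for (String s : samples) {
            byte[] modified = modifiedUtf8Payload(s);
            byte[] standard = standardUtf8(s);
            System.out.println("modified=" + modified.length
                    + " bytes, standard=" + standard.length
                    + " bytes, identical=" + Arrays.equals(modified, standard));
        }
    }
}
```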

The leading 0x00 0x8e is the big-endian unsigned short that writeUTF emits as a length prefix: 0x008e is 142, which matches the actual length of the XML (142 bytes).
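The prefix can be reproduced off-device with a few lines of plain Java. This is only a sketch: the class name, the helper names, and the short sample string are made up for the demonstration and stand in for the real 142-byte file.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not taken from the original answer.
class WriteUtfPrefixDemo {

    // Serializes text the way the question's code does, via writeUTF.
    static byte[] viaWriteUtf(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Interprets the first two bytes as a big-endian unsigned short.
    static int lengthPrefix(byte[] data) {
        return ((data[0] & 0xFF) << 8) | (data[1] & 0xFF);
    }

    public static void main(String[] args) {
        String xml = "<Shops></Shops>"; // short stand-in for test.xml
        byte[] written = viaWriteUtf(xml);
        byte[] plain = xml.getBytes(StandardCharsets.UTF_8);
        System.out.println("plain UTF-8 length: " + plain.length);
        System.out.println("writeUTF length:    " + written.length);
        System.out.println("length prefix:      " + lengthPrefix(written));
    }
}
```

The output of `writeUTF` is two bytes longer than the plain UTF-8 encoding, and those two bytes hold the encoded length. That is the same pattern as the `008e` at the start of the hex dump.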

Use this instead:

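// Decode the downloaded bytes as standard UTF-8.
str = new String(baf.toByteArray(), "UTF-8");
// Write plain UTF-8 with no length prefix, unlike DataOutputStream.writeUTF.
OutputStreamWriter osw = new OutputStreamWriter(
    openFileOutput(filename, Context.MODE_PRIVATE),
    Charset.forName("UTF-8").newEncoder()
);
osw.write(str);
osw.close(); // flush buffered characters and release the file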
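Since the file on the server is already UTF-8, another option is to skip the String round trip and copy the bytes straight through. The sketch below is one possible way to do that, not part of the original answer. `RawCopyDemo` and `copy` are hypothetical names, and in-memory streams stand in for the network and file streams so that the example runs outside Android.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not taken from the original answer.
class RawCopyDemo {

    // Copies every byte from in to out unchanged and returns the byte count.
    // The caller stays responsible for closing both streams.
    static long copy(InputStream in, OutputStream out) {
        byte[] chunk = new byte[8192];
        long total = 0;
        try {
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
                total += n;
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        byte[] source = "<Shops></Shops>".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(source), sink);
        System.out.println("copied " + copied + " bytes, identical="
                + Arrays.equals(source, sink.toByteArray()));
    }
}
```

On the device, `in` would be `ucon.getInputStream()` and `out` would be `openFileOutput(filename, Context.MODE_PRIVATE)`, with both closed once the copy finishes.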
