Character encoding in a Java web project

I just ran into a strange encoding problem in a Java web project.

System.out.println("search url: " + searchURL);    
searchURL = new String(searchURL.getBytes("utf-8"), "utf-8");
System.out.println("test===" + new String(searchURL.getBytes("utf-8")));

I tested the code above in a plain Java main function, and it handles Chinese characters correctly.

output:
search url: https://api.datamarket.azure.com/Data.ashx/Bing/Search/Image?Query=%27机器 猫%27&$format=json&$skip=0

test===https://api.datamarket.azure.com/Data.ashx/Bing/Search/Image?Query=%27机器 猫%27&$format=json&$skip=0

But when the same code runs in Tomcat, the second line is garbled.

output:
search url: https://api.datamarket.azure.com/Data.ashx/Bing/Search/Image?Query=%27机器 猫%27&$format=json&$skip=0

test===https://api.datamarket.azure.com/Data.ashx/Bing/Search/Image?Query=%27鏈哄櫒 鐚?27&$format=json&$skip=0

Then I tested this in Tomcat:

searchURL = new String(searchURL.getBytes("utf-8"), "utf-8");
System.out.println(new String(searchURL.getBytes("gbk")));
System.out.println(new String(searchURL.getBytes("gb2312")));

Both of the lines above print correctly. Why? Any suggestions would be appreciated, thanks!

The default charset is probably different between the JVM that runs your main function and the JVM that runs Tomcat.

Try printing the default charset in both:

System.out.println(Charset.defaultCharset());

The following line uses the default charset to decode the bytes, and that charset may or may not be UTF-8:

System.out.println("test===" + new String(searchURL.getBytes("utf-8")));

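So while the byte array is UTF-8, the decoder may expect something else. Passing the charset explicitly, as in new String(bytes, "utf-8"), avoids depending on the default.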
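
As a rough illustration of the mismatch described above, here is a minimal sketch. The class name CharsetMismatchDemo and the helper decodeUtf8BytesAs are made up for this example, and it assumes the JVM ships the GBK charset (it checks before using it). The test string is written with Unicode escapes so the result does not depend on the source file encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetMismatchDemo {

    // Hypothetical helper: encode the text as UTF-8, then decode those bytes
    // with the given charset. Only a UTF-8 decoder returns the original text.
    static String decodeUtf8BytesAs(String text, Charset decoder) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, decoder);
    }

    public static void main(String[] args) {
        // Same characters as the query in the question, written as escapes.
        String query = "\u673a\u5668 \u732b";

        // This is what new String(byte[]) falls back to when no charset is given.
        System.out.println("default charset: " + Charset.defaultCharset());

        // Explicit UTF-8 decoder: round trip is lossless on any platform.
        System.out.println("decoded as UTF-8: "
                + decodeUtf8BytesAs(query, StandardCharsets.UTF_8));

        // Assumption: GBK is installed. Decoding UTF-8 bytes as GBK produces
        // garbled text, similar to what a GBK-default JVM prints.
        if (Charset.isSupported("GBK")) {
            System.out.println("decoded as GBK: "
                    + decodeUtf8BytesAs(query, Charset.forName("GBK")));
        }
    }
}
```

If the first line printed under Tomcat is GBK (or another non-UTF-8 charset), that would also explain why new String(searchURL.getBytes("gbk")) looked fine in the question: both the encode and the default decode used the same charset.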
