
Java URLEncode giving different results

I have this code stub:

System.out.println(param+"="+value);
param = URLEncoder.encode(param, "UTF-8");
value = URLEncoder.encode(value, "UTF-8");
System.out.println(param+"="+value);

This gives the following result in Eclipse:

p=指甲油
p=%E6%8C%87%E7%94%B2%E6%B2%B9

But when I run the same code from the command line, I get the following output:

p=指甲油
p=%C3%8A%C3%A5%C3%A1%C3%81%C3%AE%E2%89%A4%C3%8A%E2%89%A4%CF%80

What could be the problem?

Your Mac was using the Mac OS Roman encoding in the terminal. The bytes of those Chinese characters were incorrectly interpreted as Mac OS Roman instead of UTF-8 before being passed to Java.

As evidence, those Chinese characters are represented in UTF-8 by the following (hex) bytes, as the Eclipse output already shows:

  • 指 = 0xE6 0x8C 0x87
  • 甲 = 0xE7 0x94 0xB2
  • 油 = 0xE6 0xB2 0xB9

Now check the Mac OS Roman codepage layout. There, those (hex) bytes represent the following characters:

  • 0xE6 0x8C 0x87 = Ê å á
  • 0xE7 0x94 0xB2 = Á î ≤
  • 0xE6 0xB2 0xB9 = Ê ≤ π

Now, put them together and URL-encode them using UTF-8:

// "ÊåáÁî≤Ê≤π" is how the UTF-8 bytes of 指甲油 read when decoded as Mac OS Roman
System.out.println(URLEncoder.encode("ÊåáÁî≤Ê≤π", "UTF-8"));

Look at what it prints:

%C3%8A%C3%A5%C3%A1%C3%81%C3%AE%E2%89%A4%C3%8A%E2%89%A4%CF%80
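The same misinterpretation can be reproduced in code rather than by hand. The following is only a sketch: the class and method names are made up for illustration, and it assumes the JDK in use ships the "x-MacRoman" charset (it is part of the extended charsets and may be missing from trimmed-down runtimes). The Chinese text is written with Unicode escapes so that the source file's own encoding cannot interfere.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {

    // Takes the UTF-8 bytes of the text, decodes them with the wrong charset,
    // then URL-encodes the misread string as UTF-8.
    static String encodeAfterMisreading(String text, Charset wrongCharset) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        String misread = new String(utf8Bytes, wrongCharset);
        try {
            return URLEncoder.encode(misread, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        String text = "\u6307\u7532\u6CB9"; // 指甲油

        // Assumption: the extended charset "x-MacRoman" is available in this JDK.
        Charset macRoman = Charset.forName("x-MacRoman");

        // Correct path, matching the Eclipse output.
        System.out.println(URLEncoder.encode(text, "UTF-8"));
        // Misread path, matching the command-line output.
        System.out.println(encodeAfterMisreading(text, macRoman));
    }
}
```

Run as is, the first line should match the Eclipse result and the second the command-line result from the question.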

To fix your problem, tell your Mac to use UTF-8 encoding in the terminal. Honestly, I can't answer that part off the top of my head, as I don't use a Mac. Your Eclipse encoding configuration is fine, but in case you ever need to change it, you can do so via Window > Preferences > General > Workspace > Text File Encoding.


Update: I missed a comment:

I am reading the value from a text file

If those variables originate from a text file instead of from command-line input, as I initially assumed, then you need to solve the problem differently. Apparently, you were using a Reader implementation that relies on the runtime environment's default character encoding, like so:

Reader reader = new FileReader("/file.txt");
// ...

You should instead explicitly specify the desired encoding when creating the reader. You can do that with the InputStreamReader constructor.

Reader reader = new InputStreamReader(new FileInputStream("/file.txt"), "UTF-8");
// ...

This explicitly tells Java to read /file.txt using UTF-8 instead of the runtime environment's default encoding, which is available via Charset#defaultCharset().

System.out.println("This runtime environment uses as default charset " + Charset.defaultCharset());
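To see the explicit-encoding approach end to end, here is a minimal sketch. The class name, the helper methods, and the temporary file are all hypothetical stand-ins for your own code and for /file.txt. It writes the value as UTF-8 bytes, reads it back through an InputStreamReader with UTF-8 specified, and URL-encodes the result.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class ExplicitCharsetRead {

    // Reads the first line of the file, decoding its bytes with the given charset
    // rather than the platform default.
    static String readFirstLine(Path file, Charset charset) throws IOException {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(file.toFile()), charset))) {
            return reader.readLine();
        }
    }

    // Writes the text as UTF-8 to a temporary file and reads it back as UTF-8.
    static String roundTripThroughFile(String text) {
        try {
            Path file = Files.createTempFile("charset-demo", ".txt");
            try {
                Files.write(file, text.getBytes(StandardCharsets.UTF_8));
                return readFirstLine(file, StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        String value = roundTripThroughFile("\u6307\u7532\u6CB9"); // 指甲油
        // Does not depend on Charset.defaultCharset() of the runtime environment.
        System.out.println(URLEncoder.encode(value, "UTF-8"));
    }
}
```

Because the charset is passed explicitly, the result should be the same in Eclipse and on the command line, whatever the default charset happens to be.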
