
Different JVMs with different encodings

Suppose I have two JVMs running: one is a client and the other is a server. Suppose the client and server use different encodings. If I write a program on the client that sends Strings across the network to the server, is it necessary to encode each String in the server's encoding before the client sends it? Or would that be pointless, given that the two use different encodings in the first place? How do clients and servers typically handle exchanging messages when each side uses a different encoding?

I suppose you are running into what is called the platform default encoding. For example, when converting bytes to a String using new String(byte[]), the platform default encoding is used. Different servers may be set up with different platform default encodings.
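To make the effect concrete, here is a minimal sketch that decodes the same two bytes with two explicitly named charsets. The class name DefaultCharsetDemo and the helper decode are made up for illustration; new String(bytes) without a charset would behave like whichever charset Charset.defaultCharset() reports on that machine.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: same bytes, different charsets, different Strings.
final class DefaultCharsetDemo {

    // Decodes the bytes with an explicitly named charset instead of the platform default.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // U+00E9 (e with acute accent) encoded as UTF-8 is the two bytes 0xC3 0xA9.
        byte[] wire = "\u00e9".getBytes(StandardCharsets.UTF_8);

        System.out.println("platform default: " + Charset.defaultCharset());
        System.out.println("decoded as UTF-8:      " + decode(wire, StandardCharsets.UTF_8));
        // ISO-8859-1 maps each byte to one character, so two characters come out.
        System.out.println("decoded as ISO-8859-1: " + decode(wire, StandardCharsets.ISO_8859_1));
    }
}
```

The bytes on the wire are identical in both cases; only the charset chosen by the reader differs, which is why leaving that choice to the platform default gives machine-dependent results.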

To keep servers from behaving differently because of their default encodings, specify the encoding explicitly when converting a byte[] to a String. If you don't know which encoding the bytes are in, that is another matter, but at least you get consistent results for the same byte stream.

For example, to convert a String to a UTF-8 byte stream, use getBytes("UTF-8"); to get the String back, use new String(byte[], "UTF-8").

Inside the JVM, a String is always a sequence of UTF-16 code units, whatever the platform default encoding is (read this answer).
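A small sketch of what that means in practice: String.length() counts UTF-16 code units, not characters or bytes, and the byte count only appears once an encoding is chosen. The class name StringUnitsDemo and its two helpers are hypothetical names used only for this illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: code units vs. code points vs. encoded bytes.
final class StringUnitsDemo {

    // Number of bytes the String occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes when encoded as big-endian UTF-16 without a byte order mark.
    static int utf16Length(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        // 'a', U+00E9, and U+1F600 (written as a surrogate pair).
        String s = "a\u00e9\uD83D\uDE00";

        System.out.println("UTF-16 code units: " + s.length());
        System.out.println("code points:       " + s.codePointCount(0, s.length()));
        System.out.println("UTF-8 bytes:       " + utf8Length(s));
        System.out.println("UTF-16BE bytes:    " + utf16Length(s));
    }
}
```

Because the in-memory form is the same on every JVM, the only place client and server can disagree is the conversion between String and bytes.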

The critical part is the transmission of the String, which is likely to happen over a byte-based stream. Converting a String to a byte[] requires an encoding; if you don't specify one, the platform default is used. You should use UTF-8 in most cases.

// On the client side (StandardCharsets lives in java.nio.charset)
byte[] bytes = myString.getBytes(StandardCharsets.UTF_8);
serverStream.write(bytes);
// On the server side
byte[] bytes = /* read bytes */;
String myString = new String(bytes, StandardCharsets.UTF_8);
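The snippet above leaves open how the server knows how many bytes to read. One common approach, shown here only as a sketch and not as something the answer prescribes, is to prefix each message with its byte length. The class Utf8Framing and its send/receive methods are hypothetical, and in-memory streams stand in for the socket streams.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: length-prefixed UTF-8 messages over a byte stream.
final class Utf8Framing {

    // Client side: write a 4-byte big-endian length, then the UTF-8 payload.
    static void send(OutputStream out, String message) throws IOException {
        byte[] payload = message.getBytes(StandardCharsets.UTF_8);
        DataOutputStream data = new DataOutputStream(out);
        data.writeInt(payload.length);
        data.write(payload);
        data.flush();
    }

    // Server side: read the length, then exactly that many bytes, then decode as UTF-8.
    // Assumption: the sender is trusted; a real server would bound-check the length.
    static String receive(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int length = data.readInt();
        byte[] payload = new byte[length];
        data.readFully(payload);
        return new String(payload, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-in for the network connection.
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        send(wire, "gr\u00fc\u00df dich");
        send(wire, "\u4f60\u597d");

        InputStream in = new ByteArrayInputStream(wire.toByteArray());
        System.out.println(receive(in));
        System.out.println(receive(in));
    }
}
```

Both sides name UTF-8 explicitly, so the platform default encoding of either JVM never enters the picture.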

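I suggest using a DataOutputStream / DataInputStream pair, whose writeUTF and readUTF methods transmit Strings independently of the platform default charset. Note that they use a modified UTF-8 format with a two-byte length prefix, so a single String is limited to 65535 encoded bytes.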
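A minimal round trip through writeUTF and readUTF might look like the following sketch. The class WriteUtfDemo and its encode/decode helpers are hypothetical names, and byte-array streams again stand in for the network connection.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical demo class: String round trip via writeUTF / readUTF.
final class WriteUtfDemo {

    // Sender: writeUTF emits a 2-byte length followed by modified UTF-8 bytes.
    static byte[] encode(String s) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(s);
        }
        return buffer.toByteArray();
    }

    // Receiver: readUTF reads the length prefix and decodes the bytes that follow.
    static String decode(byte[] bytes) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return in.readUTF();
        }
    }

    public static void main(String[] args) throws IOException {
        String original = "h\u00e9llo";
        byte[] wire = encode(original);
        System.out.println("bytes on the wire: " + wire.length);
        System.out.println("round trip ok:     " + decode(wire).equals(original));
    }
}
```

Since writeUTF throws a UTFDataFormatException when the encoded form exceeds 65535 bytes, longer messages need a different framing, such as an explicit length followed by getBytes output.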
