
Java: Converting String to and from ByteBuffer and associated problems

I am using Java NIO for my socket connections, and my protocol is text based, so I need to convert Strings to ByteBuffers before writing them to the SocketChannel, and convert the incoming ByteBuffers back to Strings. Currently, I am using this code:

public static Charset charset = Charset.forName("UTF-8");
public static CharsetEncoder encoder = charset.newEncoder();
public static CharsetDecoder decoder = charset.newDecoder();

public static ByteBuffer str_to_bb(String msg){
  try{
    return encoder.encode(CharBuffer.wrap(msg));
  }catch(Exception e){e.printStackTrace();}
  return null;
}

public static String bb_to_str(ByteBuffer buffer){
  String data = "";
  try{
    int old_position = buffer.position();
    data = decoder.decode(buffer).toString();
    // reset buffer's position to its original so it is not altered:
    buffer.position(old_position);  
  }catch (Exception e){
    e.printStackTrace();
    return "";
  }
  return data;
}

This works most of the time, but I wonder whether this is the preferred (or simplest) way to do each direction of this conversion, or whether there is another way to try. Occasionally, and seemingly at random, calls to encode() and decode() throw a java.lang.IllegalStateException: Current state = FLUSHED, new state = CODING_END exception, or similar, even though I use a new ByteBuffer object each time a conversion is done. Do I need to synchronize these methods? Is there a better way to convert between Strings and ByteBuffers? Thanks!

Check out the CharsetEncoder and CharsetDecoder API descriptions. You should follow a specific sequence of method calls to avoid this problem. For example, for CharsetEncoder:

  1. Reset the encoder via the reset method, unless it has not been used before;
  2. Invoke the encode method zero or more times, as long as additional input may be available, passing false for the endOfInput argument and filling the input buffer and flushing the output buffer between invocations;
  3. Invoke the encode method one final time, passing true for the endOfInput argument; and then
  4. Invoke the flush method so that the encoder can flush any internal state to the output buffer.
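As a rough illustration of that sequence, here is a minimal sketch for the case where the whole string is available up front, so step 2 is skipped. The class name EncodeSteps and the method encodeFully are hypothetical; sizing the output buffer from maxBytesPerChar() and wrapping the checked exception in UncheckedIOException are choices made for this example, not something the API requires.

```java
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

public class EncodeSteps {
    // Hypothetical helper: walks through the documented call sequence once.
    public static ByteBuffer encodeFully(String msg, Charset charset) {
        // A local encoder is never shared between threads.
        CharsetEncoder encoder = charset.newEncoder();
        CharBuffer in = CharBuffer.wrap(msg);
        // Assumption: worst-case sizing, so the final encode call cannot overflow.
        int capacity = (int) Math.ceil(msg.length() * (double) encoder.maxBytesPerChar());
        ByteBuffer out = ByteBuffer.allocate(capacity);
        try {
            encoder.reset();                                    // step 1 (harmless on a fresh encoder)
            // step 2 is skipped: there is no further input to wait for
            CoderResult result = encoder.encode(in, out, true); // step 3
            if (!result.isUnderflow()) {
                result.throwException();
            }
            result = encoder.flush(out);                        // step 4
            if (!result.isUnderflow()) {
                result.throwException();
            }
        } catch (CharacterCodingException e) {
            throw new UncheckedIOException(e);
        }
        out.flip(); // make the encoded bytes readable from position 0
        return out;
    }

    public static void main(String[] args) {
        ByteBuffer bb = encodeFully("h\u00e9llo", StandardCharsets.UTF_8);
        System.out.println(bb.remaining() + " bytes encoded");
    }
}
```

When input arrives in chunks, step 2 would become a loop that calls encode(in, out, false) and drains the output buffer between calls.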

By the way, this is the same approach I am using for NIO, although some of my colleagues convert each char directly to a byte, knowing that they only use ASCII, which I imagine is probably faster.
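For illustration only, the ASCII shortcut might look something like the sketch below. The class AsciiBytes and its method names are hypothetical, and the range check is an added safeguard, since narrowing a char above 0x7F to a byte would silently corrupt the text.

```java
import java.nio.ByteBuffer;

public class AsciiBytes {
    // Hypothetical helper: narrows each char to one byte; only valid for 7-bit ASCII.
    public static ByteBuffer asciiToBuffer(String msg) {
        byte[] bytes = new byte[msg.length()];
        for (int i = 0; i < msg.length(); i++) {
            char c = msg.charAt(i);
            if (c > 0x7F) {
                throw new IllegalArgumentException("non-ASCII char at index " + i);
            }
            bytes[i] = (byte) c;
        }
        return ByteBuffer.wrap(bytes);
    }

    // Hypothetical helper: widens each readable byte back to a char.
    public static String bufferToAscii(ByteBuffer buffer) {
        // Read through a duplicate so the caller's position is left untouched.
        ByteBuffer view = buffer.duplicate();
        char[] chars = new char[view.remaining()];
        for (int i = 0; i < chars.length; i++) {
            chars[i] = (char) (view.get() & 0xFF);
        }
        return new String(chars);
    }

    public static void main(String[] args) {
        ByteBuffer bb = asciiToBuffer("PING 42\r\n");
        System.out.println(bufferToAscii(bb).trim());
    }
}
```

Whether this is actually faster than the charset encoder would need measuring in the application at hand.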

Unless things have changed, you're better off with

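public static ByteBuffer str_to_bb(String msg, Charset charset){
    return ByteBuffer.wrap(msg.getBytes(charset));
}

public static String bb_to_str(ByteBuffer buffer, Charset charset){
    if(buffer.hasArray()) {
        // Decode only the readable region: the backing array can be larger than position..limit.
        return new String(buffer.array(), buffer.arrayOffset() + buffer.position(), buffer.remaining(), charset);
    }
    // No accessible backing array (e.g. a direct buffer): copy via a duplicate so the position is not altered.
    byte[] bytes = new byte[buffer.remaining()];
    buffer.duplicate().get(bytes);
    return new String(bytes, charset);
}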

Usually buffer.hasArray() will be either always true or always false, depending on your use case. In practice, unless you really want it to work under any circumstances, it's safe to optimize away the branch you don't need.

Adamski's answer is a good one: it describes the steps of an encoding operation when using the general encode method (the one that takes a byte buffer as one of its inputs).

However, the method in question in this discussion is a variant of encode: encode(CharBuffer in). This is a convenience method that implements the entire encoding operation. (Please see the Java docs references in the PS.)

As per the docs, this method should therefore not be invoked if an encoding operation is already in progress (which is what is happening in ZenBlender's code: a static encoder/decoder is used in a multi-threaded environment).

Personally, I like to use the convenience methods (over the more general encode/decode methods), as they take away the burden by performing all the steps under the covers.

ZenBlender and Adamski have already suggested several options for doing this safely in their comments. Listing them all here:

  • Create a new encoder/decoder object when needed for each operation (not efficient, as it could lead to a large number of objects). OR,
  • Use a ThreadLocal to avoid creating a new encoder/decoder for each operation. OR,
  • Synchronize the entire encoding/decoding operation (this might not be preferred unless sacrificing some concurrency is OK for your program).
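The ThreadLocal option could be sketched roughly as follows. The class ThreadLocalCodec and its method names are hypothetical, UTF-8 is assumed as in the question, and ThreadLocal.withInitial requires Java 8 or later. Each thread lazily gets its own encoder and decoder, and the convenience encode/decode methods reset the coder at the start of every call.

```java
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

public class ThreadLocalCodec {
    // One encoder and one decoder per thread, created on first use.
    private static final ThreadLocal<CharsetEncoder> ENCODER =
            ThreadLocal.withInitial(StandardCharsets.UTF_8::newEncoder);
    private static final ThreadLocal<CharsetDecoder> DECODER =
            ThreadLocal.withInitial(StandardCharsets.UTF_8::newDecoder);

    public static ByteBuffer strToBb(String msg) {
        try {
            // The convenience method runs a complete encoding operation.
            return ENCODER.get().encode(CharBuffer.wrap(msg));
        } catch (CharacterCodingException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static String bbToStr(ByteBuffer buffer) {
        try {
            // Decode a duplicate so the caller's position and limit are not altered.
            return DECODER.get().decode(buffer.duplicate()).toString();
        } catch (CharacterCodingException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ByteBuffer bb = strToBb("hello");
        System.out.println(bbToStr(bb));
    }
}
```

Compared with synchronizing, this trades a small amount of memory per thread for not having to take a lock on every conversion.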

PS

Java docs references:

  1. Encode (convenience) method: http://docs.oracle.com/javase/6/docs/api/java/nio/charset/CharsetEncoder.html#encode%28java.nio.CharBuffer%29
  2. General encode method: http://docs.oracle.com/javase/6/docs/api/java/nio/charset/CharsetEncoder.html#encode%28java.nio.CharBuffer,%20java.nio.ByteBuffer,%20boolean%29
