Converting a char array to a UTF-8 byte array without using String or Charset

Question

I have a small question. I have to encode a char array as UTF-8 in Java and get the equivalent byte array. Converting the char array to a String and then getting the byte array is not an option: String must be avoided because of security concerns. If I use

byte[] encoded = Charset.forName("UTF-8").encode(CharBuffer.wrap(toBeEncoded)).array();

then when the input array is longer than 9 characters, the output array has an extra, empty element at the end. If the input is longer still, there are more empty elements. When I then decode it, I get even more extra elements: if there is 1 empty element after encoding, there are two after decoding. This is not an option either, because I want to encrypt the encoded value. Thank you.

Answer 1

The problem is that Charset.encode() makes no guarantees about the capacity of the buffer it returns. It may well allocate extra space at the end, which is what you are seeing. However, the buffer's limit will be set correctly. In fact, there is no guarantee that the returned buffer is backed by an array at all (it could become a direct buffer in a future Java version, who knows?).
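To make the difference between limit and capacity concrete, here is a minimal sketch. The class name `BufferLimitDemo` and the helper `describe` are hypothetical, introduced only for illustration. It reports the limit, the capacity, and the backing array length (or -1 if the buffer has no accessible array) for an encoded char array. The exact capacity depends on the JDK implementation, so no particular value is assumed.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original answer.
final class BufferLimitDemo {

    // Returns { limit, capacity, backing array length or -1 } for the encoded input.
    static int[] describe(char[] chars) {
        ByteBuffer buf = StandardCharsets.UTF_8.encode(CharBuffer.wrap(chars));
        // array() is only legal when hasArray() is true.
        int backingLength = buf.hasArray() ? buf.array().length : -1;
        return new int[] { buf.limit(), buf.capacity(), backingLength };
    }

    public static void main(String[] args) {
        // Ten ASCII characters, so exactly ten UTF-8 bytes are written.
        char[] input = { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9' };
        int[] info = describe(input);
        System.out.println("limit    = " + info[0]);
        System.out.println("capacity = " + info[1]);
        System.out.println("array    = " + info[2]);
    }
}
```

If the capacity printed is larger than the limit, calling array() directly returns the padding as well, which matches the extra elements described in the question.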

To get a properly sized array, allocate a byte array of the right size and copy only the data you want from the byte buffer into it. Here we use the limit to size the new array; because the returned buffer's position is zero, the limit equals the number of bytes actually written into the buffer:

ByteBuffer buf = StandardCharsets.UTF_8.encode(CharBuffer.wrap(toBeEncoded));
byte[] array = new byte[buf.limit()];
buf.get(array);

This article describes the limit, capacity, and position of buffers nicely.
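Since the question is motivated by security, the approach above can be wrapped in a small helper that also clears the intermediate buffer once the bytes have been copied out. This is only a sketch under stated assumptions: the class `SecureUtf8` and method `toUtf8Bytes` are hypothetical names, and wiping the backing array is an addition not present in the original answer. It uses `remaining()` rather than `limit()`, which gives the same value here because the position starts at zero.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class, not part of the original answer.
final class SecureUtf8 {

    // Encodes chars as UTF-8 and returns an exactly sized byte array.
    static byte[] toUtf8Bytes(char[] chars) {
        ByteBuffer buf = StandardCharsets.UTF_8.encode(CharBuffer.wrap(chars));
        byte[] out = new byte[buf.remaining()]; // bytes between position and limit
        buf.get(out);
        // Assumption: we want no stray copy of the data left behind.
        // Only possible when the buffer exposes a writable backing array.
        if (buf.hasArray()) {
            Arrays.fill(buf.array(), (byte) 0);
        }
        return out;
    }

    public static void main(String[] args) {
        char[] secret = { 'h', '\u00e9', 'l', 'l', 'o', 'w', 'o', 'r', 'l', 'd', '!' };
        byte[] encoded = toUtf8Bytes(secret);
        System.out.println("encoded length = " + encoded.length);
        // Clear both arrays when they are no longer needed.
        Arrays.fill(secret, '\0');
        Arrays.fill(encoded, (byte) 0);
    }
}
```

The caller remains responsible for clearing the input char array and the returned byte array after use, as the `main` method does.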
