Java: Efficiently converting an array of longs to an array of bytes

I have an array of longs I want to write to disk. The most efficient disk I/O functions take in byte arrays, for example:

FileOutputStream.write(byte[] b, int offset, int length)

...so I want to begin by converting my long[] to byte[] (8 bytes for each long). I'm struggling to find a clean way to do this.

Direct typecasting is not allowed:

ConversionTest.java:6: inconvertible types
found   : long[]
required: byte[]
    byte[] byteArray = (byte[]) longArray;
                            ^

It's easy to do the conversion by iterating over the array, for example:

ByteBuffer bytes = ByteBuffer.allocate(longArray.length * (Long.SIZE/8));
for( long l: longArray )
{
    bytes.putLong( l );
}
byte[] byteArray = bytes.array();

...however, that seems far less efficient than simply treating the long[] as a series of bytes.

Interestingly, when reading the file, it's easy to "cast" from byte[] to longs using Buffers:

LongBuffer longs = ByteBuffer.wrap(byteArray).asLongBuffer();

...but I can't seem to find any functionality to go in the opposite direction.

I understand there are endianness considerations when converting from long to byte, but I believe I've already addressed those: I'm using the Buffer framework shown above, which defaults to big endian regardless of the native byte order.

No, there is not a trivial way to convert from a long[] to a byte[].

Your best option is likely to wrap your FileOutputStream with a BufferedOutputStream and then write out the individual byte values for each long (using bitwise operators).
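As a rough illustration of that first option, here is a minimal sketch. The class and method names (LongStreamWriter, writeLongs, toBytes, writeToFile) are made up for this example, and it assumes big-endian output so that the result can be read back with ByteBuffer.wrap(...).asLongBuffer() as in the question.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical helper class, not part of any library.
class LongStreamWriter {

    // Writes each long as 8 bytes, most significant byte first (big endian).
    static void writeLongs(OutputStream out, long[] values) throws IOException {
        for (long v : values) {
            for (int shift = 56; shift >= 0; shift -= 8) {
                // Shift the wanted byte into the low 8 bits and mask off the rest.
                out.write((int) ((v >>> shift) & 0xFF));
            }
        }
    }

    // In-memory variant, handy for checking the byte layout.
    static byte[] toBytes(long[] values) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream(values.length * Long.BYTES);
        try (OutputStream out = new BufferedOutputStream(sink)) {
            writeLongs(out, values);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    // File variant: the BufferedOutputStream batches the single-byte writes.
    static void writeToFile(String fileName, long[] values) throws IOException {
        try (OutputStream out = new BufferedOutputStream(new FileOutputStream(fileName))) {
            writeLongs(out, values);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = toBytes(new long[] {1L, 2L, 3L});
        System.out.println(bytes.length + " bytes written");
    }
}
```

The buffering matters here: without the BufferedOutputStream, every single-byte write would go straight to the underlying FileOutputStream.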

Another option is to create a ByteBuffer, put your long values into it, and then write that to a FileChannel. This handles the endianness conversion for you, but makes the buffering more complicated.

Concerning efficiency, many of these details will hardly make a difference. The hard disk is by far the slowest part involved here: in the time it takes to write a single byte to the disk, you could have converted thousands or even millions of longs to bytes. A performance test of the whole operation will therefore tell you little about the performance of the implementation and mostly measure the performance of the hard disk. When in doubt, one should write dedicated benchmarks that compare the different conversion strategies and the different writing methods separately.

Assuming that the main goal is a convenient conversion that does not impose unnecessary overhead, I'd like to propose the following approach:

One can create a ByteBuffer of sufficient size, view it as a LongBuffer, use the bulk LongBuffer#put(long[]) method (which takes care of endianness conversions, if necessary, and does so as efficiently as it can), and finally write the original ByteBuffer (which is now filled with the long values) to the file using a FileChannel.

Following this idea, I think that this method is convenient and (most likely) rather efficient:

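import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

private static void bulkAndChannel(String fileName, long[] longArray)
{
    ByteBuffer bytes =
        ByteBuffer.allocate(longArray.length * Long.BYTES);
    // Native order can avoid byte swapping in the bulk put, but the file must
    // then be read back with the same order (ByteBuffer defaults to big endian).
    bytes.order(ByteOrder.nativeOrder()).asLongBuffer().put(longArray);
    try (FileOutputStream fos = new FileOutputStream(fileName))
    {
        // The LongBuffer view has its own position, so 'bytes' still starts at 0.
        fos.getChannel().write(bytes);
    }
    catch (IOException e)
    {
        e.printStackTrace();
    }
}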

(Of course, one could argue about whether allocating a "large" buffer is the best idea. But thanks to the convenience methods of the Buffer classes, this could be modified with reasonable effort to write "chunks" of data of an appropriate size, for the case that one really wants to write a huge array and the memory overhead of creating the corresponding ByteBuffer would be prohibitively large.)
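A chunked variant could look roughly like the sketch below. The names ChunkedLongWriter, writeChunked, writeToFile and the chunkLongs parameter are invented for this example. Unlike the method above, it leaves the buffer at its default big-endian order so that the file matches the reading code in the question; that choice is an assumption, not something this answer prescribes.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.LongBuffer;
import java.nio.channels.WritableByteChannel;

// Hypothetical helper class, not part of any library.
class ChunkedLongWriter {

    // Writes 'values' through one reusable buffer holding at most 'chunkLongs' longs.
    static void writeChunked(WritableByteChannel channel, long[] values, int chunkLongs) {
        ByteBuffer bytes = ByteBuffer.allocate(chunkLongs * Long.BYTES); // big endian by default
        LongBuffer longs = bytes.asLongBuffer(); // view sharing the same memory
        try {
            for (int offset = 0; offset < values.length; offset += chunkLongs) {
                int count = Math.min(chunkLongs, values.length - offset);
                longs.clear();                    // reset the view for the next chunk
                longs.put(values, offset, count); // bulk copy of this chunk
                bytes.clear();                    // position 0, limit = capacity
                bytes.limit(count * Long.BYTES);  // the last chunk may be shorter
                while (bytes.hasRemaining()) {    // a channel may write only part of the buffer
                    channel.write(bytes);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // File variant, using the channel of a FileOutputStream as in the method above.
    static void writeToFile(String fileName, long[] values, int chunkLongs) throws IOException {
        try (FileOutputStream fos = new FileOutputStream(fileName)) {
            writeChunked(fos.getChannel(), values, chunkLongs);
        }
    }
}
```

With this, the memory overhead is bounded by chunkLongs * 8 bytes regardless of the array length.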

OP here.

I have thought of one approach: ByteBuffer.asLongBuffer() returns an instance of ByteBufferAsLongBufferB, a class which wraps a ByteBuffer in an interface for treating the data as longs while properly managing endianness. I could extend ByteBufferAsLongBufferB and add a method to return the raw byte buffer (which is protected).

But this seems so esoteric and convoluted that I feel there must be an easier way. Either that, or something in my approach is flawed.
