Copying part of a File to a byte array in Java

Is there any way to read just part of a file into a byte array?

I would like to know how, because I have only found ways to convert the whole file into a byte array, which is a very memory-hungry operation.

I'd use RandomAccessFile:

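import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

public static byte[] readFileSegment(File file, long index, int count) throws IOException {
    // Open read-only; try-with-resources closes the file even if a read fails.
    try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
        byte[] buffer = new byte[count];
        // Jump straight to the start of the segment without reading what precedes it.
        raf.seek(index);
        // readFully throws EOFException if fewer than count bytes remain.
        raf.readFully(buffer, 0, count);
        return buffer;
    }
}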

There are other alternatives, such as memory-mapped files or NIO, but this should be simple.

You can use a RandomAccessFile, or a FileInputStream with skip.
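As a rough sketch of the FileInputStream-with-skip option: the class name `SkipReader` and the method `readSegment` are made up for illustration, and the code assumes the segment lies entirely within the file. Both `skip` and `read` are allowed to process fewer bytes than requested, so each is called in a loop.

```java
import java.io.EOFException;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

final class SkipReader {
    // Hypothetical helper: returns `count` bytes starting at byte `offset`.
    static byte[] readSegment(File file, long offset, int count) throws IOException {
        try (InputStream in = new FileInputStream(file)) {
            // skip() may skip fewer bytes than asked for, so keep going until done.
            long remaining = offset;
            while (remaining > 0) {
                long skipped = in.skip(remaining);
                if (skipped <= 0) {
                    throw new EOFException("could not skip to offset " + offset);
                }
                remaining -= skipped;
            }
            // read() may also return fewer bytes than asked for, so fill in a loop.
            byte[] buffer = new byte[count];
            int filled = 0;
            while (filled < count) {
                int n = in.read(buffer, filled, count - filled);
                if (n < 0) {
                    throw new EOFException("file ended after " + filled + " bytes");
                }
                filled += n;
            }
            return buffer;
        }
    }
}
```

Calling `SkipReader.readSegment(file, 3, 4)` would return the four bytes at positions 3 to 6. Only the requested `count` bytes are ever held on the heap.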

My favourite is to use a memory-mapped file, as it only loads into memory what you actually use and uses almost no heap (regardless of how much of the file you use).
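A minimal sketch of the memory-mapped approach, assuming the requested region lies within the file. `MappedReader` and `readSegment` are illustrative names, not part of any library. Only the requested region is mapped; copying it into a `byte[]` does allocate `count` bytes on the heap, so read directly from the `MappedByteBuffer` if you want to avoid even that.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class MappedReader {
    // Hypothetical helper: maps `count` bytes starting at `offset` and copies them out.
    static byte[] readSegment(Path path, long offset, int count) throws IOException {
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            // Map only the region of interest, read-only.
            MappedByteBuffer mapped =
                    channel.map(FileChannel.MapMode.READ_ONLY, offset, count);
            // Copy the mapped bytes into an ordinary array for the caller.
            byte[] bytes = new byte[count];
            mapped.get(bytes);
            return bytes;
        }
    }
}
```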

This actually worked for me, as I needed to get the beginning of a file up to a certain byte size:

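import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

// otherFile (source) and targetFile (destination) are defined elsewhere.
int myDesiredCapacity = 100000; // 100KB
try (FileInputStream is = new FileInputStream(otherFile);
     FileChannel fileChannel = is.getChannel())
{
    ByteBuffer byteBuffer = ByteBuffer.allocate(myDesiredCapacity);

    // A single read may return fewer bytes than requested, so keep reading
    // until the buffer is full or the end of the file is reached.
    while (byteBuffer.hasRemaining() && fileChannel.read(byteBuffer) >= 0)
    {
        // nothing to do; read() advances the buffer position
    }

    if (byteBuffer.position() > 0)
    {
        byteBuffer.flip();
        // Copy only the bytes actually read; array() would return the whole
        // backing array, including unused trailing zeros.
        byte[] bytes = new byte[byteBuffer.remaining()];
        byteBuffer.get(bytes);
        try (FileOutputStream fileOutputStream = new FileOutputStream(targetFile))
        {
            fileOutputStream.write(bytes);
        }
    }
}
catch (IOException e)
{
    e.printStackTrace();
}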

Sure - you perform roughly the same steps as you would to read the whole file into a byte array, except that you throw away the parts of the file you don't want instead of adding them to the array.

This raises the very relevant question of how you will detect the boundaries of the section you want. If it's something like a particular sentinel byte (e.g. 11111111) then this is trivially easy. It's even easier if you want some fixed offsets: just discard the first n bytes and then store the following m bytes.
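For the sentinel-byte case, a sketch might look like the following. It assumes the section you want runs from the current stream position up to (but not including) the first sentinel byte; `SentinelReader` and `readUntil` are hypothetical names. The sentinel is passed as an int in the range 0 to 255 because that is what `InputStream.read()` returns.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class SentinelReader {
    // Hypothetical helper: collects bytes until `sentinel` (0..255) or end of stream.
    static byte[] readUntil(InputStream in, int sentinel) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int b;
        // read() returns -1 at end of stream, otherwise an unsigned byte value.
        while ((b = in.read()) != -1 && b != sentinel) {
            out.write(b);
        }
        return out.toByteArray();
    }
}
```

When reading from a file, wrap the `FileInputStream` in a `BufferedInputStream` first, since this reads one byte at a time.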

If, however, you're detecting some sort of byte sequence then things get a little more involved, as you'll have to keep the part that's matched so far in a local variable, then discard it if it turns out not to match the whole sequence, or add it all to the output if the whole sequence does match.
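One way to handle the byte-sequence case is sketched below, under the assumption that you want everything before the first occurrence of a delimiter sequence. The names `SequenceReader` and `readUntilSequence` are invented for this example. It uses a Knuth-Morris-Pratt style failure table, which is a swapped-in technique rather than anything the answer prescribes, so that a partial match that fails is handed back to the output without missing an overlapping match. Here the held-back bytes are the candidate delimiter: they are written to the output when the match fails and dropped when it completes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class SequenceReader {
    // Hypothetical helper: returns the bytes that precede the first occurrence
    // of `pattern`, or the whole stream if the pattern never occurs.
    static byte[] readUntilSequence(InputStream in, byte[] pattern) throws IOException {
        if (pattern.length == 0) {
            throw new IllegalArgumentException("pattern must not be empty");
        }
        // fail[i] = length of the longest proper prefix of pattern[0..i]
        // that is also a suffix of it (standard KMP failure table).
        int[] fail = new int[pattern.length];
        for (int i = 1, k = 0; i < pattern.length; i++) {
            while (k > 0 && pattern[i] != pattern[k]) {
                k = fail[k - 1];
            }
            if (pattern[i] == pattern[k]) {
                k++;
            }
            fail[i] = k;
        }

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int matched = 0; // how many pattern bytes are currently held back
        int b;
        while ((b = in.read()) != -1) {
            byte value = (byte) b;
            while (matched > 0 && pattern[matched] != value) {
                // Partial match failed: release the bytes that can no longer
                // be the start of a match, keep the rest as a shorter match.
                int shorter = fail[matched - 1];
                out.write(pattern, 0, matched - shorter);
                matched = shorter;
            }
            if (pattern[matched] == value) {
                matched++;
                if (matched == pattern.length) {
                    return out.toByteArray(); // delimiter found, not included
                }
            } else {
                out.write(b);
            }
        }
        // End of stream in the middle of a match: the held bytes are ordinary data.
        out.write(pattern, 0, matched);
        return out.toByteArray();
    }
}
```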

At the end of the day though, the concept is straightforward: read the bytes from the file as you would if you were loading all of it, but only put them in the output array if they're part of the section you want.
