
Read/Write Bytes to and From a File Using Only Java.IO

How can we write a byte array to a file (and read it back from that file) in Java?

Yes, we all know there are already lots of questions like that, but they get very messy and subjective because there are so many ways to accomplish this task.

So let's reduce the scope of the question:

Domain:

  • Android / Java

What we want:

  • Fast (as possible)
  • Bug-free (in a rigidly meticulous way)

What we are not doing:

  • Third-party libraries
  • Any libraries that require Android API later than 23 (Marshmallow)

(So, that rules out Apache Commons, Google Guava, Java.nio, and leaves us with good ol' Java.io)

What we need:

  • Byte array is always exactly the same (content and size) after going through the write-then-read process
  • Write method only requires two arguments: File file, and byte[] data
  • Read method returns a byte[] and only requires one argument: File file

In my particular case, these methods are private (not a library) and are NOT responsible for the following (but if you want to create a more universal solution that applies to a wider audience, go for it):

  • Thread-safety (file will not be accessed by more than one process at once)
  • File being null
  • File pointing to non-existent location
  • Lack of permissions at the file location
  • Byte array being too large
  • Byte array being null
  • Dealing with any "index," "length," or "append" arguments/capabilities

So... we're sort of in search of the definitive bullet-proof code that people in the future can assume is safe to use because your answer has lots of up-votes and there are no comments that say, "That might crash if..."

This is what I have so far:

Write Bytes To File:

private void writeBytesToFile(final File file, final byte[] data) {
    // try-with-resources closes the stream even if write() throws
    try (FileOutputStream fos = new FileOutputStream(file)) {
        fos.write(data);
    } catch (Exception e) {
        Log.i("XXX", "BUG: " + e);
    }
}

Read Bytes From File:

private byte[] readBytesFromFile(final File file) {
    byte[] bytesToReturn = new byte[(int) file.length()];
    // try-with-resources closes the file even if readFully() throws
    try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
        raf.readFully(bytesToReturn);
    } catch (Exception e) {
        Log.i("XXX", "BUG: " + e);
    }
    return bytesToReturn;
}

From what I've read, the possible Exceptions are:

FileNotFoundException: Am I correct that this should not happen as long as the file path being supplied was derived using Android's own internal tools and/or if the app was tested properly?

IOException: I don't really know what could cause this... but I'm assuming that there's no way around it if it does.

So with that in mind... can these methods be improved or replaced, and if so, with what?

The simplest option is to use Java's Files.readAllBytes:

Reads all the bytes from a file. The method ensures that the file is closed when all bytes have been read or an I/O error, or other runtime exception, is thrown. Note that this method is intended for simple cases where it is convenient to read all bytes into a byte array. It is not intended for reading in large files.

Files.write

Writes bytes to a file. The options parameter specifies how the file is created or opened. If no options are present then this method works as if the CREATE, TRUNCATE_EXISTING, and WRITE options are present. In other words, it opens the file for writing, creating the file if it doesn't exist, or initially truncating an existing regular-file to a size of 0. All bytes in the byte array are written to the file. The method ensures that the file is closed when all bytes have been written (or an I/O error or other runtime exception is thrown). If an I/O error occurs then it may do so after the file has been created or truncated, or after some bytes have been written to the file.

Usage example: By default the method creates a new file or overwrites an existing file. Suppose you instead want to append bytes to an existing file:

Path path = ...
byte[] bytes = ...
Files.write(path, bytes, StandardOpenOption.APPEND);
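Note that Files and Path live in java.nio.file, which Android only provides from API 26, so they fall outside the question's constraints. As a rough java.io counterpart to the append example, the two-argument FileOutputStream constructor takes an append flag. This is only a sketch; the class and method names are illustrative and not part of the answer.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

public class AppendBytes {
    /** Appends data to the end of file, creating the file if it does not exist. */
    public static void append(File file, byte[] data) throws IOException {
        // the second constructor argument 'true' opens the file in append mode
        try (FileOutputStream out = new FileOutputStream(file, true)) {
            out.write(data);
        }
    }
}
```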

I suggest using a BufferedInputStream and a BufferedOutputStream with an appropriate buffer size.

Write to file

public static void write(final File file, final byte[] bytes) {
    try (final BufferedOutputStream bos = new BufferedOutputStream(new FileOutputStream(file), 64 * 1024)) {
        bos.write(bytes);
    } catch (final Exception ex) {
        Log.w("XXX", "Cannot write to file: " + ex);
    }
}

Read from file

public static byte[] read(final File file) {
    byte[] bytes = new byte[(int) file.length()];  // can only read up to 2GB
    try (final BufferedInputStream bis = new BufferedInputStream(new FileInputStream(file), 64 * 1024)) {
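        int off = 0;
        // a single read() may return fewer bytes than requested, so loop until full or EOF
        while (off < bytes.length) {
            final int n = bis.read(bytes, off, bytes.length - off);
            if (n < 0) break;
            off += n;
        }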
    } catch (final Exception ex) {
        Log.w("XXX", "Cannot read from file: " + ex);
    }
    return bytes;
}

The buffer size is important for performance. I used a 64K buffer, but you can play around with it and choose a buffer size that fits your needs. It should be at least 8K, and ideally a multiple of the file system block size.

Also be careful when catching the general Exception, as it can also catch exceptions that indicate a larger problem than just a failure to write or read a file.
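A minimal sketch of that last point, assuming plain Java (System.err stands in for Android's Log) and a hypothetical class name: catching only IOException lets programming errors such as NullPointerException surface instead of being logged and swallowed.

```java
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

public class NarrowCatch {
    /** Returns true if all bytes were written, false on an I/O failure. */
    public static boolean write(File file, byte[] bytes) {
        try (BufferedOutputStream bos =
                 new BufferedOutputStream(new FileOutputStream(file), 64 * 1024)) {
            bos.write(bytes);
        } catch (IOException ex) {
            // only I/O failures (including FileNotFoundException) are handled here;
            // runtime exceptions propagate to the caller
            System.err.println("Cannot write to file: " + ex);
            return false;
        }
        return true;
    }
}
```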

It looks like these are going to be core utility/library methods which must run on Android API 23 or later.

Concerning library methods, I find it best to make no assumptions about how applications will use them. In some cases the application may want to receive checked IOExceptions (because data from a file must exist for the application to work); in other cases the application may not even care if data is not available (because data from a file is only a cache that is also available from a primary source).

When it comes to I/O operations, there is never a guarantee that operations will succeed (e.g. user dropping phone in the toilet). The library should reflect that and give the application a choice on how to handle errors.

To optimize I/O performance, always assume the "happy path" and catch errors to figure out what went wrong. This is counterintuitive compared with normal programming but essential in dealing with storage I/O. For example, just checking if a file exists before reading from it can make your application twice as slow - all these kinds of I/O actions add up fast and slow your application down. Just assume the file exists and, only if you get an error, check whether the file exists.
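The idea can be sketched roughly as follows. The method readIfPresent is hypothetical, and the sketch assumes the file is smaller than 2 GB: it opens the file straight away and only asks whether the file exists after the open has already failed.

```java
import java.io.EOFException;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;

public class HappyPath {
    /** Returns the file's bytes, or null if the file does not exist. */
    public static byte[] readIfPresent(File f) throws IOException {
        // happy path: no exists()/canRead() checks before opening
        try (FileInputStream in = new FileInputStream(f)) {
            byte[] data = new byte[(int) f.length()]; // assumes size < 2 GB
            int off = 0;
            while (off < data.length) {
                int n = in.read(data, off, data.length - off);
                if (n < 0) {
                    throw new EOFException("File shrank while reading: " + f.getName());
                }
                off += n;
            }
            return data;
        } catch (FileNotFoundException e) {
            // slow path: only now find out why the open failed
            if (!f.exists()) {
                return null;
            }
            throw e; // exists but cannot be opened (directory, permissions, ...)
        }
    }
}
```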

So given those ideas, the main functions could look like:

public static void writeFile(File f, byte[] data) throws FileNotFoundException, IOException {
    try (FileOutputStream out = new FileOutputStream(f)) {
        out.write(data);
    }
}

public static int readFile(File f, byte[] data) throws FileNotFoundException, IOException {
    try (FileInputStream in = new FileInputStream(f)) {
        return in.read(data); 
    }
}

Notes about the implementation:

  • The methods can also throw runtime exceptions like NullPointerException - these methods are never going to be "bug free".
  • I do not think buffering is needed or wanted in the methods above, since only one native call is done.
  • The application now also has the option to read only the beginning of a file.
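The last point can be sketched as follows. Because a single read call is allowed to return fewer bytes than requested, this hypothetical variant (readPrefix is not part of the answer) loops until the caller's buffer is full or the file ends.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

public class PrefixRead {
    /** Fills data from the start of f; returns the number of bytes actually read. */
    public static int readPrefix(File f, byte[] data) throws IOException {
        try (FileInputStream in = new FileInputStream(f)) {
            int off = 0;
            while (off < data.length) {
                int n = in.read(data, off, data.length - off);
                if (n < 0) {
                    break; // end of file reached before the buffer was full
                }
                off += n;
            }
            return off;
        }
    }
}
```

For example, readPrefix(f, new byte[8]) would fetch only the first 8 bytes of a file, such as a fixed-size header.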

To make it easier for an application to read a file, an additional method can be added. But note that it is up to the library to detect any errors and report them to the application since the application itself can no longer detect those errors.

public static byte[] readFile(File f) throws FileNotFoundException, IOException {
    int fsize = verifyFileSize(f);
    byte[] data = new byte[fsize];
    int read = readFile(f, data);
    verifyAllDataRead(f, data, read);
    return data;
}

private static int verifyFileSize(File f) throws IOException {
    long fsize = f.length();
    if (fsize > Integer.MAX_VALUE) {
        throw new IOException("File size (" + fsize + " bytes) for " + f.getName() + " too large.");
    }
    return (int) fsize;  // reuse the value that was just checked
}

public static void verifyAllDataRead(File f, byte[] data, int read) throws IOException {
    if (read != data.length) {
        throw new IOException("Expected to read " + data.length 
                + " bytes from file " + f.getName() + " but got only " + read + " bytes from file.");
    }
}
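As an alternative to checking the byte count by hand, java.io's DataInputStream.readFully keeps reading until the array is full and throws EOFException if the file ends early. A sketch under the same size assumptions (class and method names are illustrative, not from the answer):

```java
import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

public class ReadFully {
    /** Reads the whole file into a new array, failing if it cannot be filled. */
    public static byte[] readAll(File f) throws IOException {
        long size = f.length();
        if (size > Integer.MAX_VALUE) {
            throw new IOException("File size (" + size + " bytes) for " + f.getName() + " too large.");
        }
        byte[] data = new byte[(int) size];
        try (DataInputStream in = new DataInputStream(new FileInputStream(f))) {
            in.readFully(data); // throws EOFException if fewer bytes are available
        }
        return data;
    }
}
```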

This implementation adds another hidden point of failure: an OutOfMemoryError at the point where the new data array is created.

To accommodate applications further, additional methods can be added to help with different scenarios. For example, let's say the application really does not want to deal with checked exceptions:

public static void writeFileData(File f, byte[] data) {
    try {
        writeFile(f, data);
    } catch (Exception e) {
        fileExceptionToRuntime(e);
    }
}

public static byte[] readFileData(File f) {
    try {
        return readFile(f);
    } catch (Exception e) {
        fileExceptionToRuntime(e);
    }
    return null;
}

public static int readFileData(File f, byte[] data) {
    try {
        return readFile(f, data);
    } catch (Exception e) {
        fileExceptionToRuntime(e);
    }
    return -1;
}

private static void fileExceptionToRuntime(Exception e) {
    if (e instanceof RuntimeException) { // e.g. NullPointerException
        throw (RuntimeException)e;
    }
    RuntimeException re = new RuntimeException(e.toString());
    re.setStackTrace(e.getStackTrace());
    throw re;
}

The method fileExceptionToRuntime is a minimal implementation, but it shows the idea here.
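A slightly fuller variant, offered as a sketch rather than the answer's code, passes the original exception as the cause so that its type and stack trace survive in the "Caused by" output. The helper returns the exception instead of throwing it, so callers can write throw toRuntime(e); and need no dummy return statement afterwards. The names are hypothetical.

```java
public class Unchecked {
    /** Returns e itself if it is unchecked; otherwise wraps it, keeping e as the cause. */
    public static RuntimeException toRuntime(Exception e) {
        if (e instanceof RuntimeException) { // e.g. NullPointerException
            return (RuntimeException) e;
        }
        return new RuntimeException(e);
    }
}
```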

The library could also help an application to troubleshoot when an error does occur. For example, a method canReadFile(File f) could check if a file exists and is readable and is not too large. The application could call such a function after a file-read fails and check for common reasons why a file cannot be read. The same can be done for writing to a file.
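One possible shape for such a helper, as a sketch only: canReadFile is the name used above, while whyCannotRead and the reason strings are assumptions made for illustration.

```java
import java.io.File;

public class ReadDiagnostics {
    /** Returns null if nothing obvious prevents reading f, else a short reason. */
    public static String whyCannotRead(File f) {
        if (!f.exists()) {
            return "file does not exist";
        }
        if (f.isDirectory()) {
            return "path is a directory";
        }
        if (!f.canRead()) {
            return "file is not readable";
        }
        if (f.length() > Integer.MAX_VALUE) {
            return "file is too large for a byte array";
        }
        return null;
    }

    /** True if none of the common problems above were found. */
    public static boolean canReadFile(File f) {
        return whyCannotRead(f) == null;
    }
}
```

These checks are meant to run after a read has failed, so their cost is only paid on the error path.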

Although you can't use third party libraries, you can still read their code and learn from their experience. In Google Guava for example, you usually read a file into bytes like this:

FileInputStream reader = new FileInputStream("test.txt");
byte[] result = ByteStreams.toByteArray(reader);

The core implementation of this is toByteArrayInternal. Before calling this, you should check:

  • A non-null file is passed (NullPointerException)
  • The file exists (FileNotFoundException)

After that, it is reduced to handling an InputStream, and this is where IOExceptions come from. When reading streams, a lot of things outside the control of your application can go wrong (bad sectors and other hardware issues, malfunctioning drivers, OS access rights) and manifest themselves as an IOException.

I am copying the implementation here:

private static final int BUFFER_SIZE = 8192;

/** Max array length on JVM. */
private static final int MAX_ARRAY_LEN = Integer.MAX_VALUE - 8;

private static byte[] toByteArrayInternal(InputStream in, Queue<byte[]> bufs, int totalLen)
      throws IOException {
    // Starting with an 8k buffer, double the size of each successive buffer. Buffers are retained
    // in a deque so that there's no copying between buffers while reading and so all of the bytes
    // in each new allocated buffer are available for reading from the stream.
    for (int bufSize = BUFFER_SIZE;
        totalLen < MAX_ARRAY_LEN;
        bufSize = IntMath.saturatedMultiply(bufSize, 2)) {
      byte[] buf = new byte[Math.min(bufSize, MAX_ARRAY_LEN - totalLen)];
      bufs.add(buf);
      int off = 0;
      while (off < buf.length) {
        // always OK to fill buf; its size plus the rest of bufs is never more than MAX_ARRAY_LEN
        int r = in.read(buf, off, buf.length - off);
        if (r == -1) {
          return combineBuffers(bufs, totalLen);
        }
        off += r;
        totalLen += r;
      }
    }

    // read MAX_ARRAY_LEN bytes without seeing end of stream
    if (in.read() == -1) {
      // oh, there's the end of the stream
      return combineBuffers(bufs, MAX_ARRAY_LEN);
    } else {
      throw new OutOfMemoryError("input is too large to fit in a byte array");
    }
  }

As you can see, most of the logic has to do with reading the file in chunks. This is to handle situations where you don't know the size of the InputStream before starting to read. In your case, you only need to read files and you should be able to know the length beforehand, so this complexity could be avoided.

The other check is for OutOfMemoryError. In standard Java the limit is very large; on Android, however, it will be a much smaller value. Before trying to read the file, you should check that there is enough memory available.
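One rough way to make that check with only the standard Runtime API is sketched below. The method name and the idea of comparing the file size against current heap headroom are assumptions of this sketch; the figures are estimates, and the allocation can still fail even when the check passes.

```java
public class HeapCheck {
    /** Estimates whether a byte[] of the given size would fit in the remaining heap. */
    public static boolean probablyFits(long size) {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        long headroom = rt.maxMemory() - used; // heap the VM may still grow into
        return size >= 0 && size <= headroom;
    }
}
```

A caller could test probablyFits(file.length()) before allocating the array and report a clear error instead of risking an OutOfMemoryError.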

