Java IO: Reading a file that is still being written
I am creating a program which needs to read from a file that is still being written.
The main question is this: if the reads and writes are performed using InputStream and OutputStream classes running on separate threads, what are the catches and edge cases that I will need to be aware of in order to prevent data corruption?
In case anyone is wondering whether I have considered other, non-InputStream-based approaches, the answer is yes, I have, but unfortunately that is not possible in this project, since the program uses libraries that only work with InputStream and OutputStream.
Also, several readers have asked why this complication is necessary. Why not perform the reading after the file has been written completely?
The reason is efficiency. The program will perform the following:
If we are to wait for the whole file to be transferred, it will introduce a huge delay in what is supposed to be a real-time system.
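To illustrate the central catch here (this is an illustration, not part of the question): a plain InputStream over a regular file returns -1 at the *current* end of file, even though the writer may still append more data. The reader therefore has to distinguish "no data yet" from "writer finished". In this sketch, `writerDone` is an assumed out-of-band completion signal and the poll interval is an arbitrary choice; both are hypothetical, since the question does not say how completion is signalled.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.BooleanSupplier;

public class TailRead {
    // Read a file that another thread/process may still be appending to.
    // read() returning -1 only means "no more data *right now*", so we
    // retry until an out-of-band signal says the writer is finished.
    static byte[] readWhileWriting(Path file, BooleanSupplier writerDone)
            throws IOException, InterruptedException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            while (true) {
                int n = in.read(buf);
                if (n > 0) { out.write(buf, 0, n); continue; }
                if (writerDone.getAsBoolean()) {
                    // Drain anything written just before the flag flipped.
                    int m = in.read(buf);
                    if (m < 0) break;
                    out.write(buf, 0, m);
                } else {
                    Thread.sleep(20);   // nothing new yet; poll again
                }
            }
        }
        return out.toByteArray();
    }
}
```

Note that this only works for append-only writers; if the writer rewrites earlier parts of the file, the bytes already consumed are stale.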
Use a RandomAccessFile. Via getChannel or similar you could use a ByteBuffer.
You will not be able to "insert" or "delete" middle parts of the file. For such a purpose your original approach would be fine, but using two files.
For concurrency: to keep in sync you could maintain one single object model of the file and make changes there. Only the pending changes need to be kept in memory; other hierarchical data can be reread and reparsed as needed.
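The getChannel / ByteBuffer idea above might be sketched as follows; `drainNewBytes`, the caller-supplied buffer, and the resume-position convention are illustrative choices, not part of the answer:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class ChannelTail {
    // Poll a growing file via RandomAccessFile.getChannel(): remember the
    // last position consumed and read only the newly appended region each
    // pass, using positional reads so no seek state is shared.
    static long drainNewBytes(FileChannel ch, long from, ByteBuffer buf,
                              ByteArrayOutputStream sink) throws IOException {
        long pos = from;
        while (pos < ch.size()) {
            buf.clear();
            int n = ch.read(buf, pos);       // positional read at pos
            if (n <= 0) break;
            buf.flip();
            sink.write(buf.array(), 0, n);   // heap buffer, array offset 0
            pos += n;
        }
        return pos;                          // caller resumes here next poll
    }
}
```

The caller would typically open the file with `new RandomAccessFile(file, "r").getChannel()` and invoke this in a loop, sleeping or waiting on a notification between passes.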
You should use PipedInputStream and PipedOutputStream:
static Thread newCopyThread(InputStream is, OutputStream os) {
    Thread t = new Thread() {
        @Override
        public void run() {
            byte[] buffer = new byte[2048];
            try {
                // Copy until the source reports end-of-stream.
                while (true) {
                    int size = is.read(buffer);
                    if (size < 0) break;
                    os.write(buffer, 0, size);
                }
            } catch (IOException e) {
                e.printStackTrace();
            } finally {
                // Close both ends even on error, so the reading side sees EOF
                // instead of blocking forever.
                try { is.close(); } catch (IOException ignored) {}
                try { os.close(); } catch (IOException ignored) {}
            }
        }
    };
    return t;
}
public static void main(String[] args) throws IOException, InterruptedException {
    ByteArrayInputStream bi = new ByteArrayInputStream("abcdefg".getBytes());
    PipedInputStream is = new PipedInputStream();
    PipedOutputStream os = new PipedOutputStream(is);  // connect the pipe ends
    Thread p = newCopyThread(bi, os);                  // producer: source -> pipe
    Thread c = newCopyThread(is, System.out);          // consumer: pipe -> stdout
    p.start();
    c.start();
    p.join();
    c.join();
}
So your problem (as you've clarified it now) is that you can't start processing until chunk #1 has arrived, and you need to buffer every chunk #N (N > 1) until you can process it.
I would write each chunk to its own file and create a custom InputStream that reads the chunks in order. While downloading, the chunk file would be named something like chunk.1.downloading, and when the whole chunk has been loaded it would be renamed to chunk.1.
The custom InputStream will check whether file chunk.N exists (where N = 1...X). If not, it will block. Each time a chunk has been downloaded completely, the InputStream is notified, and it checks whether the downloaded chunk is the next one to be processed. If yes, it reads as normal; otherwise it blocks again.
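A minimal sketch of that custom InputStream, with two simplifications relative to the answer: it polls for the next chunk file instead of using the wait/notify scheme described, and it assumes the total chunk count is known up front (`totalChunks`), which the answer leaves open:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InterruptedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Concatenates files chunk.1, chunk.2, ... in order, blocking until the
// next chunk file appears (i.e. has been renamed from its ".downloading"
// name by the downloader).
public class ChunkInputStream extends InputStream {
    private final Path dir;
    private final int totalChunks;   // assumed known; the answer leaves this open
    private int next = 1;
    private InputStream current;

    public ChunkInputStream(Path dir, int totalChunks) {
        this.dir = dir;
        this.totalChunks = totalChunks;
    }

    @Override
    public int read() throws IOException {
        while (true) {
            if (current == null) {
                if (next > totalChunks) return -1;     // all chunks consumed
                Path chunk = dir.resolve("chunk." + next);
                while (!Files.exists(chunk)) {         // block until renamed
                    try { Thread.sleep(20); }
                    catch (InterruptedException e) { throw new InterruptedIOException(); }
                }
                current = Files.newInputStream(chunk);
                next++;
            }
            int b = current.read();
            if (b >= 0) return b;
            current.close();                           // chunk finished, advance
            current = null;
        }
    }
}
```

A production version would override `read(byte[], int, int)` as well for throughput, and replace the polling loop with the notification the answer describes; the rename-on-completion trick is what guarantees a chunk file is never observed half-written.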