
fopen and fwrite to the same file from multiple threads

Original post: 2018-01-04 18:56:55 · Tags: c++, c, multithreading

This is similar to existing questions, but slightly different. Say I have many threads that write to the same file, but each thread does its own fopen and keeps its own FILE pointer.

a) Is it necessary to lock the fwrite calls if each thread has its own FILE pointer?
b) If it is necessary, is locking around fwrite enough, or could the streams flush at different times and end up intermingling their output when they flush? If so, would locking around fwrite followed by fflush cover it?

2 answers

This question cannot be answered at the level of the programming language. As far as the language is concerned, those file handles are completely independent objects, and whatever you do with one has no effect whatsoever on another.

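The question is really about the operating system: can it handle multiple simultaneous write operations to the same underlying file? In other words, are those writes atomic? I can't speak for every OS. Note that the often-quoted rule that writes of at most PIPE_BUF bytes are atomic is something POSIX specifies for pipes and FIFOs, not for regular files. For regular files, what POSIX does specify is that with O_APPEND (fopen mode "a") the move to end-of-file and the write happen as one step. Also keep in mind that each fopen gets its own file offset, so streams opened in "w" or "r+" mode can overwrite each other's data.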
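To make the append-mode point concrete, here is a minimal sketch, assuming a POSIX-like system. The function name `append_records` and the record format are made up for illustration. Each caller opens its own stream in "a" mode and turns off stdio buffering so that every fwrite is handed to the OS right away. Whether a single fwrite maps to a single write() call is up to the C library, so treat this as an illustration of append semantics rather than a guarantee that records never interleave.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: appends `count` records through a private FILE*.
// Mode "a" means every write lands at the current end of the file, so
// independent streams do not overwrite each other's data.
// Returns the number of records written, or -1 if the file could not be opened.
int append_records(const std::string& path, int thread_id, int count) {
    std::FILE* fp = std::fopen(path.c_str(), "a");
    if (!fp) return -1;

    // Assumption: unbuffered mode, so data is not held back in a
    // per-stream buffer and flushed at some later, unrelated time.
    std::setvbuf(fp, nullptr, _IONBF, 0);

    int written = 0;
    for (int i = 0; i < count; ++i) {
        char line[64];
        int len = std::snprintf(line, sizeof line,
                                "thread %d record %d\n", thread_id, i);
        if (len > 0 &&
            std::fwrite(line, 1, static_cast<std::size_t>(len), fp) ==
                static_cast<std::size_t>(len)) {
            ++written;
        }
    }
    std::fclose(fp);
    return written;
}
```

Each worker thread would call `append_records` with its own id. No data is overwritten because every stream appends, but record boundaries are only as safe as the platform's write behavior.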

As a quick measure, yes, you can put a lock around the I/O part. That'll work, I guarantee it. As for flushing the I/O cache, I'd recommend not doing that. It's generally best to let the OS handle I/O timing, because the kernel knows best what is going on. The data won't necessarily take effect immediately after a flush call anyway, because the machinery behind it is complicated, just as with other flush-like operations (Java GC, glFlush, and so on). If you choose to stick with this option, be mindful of where the concurrent I/O starts and ends. You don't want a case where the main thread closes the file while a worker thread is still trying to do I/O on it.
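A lock around the I/O part might look like the sketch below. The names `g_file_mutex` and `locked_write` are hypothetical, and the streams are assumed to be opened in append mode ("a"). The fflush inside the lock is an assumption added here, not part of the advice above: with separate FILE pointers each stream has its own user-space buffer, so without it the buffered bytes could reach the file after the lock is released. fflush only hands that buffer to the kernel; it does not force a disk write, so the OS still decides the actual I/O timing.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <mutex>
#include <string>

// Hypothetical: one mutex shared by every thread that writes to the file.
std::mutex g_file_mutex;

// Writes one record through the caller's own FILE* while holding the lock.
// Returns true if the whole record was written and handed to the OS.
bool locked_write(std::FILE* fp, const std::string& record) {
    std::lock_guard<std::mutex> guard(g_file_mutex);
    std::size_t n = std::fwrite(record.data(), 1, record.size(), fp);
    // Push this stream's stdio buffer to the kernel before unlocking,
    // so another thread's record cannot land in the middle of this one.
    return n == record.size() && std::fflush(fp) == 0;
}
```

Every thread calls `locked_write` with its own stream. The streams must stay open for as long as any thread may still call it, which is the start/end-point caveat mentioned above.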

The general solution to this problem is to create a thread that handles the file exclusively. If other threads need to read from or write to the file, they must ask that thread to do it for them. This is tricky, I know. You'd need to put together a simple protocol and a sync mechanism, but in a nutshell it goes like this:

1. Prepare a queue, a condition variable (cv), and a lock. Create a thread and open the file; it doesn't matter who opens the file.
2. The thread starts and waits for the queue to be filled.
3. Other threads send I/O requests to the thread. Each request includes the data for the file and an opcode.
4. The thread handles the requests from the queue. This is where the real I/O happens.
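The steps above can be sketched roughly as follows. This is a minimal write-only version, so it skips the opcode; the class name `FileWriter` and its interface are hypothetical, and error handling is kept to a bare minimum.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdio>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>

// Hypothetical helper: one thread owns the FILE*, everyone else enqueues data.
class FileWriter {
public:
    explicit FileWriter(const std::string& path)
        : fp_(std::fopen(path.c_str(), "w")), worker_([this] { run(); }) {}

    ~FileWriter() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            done_ = true;
        }
        cv_.notify_one();
        worker_.join();              // the worker drains the queue first
        if (fp_) std::fclose(fp_);
    }

    FileWriter(const FileWriter&) = delete;
    FileWriter& operator=(const FileWriter&) = delete;

    // Called from any thread: hand the data over to the writer thread.
    void write(std::string data) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(data));
        }
        cv_.notify_one();
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            cv_.wait(lock, [this] { return done_ || !queue_.empty(); });
            if (queue_.empty()) return;  // only reachable once done_ is set
            std::string data = std::move(queue_.front());
            queue_.pop();
            lock.unlock();               // do the slow I/O without the lock
            if (fp_) std::fwrite(data.data(), 1, data.size(), fp_);
            lock.lock();
        }
    }

    std::FILE* fp_;
    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::string> queue_;
    bool done_ = false;
    std::thread worker_;  // declared last so the members above exist first
};
```

Worker threads share one `FileWriter` and call `write()`; only the internal thread ever touches the FILE pointer, so no lock is needed around fwrite itself. The queue in this sketch is unbounded, so it can grow without limit if the producers outpace the disk.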

You could use an anonymous pipe (FIFO) instead of a queue, or skip the opcode part if the file is write-only.

Unlike network I/O, regular-file I/O generally can't be done in a non-blocking manner on modern OSes, so expect significant blocking time (I/O wait). There's also the problem that the queue can fill up quickly and eat a lot of memory when I/O is relatively slow. And there will be cases where the whole program has to wait for the I/O to complete before terminating. There's not much you can do about that. On Linux you could close the file from another thread while I/O is in progress (close() is MT-safe); I don't know how that works on other OSes.

There are alternatives such as asynchronous file I/O or overlapped I/O, which involve signal handling or callbacks. Using these doesn't require creating a thread, but each has pros and cons, mostly regarding portability.