Intermittent file access errors in child processes

I am developing a compiler launcher of sorts, whose basic sequence of operations is as follows:

  1. Accept a batch of source files over a network connection and write them to a directory on the local file system.
  2. Execute a number of compiler launches (as child processes) over these files.
  3. Collect the files produced by the compilers and send them back.
  4. Optionally, clean up.

Although the compiler launches are parallelized, steps 1, 2, 3 and 4 are done strictly sequentially and do not overlap.

The problem is that on Windows Server 2008 R2 Enterprise the compilers intermittently complain that some files are missing or that permission is denied, e.g.:

some_file1.h(20) : fatal error C1083: Cannot open include file: 'some_file1.h': No such file or directory

or:

c1xx : fatal error C1083: Cannot open source file: 'some_file3.cpp': Permission denied

Usually, a failure pertaining to a given file is repeated across several compiler launches. I never experience these failures on the development machine.

All the files that the compilers report as missing are actually there when I check afterwards. I also keep a full log of compiler launches, including command lines, start directories and environment variables. When I rerun them manually, they run fine.

All this looks as if the operating system caches file system data in such a way that the freshest data is not always available to other processes (including children).

The code that writes a file looks like this:

bool Session::receive_file(
    unsigned long long file_size,
    std::string const& full_path)
{
    std::ofstream ofs(full_path, std::ios::binary | std::ios::trunc);

    if (!ofs)
    {
        skip_input(file_size);
        return false;
    }

    char buf[4096];

    while (file_size)
    {
        // Read at most one buffer's worth, or whatever remains of the file.
        std::streamsize s = sizeof buf;
        if (static_cast<unsigned long long>(s) > file_size)
            s = static_cast<std::streamsize>(file_size);

        read_buffer(buf, s);

        file_size -= s;

        if (!(ofs.write(buf, s)))
        {
            skip_input(file_size);
            return false;
        }
    }

    // close() flushes the stream's own buffer; a failed flush sets failbit.
    ofs.close();

    if (!ofs)
        return false;

    return true;
}

I tried to reimplement it with fopen / fwrite / fclose and with CreateFile / WriteFile / CloseHandle, but to no avail.

I also tried to open the freshly written file for reading at the end of this function, in the hope that it would bring the OS to its senses or help diagnose file access problems. Nothing changed; my own process always opened and read the file successfully, but child processes still experienced intermittent failures.
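The read-back check described above could be sketched roughly as follows. The helper name `can_read_back` and the size comparison are assumptions made for illustration, not the code from the original server. As noted, such a check only shows that the writing process itself can see the file; it did not prevent the failures in child processes.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper (not from the original post): returns true if `path`
// can be reopened for reading and contains exactly `expected_size` bytes.
bool can_read_back(std::string const& path, unsigned long long expected_size)
{
    std::ifstream ifs(path.c_str(), std::ios::binary);
    if (!ifs)
        return false;  // the file could not be opened at all

    char buf[4096];
    unsigned long long total = 0;

    // A short final read sets failbit but still reports its bytes via gcount().
    while (ifs.read(buf, sizeof buf) || ifs.gcount() > 0)
        total += static_cast<unsigned long long>(ifs.gcount());

    return total == expected_size;
}
```

It would be called at the end of receive_file with the original file size, before returning true.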

A delay of 250 ms inserted before spawning the compilers seems to seriously reduce the frequency of errors but does not eliminate them completely (and 250 ms is far too long anyway, because I have to handle interactive requests).

I also have similar trouble with the cleanup step: when removing files generated by the compilers I get "file in use" errors. That's of lesser priority to me, however.

I can't believe that what I am doing is unique. Actually, I am now reimplementing in C++ a server that was originally written in Perl. The Perl version suffered from certain stability problems, but not from this particular one. That means there are ways to keep the file system and child processes in sync.

I must be doing something wrong. What is it?

As described in this MSDN article, file reads and writes are in general cached by the operating system. One can turn the caching off by passing the FILE_FLAG_NO_BUFFERING flag to the CreateFile function.

If one does not control the code that creates the file, it is possible to tell the operating system to flush the file cache for a specific file by calling FlushFileBuffers.