Capture stdout in multithreaded program

I've got a function I need to call from a third-party library which I can't control. That function evaluates a command I pass in and prints its results to stdout. In my use case, I need to capture the results into a std::string variable (not write to a file), which I can do just fine in a single-threaded example:

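#include <cstdio>      // fflush
#include <string>
#include <sys/wait.h>  // waitpid
#include <unistd.h>    // pipe, fork, dup2, read, close, _exit

int fd[2];
pipe( fd );  // fd[0] is the read end, fd[1] the write end (error handling omitted)

pid_t pid = fork();
if ( pid == 0 )
{
    // Child: make stdout refer to the write end of the pipe.
    dup2( fd[1], STDOUT_FILENO );
    close( fd[0] );
    close( fd[1] );

    // This func will print the results I want to stdout, but I have no control over its code.
    festival_eval_command("(print utt2)");

    fflush( stdout );  // push any buffered output into the pipe
    _exit( 0 );        // leave the child without returning into the parent's code
}

// Parent: close the write end so read() reports end-of-file once the child exits.
close( fd[1] );

std::string RESULT;
char buffer[4096];
ssize_t length;
while ( ( length = read( fd[0], buffer, sizeof buffer ) ) > 0 )
    RESULT.append( buffer, length );  // append exactly the bytes that were read

close( fd[0] );
waitpid( pid, nullptr, 0 );  // reap the child

// RESULT now holds the contents that would have been printed out by festival_eval_command().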

Some constraints/details:

  • My program is multi-threaded, so other threads may be using stdout simultaneously (my understanding is that C++ ties the output from multiple threads into stdout).
  • The third-party library is Festival, an open-source speech synthesis library written in LISP (which I have no experience in). I'm using its C++ API by calling: festival_eval_command("(print utt2)");
  • festival_eval_command appears to use stdout, not std::cout (I've tested by redirecting both in a single-threaded program, and only the stdout redirection captures the output from utt2).
  • As far as I can tell from the source, festival_eval_command doesn't allow for an alternate file descriptor.
  • This function is only being run in one of the threads of my multithreaded program, so I'm only concerned about isolating the festival_eval_command output from the other threads' stdout.

My question: Is there a way I can safely retrieve just the results of festival_eval_command() from stdout in a multi-threaded program? It sounds like my options are:

  • Launch this function in a separate process, which has its own stdout. Do the IO redirection in that separate process, get the output I need and return it to my main program process. Is this correct? How would I go about doing this?
  • Use a mutex around festival_eval_command. I don't quite understand how mutexes interact with other threads though. If I have this example:
void do_stuff_simultaneously() {
    std::cout << "Printing output to terminal..." << std::endl;
}

// main thread
void do_stuff() {
    // launch a separate thread that may print to stdout
    std::thread t(do_stuff_simultaneously);

    // lock stdout somehow

    // redirect stdout to string variable
    festival_eval_command("(print utt2)");

    // unlock stdout

    t.join(); // a joinable std::thread must be joined (or detached) before it is destroyed
}

Does the locking of stdout prevent do_stuff_simultaneously from accessing it? Is there a way to make stdout thread-safe like this?

However, my program is multi-threaded, so other threads may be using stdout simultaneously

The outputs of the threads are going to be interleaved in a fashion you cannot control, unless each thread writes its entire output using one std::cout.write (see below for why).

Is there a way I can safely retrieve just the results of third_party_eval() from stdout in a multi-threaded program?

Each thread must run that function in a separate process, from which you capture its stdout into a std::string s (a different one for each process).

Then, in the parent process, write that std::string to stdout with std::cout.write(s.data(), s.size()). In common implementations std::cout.write takes a lock (to protect the stream from data races and corruption when multiple threads write to it in any way, including operator<<), so the output of one process is not interleaved with anything else. Note that the C++ standard itself only guarantees the absence of data races on the synchronized standard streams, not the absence of interleaved characters, so this relies on implementation behavior.

Note up front: This shows why globals are often a bad idea! Even more so, library code (i.e. code intended for re-use in different contexts) should never use globals. This is also something to tell the supplier of that code: they should fix their library to provide a version that at least takes an output file descriptor instead of writing to stdout.

Here's what I would consider doing: Move the whole function execution to a separate process. That way, if multiple threads need to run it, they will start separate processes with separate outputs that they can process independently.
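A minimal sketch of the separate-process approach, assuming a POSIX system. The helper name capture_in_child is hypothetical, and the callable stands in for the call to festival_eval_command. The sketch also assumes that whatever state the function needs already exists in the calling process before fork(), since the child starts as a copy of it.

```cpp
#include <cstdio>
#include <functional>
#include <stdexcept>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: runs `fn` in a forked child process and returns
// everything the child wrote to its own stdout.
std::string capture_in_child(const std::function<void()>& fn)
{
    int fd[2];
    if (pipe(fd) == -1)
        throw std::runtime_error("pipe failed");

    std::fflush(stdout);  // keep already-buffered parent output out of the child

    pid_t pid = fork();
    if (pid == -1) {
        close(fd[0]);
        close(fd[1]);
        throw std::runtime_error("fork failed");
    }

    if (pid == 0) {
        // Child: stdout now refers to the write end of the pipe.
        dup2(fd[1], STDOUT_FILENO);
        close(fd[0]);
        close(fd[1]);
        fn();
        std::fflush(stdout);
        _exit(0);  // do not run the parent's atexit handlers in the child
    }

    // Parent: drop the write end so read() returns 0 once the child exits.
    close(fd[1]);
    std::string result;
    char buffer[4096];
    ssize_t n;
    while ((n = read(fd[0], buffer, sizeof buffer)) > 0)
        result.append(buffer, static_cast<std::size_t>(n));
    close(fd[0]);
    waitpid(pid, nullptr, 0);  // reap the child
    return result;
}
```

Each thread can call this independently, for example capture_in_child([] { festival_eval_command("(print utt2)"); }). One caveat: fork() in a multithreaded process copies only the calling thread, so the child should not depend on locks that other threads might have held at the moment of the fork.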

An alternative way is to wrap this single function. The wrapper does all the IO redirection and (being a critical section) is guarded by a mutex, so that two threads invoking the wrapper will be serialized. However, this has downsides: while the wrapper runs, the redirection still affects the standard streams of your whole process, so a stray output call from any other thread would be mixed into the captured function output.
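The wrapper idea could be sketched roughly as follows on a POSIX system. The name capture_stdout_locked is hypothetical, and the callable stands in for the festival_eval_command call. A temporary file is used instead of a pipe (a choice made for this sketch) so that large outputs cannot block the writer.

```cpp
#include <cstdio>
#include <functional>
#include <mutex>
#include <stdexcept>
#include <string>
#include <unistd.h>

// Hypothetical wrapper: points file descriptor 1 at a temp file, runs `fn`,
// restores the original stdout and returns what was written in between.
std::string capture_stdout_locked(const std::function<void()>& fn)
{
    static std::mutex m;  // serializes callers of this wrapper only
    std::lock_guard<std::mutex> lock(m);

    std::FILE* tmp = std::tmpfile();
    if (!tmp)
        throw std::runtime_error("tmpfile failed");

    std::fflush(stdout);  // pending output still belongs to the real stdout
    int saved = dup(STDOUT_FILENO);
    if (saved == -1 || dup2(fileno(tmp), STDOUT_FILENO) == -1) {
        if (saved != -1)
            close(saved);
        std::fclose(tmp);
        throw std::runtime_error("redirect failed");
    }

    fn();  // anything written to stdout now lands in the temp file

    std::fflush(stdout);
    dup2(saved, STDOUT_FILENO);  // restore the original stdout
    close(saved);

    // Read the captured bytes back from the start of the temp file.
    std::string result;
    char buffer[4096];
    std::size_t n;
    std::rewind(tmp);
    while ((n = std::fread(buffer, 1, sizeof buffer, tmp)) > 0)
        result.append(buffer, n);
    std::fclose(tmp);
    return result;
}
```

The mutex only keeps two callers of the wrapper apart. A thread that prints through std::cout or printf without taking the same mutex still writes to file descriptor 1 and therefore ends up in the temp file while the redirection is active, which is exactly the downside described above. The sketch also does not restore stdout if fn throws.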

A second alternative is to put the function into a wrapper process whose only goal is to serialize the use of the function. You'd start that process on demand or at the start of your application and use some form of IPC to communicate with it.
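As a rough sketch of such a wrapper process, assuming POSIX pipes as the IPC mechanism: the class name EvalWorker and its tiny protocol (one command per line in, output terminated by a NUL byte out) are invented for illustration. It assumes commands contain no newline and the output contains no NUL byte. Where Festival gets initialized relative to the fork is left open here.

```cpp
#include <cstdio>
#include <functional>
#include <mutex>
#include <stdexcept>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical long-lived helper process. The handler runs in the child;
// its stdout is the reply pipe, and a NUL byte marks the end of each reply.
class EvalWorker {
public:
    explicit EvalWorker(const std::function<void(const std::string&)>& handler)
    {
        int req[2], rep[2];
        if (pipe(req) == -1 || pipe(rep) == -1)
            throw std::runtime_error("pipe failed");
        std::fflush(stdout);
        pid_ = fork();
        if (pid_ == -1)
            throw std::runtime_error("fork failed");
        if (pid_ == 0) {
            // Child: read commands from req, answer on rep via stdout.
            close(req[1]);
            close(rep[0]);
            dup2(rep[1], STDOUT_FILENO);
            close(rep[1]);
            std::FILE* in = fdopen(req[0], "r");
            if (!in)
                _exit(1);
            char* line = nullptr;
            size_t cap = 0;
            ssize_t len;
            while ((len = ::getline(&line, &cap, in)) > 0) {
                if (line[len - 1] == '\n')
                    --len;  // strip the line terminator
                handler(std::string(line, static_cast<std::size_t>(len)));
                std::fputc('\0', stdout);  // end-of-reply marker
                std::fflush(stdout);
            }
            _exit(0);  // parent closed the request pipe
        }
        close(req[0]);
        close(rep[1]);
        to_child_ = req[1];
        from_child_ = rep[0];
    }

    ~EvalWorker()
    {
        close(to_child_);  // the child sees end-of-file and exits
        close(from_child_);
        waitpid(pid_, nullptr, 0);
    }

    // Sends one command and returns the output it produced in the child.
    std::string eval(const std::string& command)
    {
        std::lock_guard<std::mutex> lock(m_);  // one request at a time
        const std::string msg = command + "\n";
        if (write(to_child_, msg.data(), msg.size()) != static_cast<ssize_t>(msg.size()))
            throw std::runtime_error("write failed");
        std::string result;
        char c;
        while (read(from_child_, &c, 1) == 1 && c != '\0')
            result.push_back(c);  // byte-wise for simplicity, not speed
        return result;
    }

private:
    std::mutex m_;
    pid_t pid_ = -1;
    int to_child_ = -1;
    int from_child_ = -1;
};
```

A single instance created early, for example EvalWorker worker([](const std::string& cmd) { festival_eval_command(cmd.c_str()); });, could then be shared by all threads, each calling worker.eval("(print utt2)"). Whether festival_eval_command accepts a const char* like this is an assumption of the example.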
