Capture stdout in multithreaded program
I've got a function I need to call from a third-party library which I can't control. That function evaluates a command I pass in and prints its results to stdout. In my use case, I need to capture the results into a std::string variable (not write to a file), which I can do just fine in a single-threaded example:
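#include <cstdio>
#include <string>
#include <sys/wait.h>
#include <unistd.h>

int fd[2];
if ( pipe( fd ) == -1 ) { /* handle the error */ }
pid_t pid = fork();
if ( pid == 0 )
{
    // Child: point stdout at the write end of the pipe.
    dup2( fd[1], STDOUT_FILENO );
    close( fd[0] );
    close( fd[1] );
    // This func will print the results I want to stdout, but I have no control over its code.
    festival_eval_command("(print utt2)");
    fflush( stdout ); // push any buffered output into the pipe
    _exit( 0 );       // there is nothing to exec, so end the child here
}
close( fd[1] );
// Read until the child closes its end; a single read() may return only part of the output.
std::string RESULT;
char buffer[4096];
ssize_t length;
while ( ( length = read( fd[0], buffer, sizeof buffer ) ) > 0 )
    RESULT.append( buffer, length );
close( fd[0] );
waitpid( pid, nullptr, 0 ); // reap the child
// RESULT now holds the contents that would have been printed out by festival_eval_command().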
Some constraints/detail:
- My program is multi-threaded, so other threads may be using stdout simultaneously (my understanding is that C++ ties the output from multiple threads into stdout).
- The call in question is festival_eval_command("(print utt2)");
- festival_eval_command appears to use stdout, not std::cout (I've tested by redirecting both in a single-threaded program, and only the stdout redirection captures the output from utt2).
- festival_eval_command doesn't allow for an alternate file descriptor.
- I need to isolate the festival_eval_command output from the other threads' stdout.
My question: Is there a way I can safely retrieve just the results of festival_eval_command() from stdout in a multi-threaded program? It sounds like my options are:
- Move the call into a separate process with its own stdout. Do the IO redirection in that separate process, get the output I need and return it back to my main program process.
- Use a mutex around festival_eval_command. I don't quite understand how mutexes interact with other threads though.

#include <iostream>
#include <thread>

// may run at the same time as do_stuff()
void do_stuff_simultaneously() {
    std::cout << "Printing output to terminal..." << std::endl;
}
// main thread
void do_stuff() {
    // launch a separate thread that may print to stdout
    std::thread t(do_stuff_simultaneously);
    // lock stdout somehow
    // redirect stdout to string variable
    festival_eval_command("(print utt2)");
    // unlock stdout
    t.join(); // a std::thread must be joined (or detached) before it is destroyed
}
Does the locking of stdout prevent do_stuff_simultaneously from accessing it? Is there a way to make stdout thread-safe like this?
"However, my program is multi-threaded, so other threads may be using stdout simultaneously"
The outputs of threads are going to be interleaved in a fashion you cannot control, unless each thread writes its entire output using one std::cout.write (see below for why).
"Is there a way I can safely retrieve just the results of third_party_eval() from stdout in a multi-threaded program?"
Each thread must run that function in a separate process, from which you capture its stdout into a std::string s (a different one for each process).
Then, in the parent process, you write that std::string into stdout with std::cout.write(s.data(), s.size()). std::cout.write locks a mutex (to protect itself from data races and corruption when multiple threads write into it in any way, including operator<<), so that the output of one process is not interleaved with anything else. Note that the C++ standard itself only promises that concurrent use of std::cout is free of data races, so the no-interleaving behaviour relies on the implementation's internal locking.
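A minimal sketch of that approach, assuming a POSIX system. The names noisy_third_party, capture_stdout_in_child and print_captured are made up for illustration; noisy_third_party merely stands in for the real library call. One caveat: fork() in a multi-threaded process copies only the calling thread, so whether the library call is safe to run in the child depends on what state it touches.

```cpp
#include <cassert>
#include <cstdio>
#include <iostream>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical stand-in for the third-party call: it writes to C stdio's stdout.
void noisy_third_party() {
    std::printf("result from third party\n");
}

// Runs fn in a forked child whose stdout is a pipe and returns everything the
// child printed. Returns an empty string if pipe() or fork() fails.
std::string capture_stdout_in_child(void (*fn)()) {
    assert(fn != nullptr);
    int fd[2];
    if (pipe(fd) == -1) return {};

    // Flush first so the child does not inherit (and re-emit) pending output.
    std::fflush(stdout);
    pid_t pid = fork();
    if (pid == -1) {
        close(fd[0]);
        close(fd[1]);
        return {};
    }
    if (pid == 0) {
        // Child: only this process's stdout is redirected.
        dup2(fd[1], STDOUT_FILENO);
        close(fd[0]);
        close(fd[1]);
        fn();
        std::fflush(stdout);  // push buffered stdio output into the pipe
        _exit(0);             // skip atexit handlers inherited from the parent
    }

    // Parent: read until the child closes its end of the pipe.
    close(fd[1]);
    std::string s;
    char buf[4096];
    ssize_t n;
    while ((n = read(fd[0], buf, sizeof buf)) > 0)
        s.append(buf, static_cast<std::size_t>(n));
    close(fd[0]);
    waitpid(pid, nullptr, 0);  // reap the child
    return s;
}

// Emits the captured text with a single write call, as suggested above.
void print_captured(const std::string& s) {
    std::cout.write(s.data(), static_cast<std::streamsize>(s.size()));
    std::cout.flush();
}
```

Each thread would call capture_stdout_in_child with its own function and get back its own std::string, then pass it to print_captured only if it actually wants the text on the terminal.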
Note up front: This shows why globals are often a bad idea! Even more, library code (i.e. code intended for re-use in different contexts) should never use globals. This is also something to tell the supplier of that code: they should fix their library to provide a version that at least takes an output file descriptor instead of writing to stdout.
Here's what I would consider doing: Move the whole function execution to a separate process. That way, if multiple threads need to run it, they will start separate processes with separate outputs that they can process independently.
An alternative way is to wrap this single function. This wrapper does all the IO redirection and it (being a critical section) is guarded by a mutex, so that two threads invoking the wrapper will be serialized. However, this has downsides, because in the meantime, that code still messes with your process' standard streams (so a stray call to output something from another thread would be mixed into the function output).
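A rough sketch of such a wrapper, assuming POSIX. The names capture_stdout_locked, g_stdout_capture_mutex and noisy_third_party are invented for this example. The mutex only serializes callers of the wrapper; any thread that prints without going through it while the redirection is active will still land in the captured text. A helper thread drains the pipe so that large outputs cannot fill it and block the call.

```cpp
#include <cassert>
#include <cstdio>
#include <mutex>
#include <string>
#include <thread>
#include <unistd.h>

// Hypothetical stand-in for the third-party call: it writes to C stdio's stdout.
void noisy_third_party() {
    std::printf("result from third party\n");
}

// One mutex for every caller of the wrapper (assumed name).
std::mutex g_stdout_capture_mutex;

// Redirects this process's stdout to a pipe while fn runs and returns what
// was written. Returns an empty string if pipe() or dup() fails.
std::string capture_stdout_locked(void (*fn)()) {
    assert(fn != nullptr);
    std::lock_guard<std::mutex> lock(g_stdout_capture_mutex);

    int fd[2];
    if (pipe(fd) == -1) return {};

    std::fflush(stdout);              // keep earlier output out of the capture
    int saved = dup(STDOUT_FILENO);   // remember the real stdout
    if (saved == -1) {
        close(fd[0]);
        close(fd[1]);
        return {};
    }
    dup2(fd[1], STDOUT_FILENO);
    close(fd[1]);                     // fd 1 is now the only write end

    // Drain the pipe concurrently so fn cannot block on a full pipe.
    std::string captured;
    const int read_end = fd[0];
    std::thread reader([&captured, read_end] {
        char buf[4096];
        ssize_t n;
        while ((n = read(read_end, buf, sizeof buf)) > 0)
            captured.append(buf, static_cast<std::size_t>(n));
    });

    fn();
    std::fflush(stdout);              // move buffered output into the pipe

    dup2(saved, STDOUT_FILENO);       // restore stdout; closes the write end
    close(saved);

    reader.join();                    // reader sees EOF once the write end is gone
    close(read_end);
    return captured;
}
```

Compared with the separate-process route, this keeps everything in one process but pays for it with the stray-output problem described above.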
A second alternative is to put the function into a wrapper process whose only goal is to serialize the use of the function. You'd start that process on demand or on start of your application and use some form of IPC to communicate with it.
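To make the wrapper-process idea a bit more concrete, here is one possible shape for it, assuming POSIX pipes as the IPC. Everything named here (EvalWorker, noisy_eval, the newline-terminated request and the NUL-terminated reply) is an assumption of this sketch, not something the library provides. Creating the worker before any threads are started sidesteps the usual concerns about fork() in a multi-threaded process.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>
#include <mutex>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical stand-in for the third-party evaluator: prints to stdout.
void noisy_eval(const char* command) {
    std::printf("evaluated: %s\n", command);
}

// Owns a child process that runs noisy_eval on request.
class EvalWorker {
public:
    EvalWorker() {
        int to_child[2], from_child[2];
        if (pipe(to_child) == -1) return;
        if (pipe(from_child) == -1) { close(to_child[0]); close(to_child[1]); return; }
        std::fflush(stdout);
        pid_ = fork();
        if (pid_ == -1) {
            close(to_child[0]); close(to_child[1]);
            close(from_child[0]); close(from_child[1]);
            return;
        }
        if (pid_ == 0) {
            // Child: stdout becomes the reply channel for its whole lifetime.
            dup2(from_child[1], STDOUT_FILENO);
            close(from_child[0]); close(from_child[1]);
            close(to_child[1]);
            serve(to_child[0]);
        }
        close(to_child[0]);
        close(from_child[1]);
        wr_ = to_child[1];
        rd_ = from_child[0];
    }

    ~EvalWorker() {
        if (wr_ != -1) close(wr_);   // EOF on the request pipe stops the worker
        if (rd_ != -1) close(rd_);
        if (pid_ > 0) waitpid(pid_, nullptr, 0);
    }

    // Sends one single-line command and returns the text the worker printed.
    // Returns an empty string if the worker could not be started.
    std::string eval(const std::string& command) {
        assert(command.find('\n') == std::string::npos);
        std::lock_guard<std::mutex> lock(mutex_);  // one request at a time
        if (wr_ == -1) return {};
        const std::string request = command + "\n";
        if (write(wr_, request.data(), request.size()) !=
            static_cast<ssize_t>(request.size())) return {};
        std::string reply;
        char c;
        while (read(rd_, &c, 1) == 1 && c != '\0') reply.push_back(c);
        return reply;
    }

private:
    // Worker loop: read a line, evaluate it, end the reply with a NUL byte.
    // Assumes commands fit in 1023 bytes and output contains no NUL bytes.
    [[noreturn]] static void serve(int request_fd) {
        FILE* in = fdopen(request_fd, "r");
        char line[1024];
        while (in != nullptr && std::fgets(line, sizeof line, in)) {
            line[std::strcspn(line, "\n")] = '\0';
            noisy_eval(line);
            std::fputc('\0', stdout);
            std::fflush(stdout);
        }
        _exit(0);
    }

    std::mutex mutex_;
    pid_t pid_ = -1;
    int wr_ = -1;
    int rd_ = -1;
};
```

Any thread can then call eval on a shared EvalWorker; the mutex inside keeps requests and replies paired, and the main process's own stdout is never touched.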