
Boost: is there an interprocess::message_queue-like mechanism for thread-only communication?

Original question, 2013-10-24 20:57:25. Tags: c++ / multithreading / boost

The boost::interprocess::message_queue mechanism seems primarily designed for just that: interprocess communication.

The problem is that it serializes the objects in the message:

"A message queue just copies raw bytes between processes and does not send objects."

This makes it completely unsuitable for fast, repeated interthread communication in which large composite objects are passed.

I want to create a message with a ref/shared_ptr/pointer to a known and previously created object and safely pass it from one thread to the next.

You can use asio::io_service and post with bind completions, but that's rather clunky and requires that the thread in question be using asio, which seems a bit odd.

I've already written my own, sadly based on asio::io_service, but would prefer to switch over to a Boost-supported general mechanism.

2 answers

You only need a mechanism designed for interprocess communication when you have separate processes, because separate processes have separate address spaces and you cannot simply pass pointers between them except in very special cases. For communication between threads you can use standard containers such as std::stack, std::queue and std::priority_queue; you just need to provide proper synchronization through mutexes. Or you can use lock-free containers, which Boost also provides. What else would you need for interthread communication?
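As a rough illustration of the "standard container plus mutex" approach, here is a minimal sketch using only the C++11 standard library. The names `BlockingQueue`, `Payload` and `run_demo` are made up for this example and are not part of Boost or the standard library. Each message is a `std::shared_ptr` to an existing object, so only the pointer is enqueued and the object itself is never copied or serialized.

```cpp
#include <condition_variable>
#include <cstddef>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical helper: a minimal blocking queue guarded by a mutex.
template <typename T>
class BlockingQueue {
public:
    void push(T value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(value));
        }
        ready_.notify_one();  // wake one waiting consumer
    }

    // Blocks until an item is available, then returns it.
    T pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !queue_.empty(); });
        T value = std::move(queue_.front());
        queue_.pop();
        return value;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<T> queue_;
};

// Hypothetical payload standing in for a "large composite object".
struct Payload {
    std::vector<int> samples;
};

// Sends `count` payloads to a consumer thread by shared_ptr and returns the
// total number of samples the consumer saw. A null pointer marks the end.
inline std::size_t run_demo(std::size_t count) {
    BlockingQueue<std::shared_ptr<const Payload>> queue;
    std::size_t total = 0;

    std::thread consumer([&] {
        while (auto msg = queue.pop()) {  // stops on the null sentinel
            total += msg->samples.size();
        }
    });

    for (std::size_t i = 0; i < count; ++i) {
        auto payload = std::make_shared<Payload>();
        payload->samples.assign(1000, static_cast<int>(i));
        queue.push(std::move(payload));  // only the pointer is enqueued
    }
    queue.push(nullptr);  // sentinel: tell the consumer to stop
    consumer.join();      // after join, reading `total` is safe
    return total;
}
```

Calling `run_demo(3)` pushes three payloads of 1000 samples each through the queue. Pointing the messages at `const Payload` is a design choice for this sketch: once an object has been handed over, neither thread modifies it, so no further locking is needed around the payload itself.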

Whilst I'm no expert in Boost per se, there is a fundamental difficulty in communicating between processes and threads via a pipe, message queue, etc., especially if it is assumed that a program's data is classes containing dynamically allocated memory (which is pretty much the case for things written with Boost; a string is not a simple object like it is in C...).

Copying of Data in Classes

Message queues and pipes are indeed just a way of passing a collection of bytes from one thread/process to another thread/process. Generally when you use them you want the destination thread to end up with a copy of the original data, not just a copy of the references to the data (which would point back at the original data).

With a simple C struct containing no pointers at all it's easy; a copy of the struct contains all the data, no problem. But a C++ class with complex data types like strings is a structure containing references / pointers to allocated memory. Copy that structure byte for byte and you haven't actually copied the data in the allocated memory.
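The difference can be seen in a small sketch. The struct and function names below are invented for illustration. A byte-wise copy of a pointer-free struct is a complete, independent copy, whereas a byte-wise copy of a struct holding a pointer still refers to the original buffer.

```cpp
#include <cstring>
#include <string>
#include <type_traits>

// Plain C-style struct: all of the data lives inside the struct itself.
struct FlatMessage {
    int id;
    char text[16];
};

// Struct holding a pointer: the characters live somewhere else.
struct PointerMessage {
    int id;
    const char* text;
};

static_assert(std::is_trivially_copyable<FlatMessage>::value,
              "a pointer-free struct can be copied as raw bytes");
static_assert(!std::is_trivially_copyable<std::string>::value,
              "std::string manages memory and is not a plain bag of bytes");

// True if a byte-wise copy of FlatMessage is unaffected by later changes
// to the original.
inline bool flat_copy_is_independent() {
    FlatMessage original{1, "hello"};
    FlatMessage copy;
    std::memcpy(&copy, &original, sizeof original);
    original.text[0] = 'J';  // modify the original after copying
    return std::strcmp(copy.text, "hello") == 0;
}

// True if a byte-wise copy of PointerMessage still points at the buffer
// used by the original, so changes to that buffer show through the copy.
inline bool pointer_copy_shares_buffer() {
    char buffer[16] = "hello";
    PointerMessage original{1, buffer};
    PointerMessage copy;
    std::memcpy(&copy, &original, sizeof original);
    buffer[0] = 'J';  // modify the pointed-to data after copying
    return copy.text == buffer && std::strcmp(copy.text, "Jello") == 0;
}
```

Both functions return `true`: the flat copy keeps its own "hello", while the pointer copy sees the modified buffer because only the pointer value was copied.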

That's where serialisation comes in. For interprocess communication, where the two processes can't ordinarily share the same memory, serialisation serves as a way of parcelling up the structure to be sent, plus all the data it refers to, into a stream of bytes that can be unpacked at the other end. For threads it's no different if you don't want the two threads accessing the same memory at the same time. Serialisation is a convenient way of saving yourself from having to navigate through a class to see exactly what needs to be copied.
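To make "parcelling up" concrete, here is a hand-rolled sketch for one hypothetical message type, `Reading`. The wire layout (a 4-byte id, a 4-byte length, then the label bytes, all in host byte order) is an assumption made for this example, not a format used by Boost or any particular library, and it is only suitable for use within a single machine.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical message type: the label's characters live on the heap.
struct Reading {
    std::uint32_t sensor_id;
    std::string label;
};

// Packs the struct and the data it refers to into one flat byte buffer.
// Assumed layout: sensor_id (4 bytes), label length (4 bytes), label bytes.
inline std::vector<unsigned char> serialise(const Reading& r) {
    const std::uint32_t length = static_cast<std::uint32_t>(r.label.size());
    std::vector<unsigned char> bytes(sizeof r.sensor_id + sizeof length + length);
    unsigned char* out = bytes.data();
    std::memcpy(out, &r.sensor_id, sizeof r.sensor_id);
    out += sizeof r.sensor_id;
    std::memcpy(out, &length, sizeof length);
    out += sizeof length;
    std::memcpy(out, r.label.data(), length);
    return bytes;
}

// Rebuilds an independent Reading from the byte buffer.
inline Reading deserialise(const std::vector<unsigned char>& bytes) {
    Reading r{};
    std::uint32_t length = 0;
    const std::size_t header = sizeof r.sensor_id + sizeof length;
    if (bytes.size() < header) {
        throw std::runtime_error("truncated header");
    }
    std::memcpy(&r.sensor_id, bytes.data(), sizeof r.sensor_id);
    std::memcpy(&length, bytes.data() + sizeof r.sensor_id, sizeof length);
    if (bytes.size() - header != length) {
        throw std::runtime_error("length mismatch");
    }
    r.label.assign(reinterpret_cast<const char*>(bytes.data()) + header, length);
    return r;
}
```

A `Reading{42, "boiler"}` becomes a 14-byte buffer, and `deserialise` yields a second `Reading` whose label owns its own memory, so sender and receiver never touch the same allocation.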

Efficiency

I don't know what Boost uses for serialisation, but clearly serialising to XML would be painfully inefficient. A binary serialisation like ASN.1 BER would be much faster.

Also, copying data through pipes and message queues is no longer as inefficient as it used to be. Traditionally programmers avoided it because of the perceived waste of time spent copying the data repeatedly just to share it with another thread. On a single-core machine that involves a lot of slow and wasteful memory accesses.

However, if one considers what "memory access" is in these days of QPI, HyperTransport, and so forth, it's not so very different from just copying the data in the first place. In both cases data is sent over a serial bus from one core's memory controller to another core's cache.

Today's CPUs are really NUMA machines with memory access protocols layered on top of serial networks to fake an SMP environment. Programming in the style of copying messages through pipes, message queues, etc. is definitely edging towards saying that you are content with the idea of NUMA, and that really you don't need SMP at all.

Also, if you do all your inter-thread communication through message queues, they're not so very different from pipes, and pipes aren't so different from network sockets (at least that's the case on Not-Windows). So if you write your code carefully you can end up with a program that can be redeployed across a distributed network of computers or across a number of threads within a single process. That's a nice way of getting scalability because you're not changing the shape or feel of your program in any significant way when you scale up.

Fringe Benefits

Depending on the serialisation technology used there can be some fringe benefits. With ASN.1 you specify a message schema in which you set out the valid ranges of the message's contents. You can say, for example, that a message contains an integer, and that it can have values between 0 and 10. The encoders and decoders generated by decent ASN.1 tools will automatically check that the data you're sending or receiving meets that constraint, and return errors if not.

I would be surprised if other serialisers like Google Protocol Buffers didn't do a similar constraints check for you.

The benefit is that if you have a bug in your program and you try to send an out-of-spec message, the serialiser will automatically spot that for you. That can save a ton of time in debugging. It is also something you definitely don't get if you share a memory buffer and protect it with a semaphore instead of using a message queue.
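The following hand-written sketch only imitates the idea of a schema constraint being enforced at the encode and decode boundary. It is not ASN.1, it is not the output of any ASN.1 tool, and every name in it is hypothetical. It uses the 0 to 10 integer range from the example above.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical schema constraint: the message carries an integer in [0, 10].
constexpr int kMinLevel = 0;
constexpr int kMaxLevel = 10;

// Throws if the value violates the constraint.
inline void check_level(int level) {
    if (level < kMinLevel || level > kMaxLevel) {
        throw std::out_of_range("level outside 0..10");
    }
}

// Encoder: refuses to put an out-of-spec value on the wire.
inline std::uint8_t encode_level(int level) {
    check_level(level);
    return static_cast<std::uint8_t>(level);
}

// Decoder: refuses to accept an out-of-spec value from the wire.
inline int decode_level(std::uint8_t byte) {
    const int level = byte;
    check_level(level);
    return level;
}

// Convenience wrapper: true if encoding the value is rejected.
inline bool is_rejected(int level) {
    try {
        encode_level(level);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

With this in place a buggy sender that tries to encode 11 fails immediately at the send site, rather than the bad value surfacing later in the receiving thread.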

CSP

Communicating Sequential Processes and the Actor model are based on sending copies of data through message queues, pipes, etc. just like you're doing. CSP in particular is worth paying attention to because it's a good way of avoiding a lot of the pitfalls of multi-threaded software that can lurk undetected in source code.

There are some CSP implementations you can just use. There's JCSP, a class library for Java, and C++CSP, built on top of Boost to do CSP for C++. They're both from the University of Kent.

C++CSP looks quite interesting. It has a template class called csp::mobile, which is kind of like a Boost smart pointer. If you send one of these from one thread to another via a channel (CSP's word for a message queue) you're sending the reference, not the data. However, the template records which thread 'owns' the data. So a thread receiving a mobile now owns the data (which hasn't actually moved), and the thread that sent it can no longer access it. So you get the benefits of CSP without the overhead of copying the data.
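I have not checked the csp::mobile API, so the sketch below does not use it. It shows a loosely analogous idea with the standard `std::unique_ptr`: moving the pointer into a queue hands ownership to the receiver, the sender's handle becomes null, and the object itself never moves or gets copied. `Channel`, `Document` and `ownership_moves_without_copy` are hypothetical names for this example.

```cpp
#include <memory>
#include <mutex>
#include <queue>
#include <string>
#include <utility>

// Hypothetical payload.
struct Document {
    std::string body;
};

// Hypothetical channel: a mutex-guarded queue of owning pointers.
class Channel {
public:
    // Takes ownership of the document from the caller.
    void send(std::unique_ptr<Document> doc) {
        std::lock_guard<std::mutex> lock(mutex_);
        queue_.push(std::move(doc));
    }

    // Hands ownership to the caller; returns nullptr if nothing is waiting.
    std::unique_ptr<Document> try_receive() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (queue_.empty()) {
            return nullptr;
        }
        std::unique_ptr<Document> doc = std::move(queue_.front());
        queue_.pop();
        return doc;
    }

private:
    std::mutex mutex_;
    std::queue<std::unique_ptr<Document>> queue_;
};

// True if, after sending, the sender's handle is empty and the receiver
// holds the very same object (same address, so nothing was copied).
inline bool ownership_moves_without_copy() {
    Channel channel;
    std::unique_ptr<Document> doc(new Document{"large payload"});
    const Document* address = doc.get();

    channel.send(std::move(doc));
    const bool sender_lost_access = (doc == nullptr);

    std::unique_ptr<Document> received = channel.try_receive();
    return sender_lost_access && received.get() == address &&
           received->body == "large payload";
}
```

The demonstration runs on one thread for brevity; in real use `send` and `try_receive` would be called from different threads. One difference from the description above is that `std::unique_ptr` tracks a single owner rather than an owning thread, so nothing stops the receiver from passing the pointer on again.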

It also looks like C++CSP is able to do channels over TCP; that's a very attractive feature, as it makes scaling up a really simple possibility. JCSP works over network connections too.