

C++/Linux: how do you write a thread-safe library that uses sockets?

Original post: 2012-11-11 14:09:49. Tags: c++, linux, multithreading, thread-safety, fork

I want to write a library in C++ under Linux that will help an application use a certain protocol (FastCGI, actually). The library will listen on a socket (either TCP or Unix), receive requests, forward them to user code, and send the responses generated by that user code.

There will be many connections on the socket, and each connection will carry many requests (possibly simultaneously: there is an interleaving mechanism). The user code (which uses the library) will most likely be multithreaded in order to process several requests in parallel.

I'd like my library to be robust and to make as few assumptions about, or requirements of, the user code as possible, including the type of multithreading used. As I understand it, the clone() function in Linux can fork a process in dozens of different ways: with or without shared memory, shared file handles, etc. The decision of HOW to implement multithreading should be left to the user.

And this confuses me, because the library code can suddenly find itself fork()'ed, and multiple copies of the code can suddenly be reading from the same socket and handling the same request. Even worse, the parent process might terminate, leaving only child processes, which in turn spawn more child processes, perhaps even in different process namespaces. It's a mess.

What are the Linux facilities that help to coordinate all the copies of the same code which need to access the same external resource (a socket)? What is the standard way of implementing such thread-safe libraries? Must I choose a threading model myself and impose that upon the consumers of my library?

2 answers

Don't use clone directly (leave clone to the implementors of threading libraries such as pthread). Don't use a lot of forks (probably none). Use pthreads.

You could look at the design of the libonion library. It is small and implements the HTTP server protocol, so it is quite close to your goals.

libonion gives its users various modes for creating, or not creating, threads for requests.

You could offer options similar to libonion's for creating, or not creating, a new thread for each FastCGI request.

You might want to use an event-loop library such as libevent or libev (built around a poll(2) loop).
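For orientation, one iteration of a hand-written poll(2) loop might look like the sketch below. The helper `poll_once` and its callback signature are made up for this example; libevent and libev wrap the same idea behind their own, different APIs.

```cpp
#include <poll.h>
#include <unistd.h>

#include <cerrno>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper: wait up to timeout_ms for any of `fds` to become
// readable, read what is available from each ready descriptor, and hand it
// to on_data. Returns the number of descriptors that delivered data,
// or -1 if poll() failed.
int poll_once(const std::vector<int>& fds, int timeout_ms,
              const std::function<void(int, const std::string&)>& on_data) {
    std::vector<pollfd> pfds;
    pfds.reserve(fds.size());
    for (int fd : fds) {
        pollfd p{};
        p.fd = fd;
        p.events = POLLIN;
        pfds.push_back(p);
    }

    int ready;
    do {
        ready = ::poll(pfds.data(), pfds.size(), timeout_ms);
    } while (ready < 0 && errno == EINTR);  // retry if a signal interrupted us
    if (ready < 0) return -1;

    int delivered = 0;
    for (const pollfd& p : pfds) {
        if (!(p.revents & POLLIN)) continue;
        char buf[4096];
        ssize_t n = ::read(p.fd, buf, sizeof buf);
        if (n > 0) {
            on_data(p.fd, std::string(buf, static_cast<std::size_t>(n)));
            ++delivered;
        }
    }
    return delivered;
}
```

A server would call this in a loop with its listening and connection descriptors; an event library additionally handles timers, write readiness, and descriptor bookkeeping.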

And read good books, notably Advanced Linux Programming, and a tutorial on pthreads before you start coding.

Also, study the source code of several free software libraries whose goals are similar to yours.

At the risk of seemingly going off at a tangent, I'd recommend implementing FastCGI on a single-thread-per-processor basis.

Reasons:

More robust.
Avoids the context-switching overhead associated with multithreading and protects you from issues like concurrency deadlocks.
Avoids process fork() costs (each one is quite light, but it all adds up) and spares you from dealing with zombie child processes, among other headaches.

This would leave you with the choice of implementing the FastCGI interface using:

Non-blocking synchronous I/O (Reactor design pattern): block until a read or write request comes in, pass the request to the appropriate handler, and then block until the next request comes in.
Asynchronous I/O (Proactor design pattern): pass read and write requests to the operating system, where the O/S supports I/O completion events. On Windows that would be I/O completion ports and on Linux something like epoll() (strictly speaking, epoll() reports readiness rather than completion, so it is more often used as the basis of a Reactor).