
How to avoid bad FD_SET buffer overflow crash?

Recently I have been bitten by the FD_SET buffer overflow twice. The first time, we had too many sockets (1024+) to add to the fd_set. That was a test case; we have disabled it and added an assert to detect this situation.

Today we hit a related issue when running a test case 1000+ times. On each run, the test case somehow triggers the allocation of a socket and releases it before the test case finishes. After 1000+ runs, this test case hits the FD_SET buffer overflow.

We have found the root cause:

  1. On each pass, the allocated socket id increases by 1; the socket id is not reused for a long time. The operating system is macOS, and I think this is a reasonable design, because it avoids silently using an already released socket.
  2. FD_SET sets a bit in the fd_set bit array using the socket id as the index, so if the socket id is large (FD_SETSIZE or above), it overflows. I think fd_set is a bad design.
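
The overflow described in item 2 can be caught with a bounds check before the macro is used, much like the assert mentioned above. A minimal sketch in C follows. The helper name fd_set_checked is hypothetical (it is not a system API), and the check only detects the problem; it does not lift the FD_SETSIZE limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/select.h>

/* Hypothetical helper (not a system API): add fd to *set only when its
 * bit lies inside the fixed-size array that fd_set provides. */
bool fd_set_checked(int fd, fd_set *set)
{
    assert(set != NULL);
    if (fd < 0 || fd >= FD_SETSIZE) {
        /* FD_SET(fd, set) would touch memory outside *set here. */
        return false;
    }
    FD_SET(fd, set);
    return true;
}
```

A caller can treat a false return as a signal to fail the operation cleanly, or to fall back to an interface that has no such limit.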

We think 1000+ is a reasonable number. And we don't think defining a macro to make fd_set huge is reasonable: it wastes memory, and CPU time when waiting.

We don't know how to resolve it, so any suggestions?

-------------Edit1----------------

It turns out there is a socket leak elsewhere, which violates the rule that a destructor should release all resources. This is what makes the socket id increase, so item #1 is not true: the operating system does reuse the socket id. But anyway, the discussion is helpful, FD_SET is a bad design, and we should be using poll().

This answer summarizes the solution found by the OP, and comments by rob mayoff and Joseph Quinsey.

If a program is not reusing a file descriptor (what you called a 'socket id'), it is not closing the file descriptor. Try running lsof on your test program when it has been running for a while. You will probably find many open sockets in the output. (But the OP says lsof -g PID doesn't seem to work on a debugged process.)
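
When lsof cannot be used on the process, a leak can also be spotted from inside the program by counting open descriptors before and after a test case. The sketch below is one possible way to do that; count_open_fds and its limit parameter are hypothetical names introduced for illustration, and the scan assumes that probing each number with fcntl(F_GETFD) is acceptable for a debugging aid.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>

/* Hypothetical diagnostic: count the descriptors in [0, limit) that are
 * currently open in this process. */
int count_open_fds(int limit)
{
    assert(limit >= 0);
    int count = 0;
    for (int fd = 0; fd < limit; fd++) {
        /* fcntl(F_GETFD) fails with EBADF when fd is not open. */
        if (fcntl(fd, F_GETFD) != -1 || errno != EBADF) {
            count++;
        }
    }
    return count;
}
```

If the count keeps growing from one test run to the next, some descriptor is being opened without a matching close.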

Alternatively, try netstat -a -p --inet | grep process-name-or-pid. (These are Linux net-tools options; the macOS netstat does not accept them in this form.)

On some systems, a simple close(fd) for a socket is sometimes not sufficient. If your socket file descriptors are constantly increasing, then the answers to "close() is not closing socket properly" might help.

To avoid the problem with FD_SETSIZE, several writers, for example in "Increasing limit of FD_SETSIZE and select", suggest using poll rather than select.
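
As a rough illustration of that suggestion, the sketch below waits for a single descriptor with poll() and then reads from it. The helper name read_with_timeout and its error convention (ETIMEDOUT on timeout) are assumptions made for this example, not anything from the original code; a real program would usually poll many descriptors at once and retry on EINTR.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: wait up to timeout_ms for fd to become readable,
 * then read from it. Returns the byte count from read(), or -1 with
 * errno set (ETIMEDOUT when nothing arrived in time). */
ssize_t read_with_timeout(int fd, void *buf, size_t len, int timeout_ms)
{
    assert(buf != NULL);
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };

    /* poll() receives the descriptor as a plain integer field, so a
     * large fd value does not index into a fixed-size bit array. */
    int rc = poll(&pfd, 1, timeout_ms);
    if (rc < 0) {
        return -1;              /* errno was set by poll() */
    }
    if (rc == 0) {
        errno = ETIMEDOUT;      /* nothing became readable in time */
        return -1;
    }
    if (pfd.revents & POLLNVAL) {
        errno = EBADF;          /* fd was not an open descriptor */
        return -1;
    }
    return read(fd, buf, len);
}
```

Because the array passed to poll() is sized by the caller, the number of descriptors being watched, not their numeric values, determines the memory used.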

Finally, the OP solved the issue:

It turned out there was a socket leak in another place, which violates the rule that a destructor should release all resources. And this made the socket id increase. Once fixed, the operating system reuses the socket id.

But anyway, the discussion is helpful, FD_SET is a bad design, and we should be using poll().

Note that Unix-like systems always (or usually) use the smallest available file descriptor. For example, the man page for open(2) states:

The file descriptor returned by a successful call will be the lowest-numbered file descriptor not currently open for the process.
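
The lowest-numbered rule can be observed directly. The following sketch opens a file, closes it, and opens it again, then reports whether the same descriptor number came back. The function name fd_number_is_reused is hypothetical, and the result is only meaningful when no other thread opens or closes descriptors in between.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical check: returns 1 if reopening path after a close yields
 * the same descriptor number, 0 if not, and -1 if open() fails.
 * Assumes no other thread touches descriptors during the call. */
int fd_number_is_reused(const char *path)
{
    assert(path != NULL);

    int first = open(path, O_RDONLY);
    if (first < 0) {
        return -1;
    }
    close(first);               /* frees that descriptor number */

    int second = open(path, O_RDONLY);
    if (second < 0) {
        return -1;
    }
    close(second);

    return first == second;
}
```

If the close(first) call is removed, the second open() has to pick a higher number, which is the steadily increasing pattern the OP saw while the leak was present.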

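Notice: technical posts on this site follow the CC BY-SA 4.0 license. If you repost, please credit this site's URL or the original address. For any questions, contact: yoyou2525@163.com.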

 