On Linux, when performing a socket bind with port 0 (pick a random port) using C, I get errno 98, Address already in use. How is that possible?

So, we have a long-standing, well-established commercial product, and I've never seen this type of issue before. We use a client program to send data to a server. Sometimes, because of firewalls in customer environments, we allow the end user to specify an outbound port range to bind to. In this particular case, however, we're not doing that and are calling bind with port 0. From everything I've read, this means the kernel picks a random port. What I can't find out is what that means to the kernel/OS: if I'm asking for a random port, how can it already be in use? Strictly speaking, only the unique combination of source IP/source port and destination IP/destination port makes a connection unique. I believe the same port can be reused when talking to a different destination IP, but maybe that's not relevant here.

Also, this doesn't happen on all of the customer's systems, only some, so it may be some form of load-related issue. I'm told the systems are fairly busy.

Here is the code we're using. I left out some of the ifdef code for Windows, and left out what we do after the bind for brevity.

    _SocketCreateClient(Socket_pwtP sock, SocketInfoP sInfo)
    {
        int nRetries;                       /* number of times to try connect()  */
        unsigned short port;
        BOOL success = FALSE;
        BOOL gotaddr = FALSE;
        char buf[INET6_ADDRSTRLEN] = "";
        int connectsuccess = 1;
        int ipv6compat = 0;

    #ifdef SOCKET_SEND_TIMEOUT
        struct timeval time;
    #endif /* SOCKET_SEND_TIMEOUT */

        nRetries = sInfo->si_nRetries;
        sock->s_hostName = strdup(sInfo->si_hostName);

    #ifdef DEBUG_SOCKET
        LogWrite(LogF, LOG_WARNING, "Socket create client");
        LogWrite(LogF, LOG_WARNING, "Number of retries = %d", nRetries);
    #endif

        ipv6compat = GetIPVer();
        if (ipv6compat == -1) /* ipv6 not supported */
            gotaddr = GetINAddr(sInfo->si_hostName, &sock->s_sAddr.sin_addr);
        else
            gotaddr = GetINAddr6(sInfo->si_hostName, &sock->s_sAddr6.sin6_addr);

        /* translate supplied host name to an internet address */
        if (!gotaddr) {
            /* print this message only once */
            if (sInfo->si_logInfo && (sInfo->si_nRetries == 1)) {
                LogWrite(LogF, LOG_ERR,
                         "unable to resolve ip address for host '%s'", sInfo->si_hostName);
            }
            sock = _SocketDestroy(sock);
        }
        else {
            if (ipv6compat == 1) /* ipv6 supported */
            {
                /* try to print the address in sock->s_sAddr6.sin6_addr to make sure it's good.  from call above */
                LogWrite(LogF, LOG_DEBUG2, "Before call to inet_ntop");
                inet_ntop(AF_INET6, &sock->s_sAddr6.sin6_addr, buf, sizeof(buf));
                LogWrite(LogF, LOG_DEBUG2, "Value of sock->s_sAddr6.sin6_addr from GetINAddr6: %s", buf);

                LogWrite(LogF, LOG_DEBUG2, "Value of sock->s_sAddr6.sin6_scope_id from if_nametoindex: %d", sock->s_sAddr6.sin6_scope_id);

                LogWrite(LogF, LOG_DEBUG2, "Value of sock->s_type: %d", sock->s_type);
            }

            /* try to create the socket nRetries times */
            while (sock && sock->s_id == INVALID_SOCKET) {
                int socketsuccess = FALSE;

                /* create the actual socket */
                if (ipv6compat == -1) /* ipv6 not supported */
                    socketsuccess = sock->s_id = socket(AF_INET, sock->s_type, 0);
                else
                    socketsuccess = sock->s_id = socket(AF_INET6, sock->s_type, 0);

                if (socketsuccess == INVALID_SOCKET) {
                    GETLASTERROR;
                    LogWrite(LogF, LOG_ERR, "unable to create socket: Error %d: %s", errno,
                             strerror(errno));
                    sock = _SocketDestroy(sock);
                }
                else {
                    /* cycle through outbound port range for firewall support */
                    port = sInfo->si_startPortRange;
                    while (!success && port <= sInfo->si_endPortRange) {
                        int bindsuccess = 1;

                        /* bind to outbound port number */
                        if (ipv6compat == -1) /* ipv6 not supported */
                        {
                            sock->s_sourceAddr.sin_port = htons(port);
                            bindsuccess = bind(sock->s_id, (struct sockaddr *) &sock->s_sourceAddr,
                                               sizeof(sock->s_sourceAddr));
                        }
                        else {
                            sock->s_sourceAddr6.sin6_port = htons(port);
                            inet_ntop(AF_INET6, &sock->s_sourceAddr6.sin6_addr, buf, sizeof(buf));
                            LogWrite(LogF, LOG_DEBUG,
                                     "attempting bind to s_sourceAddr6 %s ", buf);

                            bindsuccess = bind(sock->s_id, (struct sockaddr *) &sock->s_sourceAddr6,
                                               sizeof(sock->s_sourceAddr6));
                        }

                        if (bindsuccess == -1) {
                            GETLASTERROR;
                            LogWrite(LogF, LOG_ERR,
                                     "unable to bind port %d to socket: Error %d: %s. Will attempt next port if protomgr port rules configured(EAV_PORTS).", port, errno, strerror(errno));

                            /* if port in use, try next port number */
                            port++;
                        }
                        else {
                            /* only log if outbound port was specified */
                            if (port != 0) {
                                if (sInfo->si_sourcehostName) {
                                    LogWrite(LogF, LOG_DEBUG,
                                             "bound outbound address %s:%d to socket",
                                             sInfo->si_sourcehostName, port);
                                }
                                else {
                                    LogWrite(LogF, LOG_DEBUG,
                                             "bound outbound port %d to socket", port);
                                }
                            }
                            success = TRUE;
                        }
                    }
                }
            }
        }
        return(sock);
    }

The errors we're seeing in our log file look like this. It makes two attempts, and both fail:

protomgr[628453] : ERROR: unable to bind port 0 to socket: Error 98: Address already in use. Will attempt next port if protomgr port rules configured(EAV_PORTS).

protomgr[628453] : ERROR: unable to bind port(s) to socket: Error 98: Address already in use. Consider increase the number of EAV_PORTS if this msg is from protomgr.

protomgr[628453] : ERROR: unable to bind port 0 to socket: Error 98: Address already in use. Will attempt next port if protomgr port rules configured(EAV_PORTS).

protomgr[628453] : ERROR: unable to bind port(s) to socket: Error 98: Address already in use. Consider increase the number of EAV_PORTS if this msg is from protomgr.

So, it looks like this was caused by the system running out of available ports: it was configured to have only about 9000 ports available.
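To make the mechanism concrete, here is a minimal sketch (not the product's code) of what binding to port 0 does: the kernel assigns a free port from its local port range, and getsockname() reports which one it chose. The helper name bind_ephemeral is made up for illustration. On Linux, if bind() with port 0 fails with EADDRINUSE, that usually means no free port was left in the range.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: bind a new IPv4 TCP socket to port 0 and report the
 * port the kernel chose. Returns the socket descriptor (the caller closes
 * it) or -1 with errno set. */
int bind_ephemeral(unsigned short *chosen_port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd == -1)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(0);           /* 0 = let the kernel pick a port */

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == -1 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) == -1) {
        int saved = errno;              /* keep errno across close() */
        close(fd);
        errno = saved;
        return -1;
    }

    *chosen_port = ntohs(addr.sin_port);
    return fd;
}
```

A caller that gets -1 with errno set to EADDRINUSE from this helper has good reason to look at the configured port range rather than at any single port.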

This setting in /etc/sysctl.conf controls the available ports: net.ipv4.ip_local_port_range = 9000 65500
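A rough way to see how many ports a given setting leaves for automatic assignment is to read the live value from /proc/sys/net/ipv4/ip_local_port_range and count the ports between the two bounds, inclusive. The sketch below does that; the function names port_range_size and local_port_range are hypothetical, and the result is an upper bound, since other settings can reserve ports inside the range.

```c
#include <stdio.h>

/* Hypothetical helper: parse "low high" as stored in
 * /proc/sys/net/ipv4/ip_local_port_range. Returns the number of ports in
 * the inclusive range, or -1 if the text cannot be parsed. */
long port_range_size(const char *text, int *low, int *high)
{
    if (sscanf(text, "%d %d", low, high) != 2 || *low > *high)
        return -1;
    return (long)*high - *low + 1;
}

/* Read the live setting from procfs (Linux only). Returns the range size
 * or -1 on error. */
long local_port_range(int *low, int *high)
{
    char line[64];
    FILE *f = fopen("/proc/sys/net/ipv4/ip_local_port_range", "r");

    if (f == NULL)
        return -1;
    if (fgets(line, sizeof(line), f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return port_range_size(line, low, high);
}
```

For the value quoted above, port_range_size("9000 65500", &low, &high) returns 56501.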

The first number is the lowest port and the second is the highest. This example was taken from an unaltered SUSE Linux Enterprise Server 11.0 system. The customer who reported this problem had theirs configured so that only around 9000 ports were available in the range they defined, and all of them were in use on the system.

Hopefully this helps someone else in the future.
