TCP/IP accept not returning, but client connect does

Server:
VxWorks 6.3
Calls the usual socket, bind, listen, then:

for (;;)
{
  client = accept(sfd,NULL,NULL);
  // pass client to worker thread
}

Client:
.NET 2.0
Uses the TcpClient constructor that takes a string hostname and an int port to connect to the server, like:

TcpClient client = new TcpClient(server_ip, port);

This works fine when the server is compiled and run on Windows (native C++).

Intermittently, the TcpClient constructor returns the instance without throwing any exception, but the accept call in VxWorks does not return with the client fd. tcpstatShow indicates no accept occurred.

What could possibly make the TcpClient constructor (which calls 'Connect') return the instance while the accept call on the server does not return? It seems to be related to what the system is doing in the background: the symptom is more likely to occur when the server is busy persisting data to flash or an NFS share at the moment the client attempts to connect, but it can also happen when it isn't.

I've tried adjusting the priority of the thread running accept.
I've looked at the size of the queue in 'listen'. There's enough.
The total number of file descriptors available should be enough (I haven't validated this yet though; first thing in the morning).

Is there any chance you could post what is happening on the wire/network?

There could be many reasons, but we won't know unless we get more information from both the server and the client side. Does either side report any errors? A list of TCP/IP errors can be found under Windows Socket Error. On the server side, are you catching any exceptions? Maybe you can try closing the connection (with a linger of 1 second) after it has an error?
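For the linger suggestion, a minimal sketch in POSIX-style C might look like the following. The helper name set_linger_seconds is hypothetical, and the headers are the usual BSD/POSIX ones; VxWorks 6.3 may need its own socket headers instead, so treat that part as an assumption.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: enable SO_LINGER so that close() waits up to
 * `seconds` for unsent data to be delivered before giving up.
 * Returns 0 on success, -1 on failure (errno set by setsockopt). */
int set_linger_seconds(int fd, int seconds)
{
    struct linger lg;

    lg.l_onoff = 1;         /* turn lingering on */
    lg.l_linger = seconds;  /* linger timeout, in seconds */
    return setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof lg);
}
```

You would call set_linger_seconds(client, 1) on the connected socket just before closing it in the error path.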

Is it possible to bind the server to another port and see if it accepts there? If the client returns, it sounds like it's getting an accept from something on your server. I don't know about VxWorks, but on Windows you should generally avoid binding to anything under 1024 (the well-known port range).
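One thing to keep in mind when interpreting the client's result: on typical BSD-style stacks the TCP handshake for a listening socket is completed by the stack itself, so a client's connect can succeed before the server application ever calls accept(). The sketch below shows this on loopback in POSIX-style C. The helper names listen_on_loopback and connect_to_loopback are hypothetical, and whether the VxWorks 6.3 stack behaves the same way is an assumption you would need to verify.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: listen on 127.0.0.1 with an OS-chosen port.
 * Returns the listening fd and stores the port in *port_out, or -1. */
int listen_on_loopback(unsigned short *port_out)
{
    struct sockaddr_in sa;
    socklen_t len = sizeof sa;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    memset(&sa, 0, sizeof sa);
    sa.sin_family = AF_INET;
    sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    sa.sin_port = 0; /* let the OS pick a free port */
    if (bind(fd, (struct sockaddr *)&sa, sizeof sa) < 0 ||
        listen(fd, 5) < 0 ||
        getsockname(fd, (struct sockaddr *)&sa, &len) < 0) {
        close(fd);
        return -1;
    }
    *port_out = ntohs(sa.sin_port);
    return fd;
}

/* Hypothetical helper: blocking connect to 127.0.0.1:port.
 * Returns the connected fd, or -1 on failure. */
int connect_to_loopback(unsigned short port)
{
    struct sockaddr_in sa;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    memset(&sa, 0, sizeof sa);
    sa.sin_family = AF_INET;
    sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    sa.sin_port = htons(port);
    if (connect(fd, (struct sockaddr *)&sa, sizeof sa) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Calling connect_to_loopback right after listen_on_loopback, with no accept() in between, is enough to see the connect succeed; the connection then sits in the listen queue until the server gets around to accepting it.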

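Your server's accept() call is worth a second look. The POSIX accept() call that I know has:

int accept(int sockfd, struct sockaddr *addr, socklen_t *addrlen);

where addr, if non-null, gets written to with the peer's address when the call works. POSIX does allow a null addr when the peer address is not wanted, but I don't know whether the VxWorks 6.3 stack honors that, and one of the failure states for the call is: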

[EFAULT]    The address parameter is not in a writable part of the user address space.

I haven't done Windows socket programming, but I understand Winsock follows the same BSD socket model, and Beej's guide doesn't mention any Windows-specific exceptions for accept(), so this should still apply. Somewhat relevant: the Python accept() call also returns the address (I say somewhat because Python emulates the C networking API where it makes sense).

I would suggest checking errno and calling perror whenever the accept call in the server returns -1, to see whether [EFAULT] is set (it will also tell you if you ran out of descriptors, as errno gets set to [EMFILE] or [ENFILE]).
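A minimal sketch of that check in POSIX-style C, passing a real address buffer instead of NULL, could look like this. The helper name accept_with_diagnostics is hypothetical, and the headers are the standard POSIX ones, which may differ from what VxWorks 6.3 expects.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: accept one connection using a writable address
 * buffer, and print the reason if accept() fails.
 * Returns the client fd, or -1 with errno left as accept() set it. */
int accept_with_diagnostics(int listen_fd)
{
    struct sockaddr_storage peer;     /* large enough for any address family */
    socklen_t peer_len = sizeof peer;
    int client = accept(listen_fd, (struct sockaddr *)&peer, &peer_len);

    if (client < 0) {
        int saved = errno;            /* perror/fprintf may change errno */

        perror("accept");
        if (saved == EMFILE || saved == ENFILE)
            fprintf(stderr, "accept: out of file descriptors\n");
        errno = saved;
    }
    return client;
}
```

In the server loop you would replace the bare accept(sfd, NULL, NULL) with accept_with_diagnostics(sfd) and only hand the result to a worker thread when it is not -1.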

If that doesn't prove to be the issue, use ncat, as either server or client, to investigate further. I'd run it with -vv, since you want to know exactly when connections are made, what's sent, and so on.
