Asynchronous, Non-Blocking Socket Behaviour - WSAEWOULDBLOCK

I have inherited two applications: a Test Harness (a client) running on a Windows 7 PC and a server application running on a Windows 10 PC. I am attempting to communicate between the two using TCP/IP sockets. The Client sends requests (for data in the form of XML) to the Server, and the Server then sends the requested data (also XML) back to the Client.

The setup is as shown below:

       Client                                    Server
--------------------                      --------------------  
|                  |    Sends Requests    |                  |
|   Client Socket  |  ----------------->  |   Server Socket  |
|                  |  <-----------------  |                  |
|                  |      Sends Data      |                  |
--------------------                      --------------------

This process always works on an initial connection (i.e. freshly launched client and server applications). The client has the ability to disconnect from the server, which triggers cleanup of sockets. Upon reconnection, I almost always (not every time, but most of the time) receive the following error:

"Receive() - The socket is marked as nonblocking and the receive operation would block"

This error is displayed at the client, and the socket in question is an asynchronous, non-blocking socket.

The line which causes this SOCKET_ERROR is:

numBytesReceived = theSocket->Receive(theReceiveBuffer, 10000);

where:
- numBytesReceived is an integer (int)
- theSocket is a pointer to a class called CClientSocket, which is a specialisation of CAsyncSocket, part of the MFC C++ library.  This defines the socket object which is embedded within the client.  It is an asynchronous, non-blocking socket.
- Receive() is a virtual function within the CAsyncSocket object
- theReceiveBuffer is a char array (10000 elements)

In executing the line described above, SOCKET_ERROR is returned from the function and calling theSocket->GetLastError() returns WSAEWOULDBLOCK.

SocketTools highlights that:

When a non-blocking (asynchronous) socket attempts to perform an operation that cannot be performed immediately, error 10035 will be returned. This error is not fatal, and should be considered advisory by the application. This error code corresponds to the Windows Sockets error WSAEWOULDBLOCK.

When reading data from a non-blocking socket, this error will be returned if there is no more data available to be read at that time. In this case, the application should wait for the OnRead event to fire which indicates that more data has become available to read. The IsReadable property can be used to determine if there is data that can be read from the socket.

When writing data to a non-blocking socket, this error will be returned if the local socket buffers are filled while waiting for the remote host to read some of the data. When buffer space becomes available, the OnWrite event will fire which indicates that more data can be written. The IsWritable property can be used to determine if data can be written to the socket.

It is important to note that the application will not know how much data can be sent in a single write operation, so it is possible that if the client attempts to send too much data too quickly, this error may be returned multiple times. If this error occurs frequently when sending data it may indicate high network latency or the inability for the remote host to read the data fast enough.

I am consistently getting this error and failing to receive anything on the socket.

Using Wireshark, the following communications occur, with the source, destination and TCP bit flags presented here:

Event: Connect Test Harness to Server via TCP/IP

Client --> Server: SYN
Server --> Client: SYN, ACK
Client --> Server: ACK

This appears to be correct and represents the Three-Way Handshake of connecting.

SocketSniff confirms that a Socket is closed on the client side.  It was not possible to get SocketSniff to work with the Windows 10 Server application.

Event: Send a Request for Data from the Test Harness

Client --> Server: PSH, ACK
Server --> Client: PSH, ACK
Client --> Server: ACK

Both the request data and the received data are confirmed to be exchanged successfully.

Event: Disconnect Test Harness from Server

Client --> Server: FIN, ACK
Server --> Client: ACK
Server --> Client: FIN, ACK
Client --> Server: ACK

This appears to be correct and represents the Four-Way handshake of connection closure.

SocketSniff confirms that a Socket is closed on the client side.  It was not possible to get SocketSniff to work with the Windows 10 Server application.

Event: Reconnect Test Harness to Server via TCP/IP

Client --> Server: SYN
Server --> Client: SYN, ACK
Client --> Server: ACK

This appears to be correct and represents the Three-Way Handshake of connecting.

SocketSniff confirms that a new Socket is opened on the client side.  It was not possible to get SocketSniff to work with the Windows 10 Server application.

Event: Send a Request for Data from the Test Harness

Client --> Server: PSH, ACK
Server --> Client: ACK

We see no data being pushed (PSH) back to the client, yet we do see an acknowledgement.  

Has anyone got any ideas what may be going on here? I understand it would be difficult for you to diagnose without seeing the source code, however I was hoping others may have had experience with this error and could point me down the specific route to investigate.

More Info:

The Server initialises a listening thread and binds to 0.0.0.0:49720. The 'WSAStartup()', 'bind()' and 'listen()' functions all return '0', indicating success. This thread persists throughout the lifetime of the server application.

The Server initialises two threads, a read and a write thread. The read thread is responsible for reading request data off its socket and is initialised as follows with a class called Connection:

HANDLE theConnectionReadThread 
           = CreateThread(NULL,                                    // Security Attributes
                          0,                                       // Default Stacksize
                          Connection::connectionReadThreadHandler, // Callback
                          (LPVOID)this,                            // Parameter to pass to thread
                          CREATE_SUSPENDED,                        // Don't start yet
                          NULL);                                   // Don't Save Thread ID

The write thread is initialised in a similar way.

In each case, the CreateThread() function returns a suitable HANDLE, e.g.

theConnectionReadThread  = 00000570
theConnectionWriteThread = 00000574  

The threads actually get started within the following function:

void Connection::startThreads()
{
    ResumeThread(theConnectionReadThread);
    ResumeThread(theConnectionWriteThread);
}                                   

This function is called from within another class called ConnectionManager, which manages all the possible connections to the server. In this case, I am only concerned with a single connection, for simplicity.

Adding text output to the server application reveals that I can successfully connect/disconnect the client and server several times before the faulty behaviour is observed. For example, within the connectionReadThreadHandler() and connectionWriteThreadHandler() functions, I am outputting text to a log file as soon as they execute.

When correct behaviour is observed, the following lines are output to the log file:

Connection::ResumeThread(theConnectionReadThread) returned 1
Connection::ResumeThread(theConnectionWriteThread) returned 1
ConnectionReadThreadHandler() Beginning
ConnectionWriteThreadHandler() Beginning

When faulty behaviour is observed, the following lines are output to the log file:

Connection::ResumeThread(theConnectionReadThread) returned 1
Connection::ResumeThread(theConnectionWriteThread) returned 1

The callback functions do not appear to be invoked.
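As a cross-check on the log lines above: per the Win32 documentation, ResumeThread() returns the thread's previous suspend count, or (DWORD)-1 on failure. A return of 1 therefore means the thread was suspended once and has now been released to run. The sketch below encodes that reading as a small portable helper; the names ResumeOutcome and classifyResumeResult are hypothetical and are not part of the Win32 API or of the code in this question.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not a Win32 API): interprets the value returned by
// ResumeThread(), which is the thread's PREVIOUS suspend count.
enum class ResumeOutcome {
    Failed,          // ResumeThread returned (DWORD)-1
    WasNotSuspended, // previous count 0: thread was already runnable
    NowRunning,      // previous count 1: count dropped to 0, thread resumes
    StillSuspended   // previous count > 1: further ResumeThread calls needed
};

// DWORD is a 32-bit unsigned integer; std::uint32_t models it here so the
// sketch compiles without <windows.h>.
inline ResumeOutcome classifyResumeResult(std::uint32_t previousSuspendCount)
{
    constexpr std::uint32_t kFailure = 0xFFFFFFFFu; // (DWORD)-1
    if (previousSuspendCount == kFailure) return ResumeOutcome::Failed;
    if (previousSuspendCount == 0)        return ResumeOutcome::WasNotSuspended;
    if (previousSuspendCount == 1)        return ResumeOutcome::NowRunning;
    return ResumeOutcome::StillSuspended;
}
```

By this reading, both logged calls succeeded and released their threads, so the missing "Beginning" lines are more likely explained by what happens to the threads after they are released than by the resume calls themselves.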

It is at this point that the error is displayed on the client indicating that:

"Receive() - The socket is marked as nonblocking and the receive operation would block"

On the Client side, I've got a class called CClientDoc, which contains the client side socket code. It first initialises theSocket, which is the socket object embedded within the client:

private:
    CClientSocket* theSocket = new CClientSocket;

When a connection is initialised between client and server, this class calls a function called CreateSocket(), part of which is included below, along with ancillary functions which it calls:

void CClientDoc::CreateSocket()
{
    AfxSocketInit();
    int lastError;
    theSocket->Init(this);

    if (theSocket->Create()) // Calls CAsyncSocket::Create() (part of afxsock.h)
    {
        theErrorMessage = "Socket Creation Successful"; // this is a CString
        theSocket->SetSocketStatus(WAITING);             
    }
    else
    {
        // We don't fall in here
    }
}

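// Called as theSocket->Init(this), so this is a member of CClientSocket
void CClientSocket::Init(CClientDoc* pDoc)
{
    pClient = pDoc; // pClient is a pointer to a CClientDoc
}

// Called as theSocket->SetSocketStatus(WAITING), so this is a member of CClientSocket
void CClientSocket::SetSocketStatus(SOCKET_STATUS sock_stat)
{
    theSocketStatus = sock_stat; // theSocketStatus is a private member of CClientSocket of type SOCKET_STATUS
}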

Immediately after CreateSocket(), SetupSocket() is called, which is also provided here:

void CClientDoc::SetupSocket()
{
    theSocket->AsyncSelect(); // Function within afxsock.h
}

Upon disconnection of the client from the server, the following handler runs:

void CClientDoc::OnClienDisconnect()
{
    theSocket->ShutDown(2); // Inline function within afxsock.inl
    delete theSocket;
    theSocket = new CClientSocket;
    CreateSocket();
    SetupSocket();        
}

So we delete the current socket and then create a new one, ready for use, which appears to work as expected.

The error is being written on the Client within the DoReceive() function. This function calls the socket to attempt to read in a message.

void CClientDoc::DoReceive()
{
    int lastError;
    switch (numBytesReceived = theSocket->Receive(theReceiveBuffer, 10000))
    {
    case 0:
        // We don't fall in here
        break;
    case SOCKET_ERROR: // We come in here when the faulty behaviour occurs
        // Parentheses matter: without them, == binds tighter than = and
        // lastError would hold the comparison result (1 or 0), not the error code.
        if ((lastError = theSocket->GetLastError()) == WSAEWOULDBLOCK)
        {
            theErrorMessage = "Receive() - The socket is marked as nonblocking and the receive operation would block";
        }
        else
        {
            // We don't fall in here
        }
        break;
    default:
        // When connection works, we come in here
        break;
    }
}
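One detail worth checking in a handler like DoReceive(): in C++, == binds more tightly than =, so a condition of the form lastError = GetLastError() == WSAEWOULDBLOCK stores the result of the comparison in lastError rather than the error code. The sketch below shows the difference with plain integers so that it compiles without Winsock; kWouldBlock and kConnReset are stand-ins carrying the documented values of WSAEWOULDBLOCK and WSAECONNRESET, and both function names are hypothetical.

```cpp
#include <cassert>

// Stand-ins for the Winsock constants so the sketch compiles anywhere.
constexpr int kWouldBlock = 10035; // value of WSAEWOULDBLOCK
constexpr int kConnReset  = 10054; // value of WSAECONNRESET

// Unparenthesised form: parsed as lastError = (code == kWouldBlock),
// so lastError ends up as 1 or 0 instead of the error code.
inline int captureWithoutParens(int code)
{
    int lastError = 0;
    // Same expression as in the if-condition, written as a statement here
    // to avoid compiler warnings about assignment inside a condition.
    lastError = code == kWouldBlock;
    return lastError;
}

// Parenthesised form: the assignment happens first, then the comparison.
inline int captureWithParens(int code)
{
    int lastError = 0;
    const bool wouldBlock = ((lastError = code) == kWouldBlock);
    (void)wouldBlock; // the branch decision is the same in both forms
    return lastError;
}
```

The branch taken is identical either way, so this does not by itself explain the missing data. It only matters if lastError is later logged or inspected, where a value of 0 or 1 would be misleading.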

Hopefully the addition of some of the code proves insightful. I should be able to add a bit more if needed.

Thanks

The WSAEWOULDBLOCK error DOES NOT mean the socket is marked as blocking. It means the socket is marked as non-blocking and there is NO DATA TO READ at that time.

WSAEWOULDBLOCK means the socket WOULD HAVE blocked the calling thread waiting for data if the socket HAD BEEN marked as blocking.

To know when a non-blocking socket has data waiting to be read, use Winsock's select() function, or the CClientSocket::AsyncSelect() method to request FD_READ notifications, or another equivalent. Don't try to read until there is something to read.
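A minimal sketch of that advice, assuming the read is attempted once per read notification (for example from an OnReceive() override) and that WSAEWOULDBLOCK is treated as "nothing yet" rather than as a failure. To keep it runnable without Winsock or MFC, the receive and last-error calls are injected as callables; ReadResult, readOnce, recvFn and lastErrorFn are hypothetical names, not MFC or Winsock APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

constexpr int kSocketError = -1;    // value of SOCKET_ERROR
constexpr int kWouldBlock  = 10035; // value of WSAEWOULDBLOCK

enum class ReadResult { GotData, NothingYet, PeerClosed, Failed };

// Hypothetical helper: performs ONE receive attempt and classifies the result.
// recvFn stands in for Receive(buffer, length) and lastErrorFn for
// GetLastError(); received bytes are appended to `pending`.
inline ReadResult readOnce(const std::function<int(char*, int)>& recvFn,
                           const std::function<int()>& lastErrorFn,
                           std::string& pending)
{
    char buffer[10000];
    const int n = recvFn(buffer, static_cast<int>(sizeof buffer));
    if (n > 0) {
        // Keep whatever arrived; a reply may span several notifications.
        pending.append(buffer, static_cast<std::size_t>(n));
        return ReadResult::GotData;
    }
    if (n == 0) {
        return ReadResult::PeerClosed; // orderly shutdown by the peer
    }
    // Error return: would-block is advisory, anything else is a real error.
    return lastErrorFn() == kWouldBlock ? ReadResult::NothingYet
                                        : ReadResult::Failed;
}
```

With this shape, a NothingYet result simply means "return and wait for the next notification"; only Failed warrants an error message.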

In your analysis, you see the client sending data to the server, but the server is not sending data to the client. So you clearly have a logic bug somewhere in your code that you need to find and fix. Either the client is not terminating its request correctly, or the server is not receiving/processing/replying to it correctly. But since you did not show your actual code, we can't tell you what is actually wrong with it.
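On the point about terminating requests: TCP delivers a byte stream, not discrete messages, so the receiver needs some rule for deciding where one request ends. The sketch below assumes a single terminator character marks the end of each message; the question does not say how the XML requests are actually delimited, so the terminator and the name extractMessages are purely illustrative.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: removes every complete (terminated) message from
// `pending` and returns them in order. An unterminated tail stays in
// `pending` until more bytes arrive.
inline std::vector<std::string> extractMessages(std::string& pending,
                                                char terminator)
{
    std::vector<std::string> messages;
    std::string::size_type end;
    while ((end = pending.find(terminator)) != std::string::npos) {
        messages.push_back(pending.substr(0, end)); // message without terminator
        pending.erase(0, end + 1);                  // drop message and terminator
    }
    return messages;
}
```

If the server waits for a terminator like this and the client omits it after reconnecting, the server would acknowledge the bytes at the TCP level without ever replying, which would be consistent with the ACK-only capture; whether that is what happens here cannot be determined from the code shown.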
