
Weird NetworkStream reading issue

Goal: call the Read() method of a NetworkStream / SslStream only when it won't block.

Here's what I have, and it's a workaround:

    /// <summary>
    /// Blocks the current thread until this stream has data available for reading or the token is canceled.
    /// </summary>
    /// <param name="token">The object that allows this operation to be canceled.</param>
    /// <returns>True if data is available to be read, false if canceled.</returns>
    public bool WaitDataAvailable(CancellationToken token) {
        lock (Socket) {
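            // Socket.Poll takes its timeout in microseconds; spin until the socket is readable or the token is canceled.
            while (!Socket.Poll(PollingInterval, SelectMode.SelectRead) && !token.IsCancellationRequested) { }
            Thread.Sleep(1); // THIS IS ONE CRAZY HACK HERE!!! Why is it necessary?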
            return !token.IsCancellationRequested && Socket.Connected;
        }
    }

It took me two weeks of banging my head against the wall to find the Thread.Sleep(1) workaround. Without it, my communication code simply died right after establishing the connection and exchanging a couple of initial messages, and I couldn't figure out why. My code is supposed to just receive a message from one endpoint, inspect it and pass it to the other endpoint. Without the 1 ms wait, the client just froze. Then, once I killed my app, the client carried on normally, as if it had received ALL the data, with no error detected whatsoever.

After a thorough investigation I ruled out all thread synchronization issues in my code. BTW, the code uses blocking operations, only synchronous methods and as few threads as possible. This yielded far better benchmark results than any asynchronous version, since task creation is quite expensive compared to directly exchanging messages that are mostly shorter than 1 KB.

So my traffic inspection tool is stable, tested and benchmarked. However, it drives me crazy that I don't understand WHY it works. Without Thread.Sleep(1) it doesn't. All the data is exchanged properly, but the client app freezes until my app is killed.

I also found out that about 5 microseconds of waiting is enough for it to work properly most of the time.

You are probably wondering why I didn't just check whether my code enters the blocking read operation and freezes there. Heisenberg. Any debugging output like Debug.WriteLine or Console.WriteLine makes it work. Sometimes it even works properly when run with Visual Studio Diagnostic Tools, but freezes when run without debugging.

This issue is very hard to reproduce, because it happens randomly and only when no debugging is involved. It was impossible to reproduce using any test code, only with real-world client applications. When I tested it with .NET-based clients I wrote for testing purposes, it never froze.

So, why should I wait before reading the stream and what am I waiting for?

BTW, the protocol I transmit uses TLV encoding, so I don't read greedily. I read messages: first the header telling how long the message will be, then the data. The data can arrive incomplete: if I read less than the header specifies, I wait for more data to complete the message. I noticed that the freezes I described happen only when I get incomplete messages, the ones that need more than one read operation. But then again, if there's a bug in my message handling, why does it work perfectly when a 1 ms lag is introduced?
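For illustration only, the "keep reading until the message is complete" loop described above might be sketched like this. ReadFully is a hypothetical helper name, not the actual code from the tool, and the TLV header parsing itself is left out because its exact layout is not shown here.

```csharp
using System;
using System.IO;

public static class StreamReadHelpers {
    /// <summary>
    /// Hypothetical helper: blocks until exactly <paramref name="count"/> bytes
    /// have been read into <paramref name="buffer"/>, looping over partial reads.
    /// </summary>
    public static void ReadFully(this Stream stream, byte[] buffer, int offset, int count) {
        int total = 0;
        while (total < count) {
            // Read may return fewer bytes than requested; it returns 0 only at end of stream.
            int read = stream.Read(buffer, offset + total, count - total);
            if (read == 0)
                throw new EndOfStreamException("Stream ended before the whole message was read.");
            total += read;
        }
    }
}
```

The idea would be to call it once for the header bytes and once more for the payload length the header announces. On .NET 7 and later, the built-in Stream.ReadExactly should cover the same need.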

I understand this problem is very esoteric and probably vaguely described, and some will miss the full source code. But it is what it is. It is not a simple case and cannot be reduced to one. I already tried testing individual parts of the system and found that the problem exists only in this particular case: TLV message encoding, non-greedy reads, and a real live LDAP client reading the directory tree. If I change anything, for example making the reads greedy (always the maximum available number of bytes), the issue won't occur. If I read the directory with my own .NET application, the issue won't occur. If I insert any debug instructions before reading the stream, the issue won't occur.

The simplest case I tested is using my code as an LDAP proxy and reading the directory tree with LdapAdmin. I know it's not this particular client's fault, because LDP freezes too, though not always.

I hate to say it, but waiting (blocking) on a socket to become available is a terrible terrible idea. Did you know that you can do a zero-length async read, and it will invoke the callback for you when data is available, without you having to give it a buffer in advance? (or at least you can share and re-use a static readonly byte[] ZeroLengthBuffer = new byte[0]; ) Then when data is available, you get a sensible buffer and do some actual reading (sync or async, up to you) - perhaps keeping an eye on .Available on the socket. This zero-length read trick works for either of the Socket async read methods, IIRC - so Socket.BeginReceive or Socket.ReceiveAsync (which despite the name, is not async in the async / await sense, and may complete synchronously).
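A minimal sketch of the zero-length read trick described above, using Socket.BeginReceive over a loopback connection. NotifyWhenReadable and ZeroLengthReadDemo are hypothetical names made up for this illustration, and the sketch assumes the zero-length receive completes only once data is available (or the peer has closed the connection), as the answer states.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Text;
using System.Threading;

public static class ZeroLengthReadDemo {
    // Shared empty buffer: a zero-length receive never writes into it.
    private static readonly byte[] ZeroLengthBuffer = new byte[0];

    // Hypothetical helper: invokes onReadable once the socket has data
    // (or the peer has closed the connection), without consuming any bytes.
    public static void NotifyWhenReadable(Socket socket, Action onReadable) {
        socket.BeginReceive(ZeroLengthBuffer, 0, 0, SocketFlags.None, ar => {
            socket.EndReceive(ar); // completes the zero-length receive; throws on socket errors
            onReadable();
        }, null);
    }

    // Sends "ping" over loopback and reads it only after the notification fires.
    public static string Run() {
        var listener = new TcpListener(IPAddress.Loopback, 0); // port 0: let the OS pick a free port
        listener.Start();
        try {
            int port = ((IPEndPoint)listener.LocalEndpoint).Port;
            using (var client = new TcpClient())
            using (var readable = new ManualResetEventSlim()) {
                client.Connect(IPAddress.Loopback, port);
                using (Socket server = listener.AcceptSocket()) {
                    NotifyWhenReadable(server, readable.Set);

                    byte[] message = Encoding.ASCII.GetBytes("ping");
                    client.GetStream().Write(message, 0, message.Length);

                    if (!readable.Wait(TimeSpan.FromSeconds(5)))
                        throw new TimeoutException("No readiness notification received.");

                    // Only now allocate a real buffer and do the actual read.
                    var buffer = new byte[1024];
                    int read = server.Receive(buffer);
                    return Encoding.ASCII.GetString(buffer, 0, read);
                }
            }
        } finally {
            listener.Stop();
        }
    }
}
```

Calling ZeroLengthReadDemo.Run() should hand back the payload sent by the client. In a real proxy the callback would trigger the message read instead of setting an event, so no thread has to sit in a polling loop.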
