Weird NetworkStream reading issue

Goal: execute the Read() method of NetworkStream / SslStream only when it wouldn't block.

Here's what I have, and it's a workaround:

    /// <summary>
    /// Blocks the current thread until this stream has data available for reading or the token is canceled.
    /// </summary>
    /// <param name="token">The object that allows this operation to be canceled.</param>
    /// <returns>True if data is available to be read, false if canceled.</returns>
    public bool WaitDataAvailable(CancellationToken token) {
        lock (Socket) {
            while (!Socket.Poll(PollingInterval, SelectMode.SelectRead) && !token.IsCancellationRequested) ;
            Thread.Sleep(1); // THIS IS A ONE CRAZY HACK HERE!!! Why is it necessary?
            return !token.IsCancellationRequested && Socket.Connected;
        }
    }

It took me 2 weeks of banging my head against the wall to figure out the Thread.Sleep(1) workaround. Without it, my communication code just died right after establishing the connection and exchanging a couple of initial messages. I couldn't even figure out why. My code was supposed to just receive a message from one endpoint, inspect it and pass it to the other endpoint. Without the 1 ms wait the client just froze. Then, after I killed my app, the client worked normally, as if it had received ALL the data, with no error detected whatsoever.

After a thorough investigation I ruled out all thread synchronization issues in my code. BTW, the code uses blocking operations, only synchronous methods and as few threads as possible. This yielded far better benchmark results than any asynchronous version, since task creation is quite expensive compared to just directly exchanging messages that are mostly shorter than 1 KB.

So my traffic inspection tool is stable, tested and benchmarked, but it drives me crazy that I don't understand WHY it works. Without Thread.Sleep(1) it doesn't. All the data is exchanged properly, but the client app freezes until my app is killed.

I also found out that ca. 5 microseconds of waiting is enough for it to work properly most of the time.

You would probably ask why I didn't just check whether my code enters the blocking read operation and freezes. Heisenberg. Any debugging output like Debug.WriteLine or Console.WriteLine makes it work. Well, sometimes it even works properly when run with Visual Studio Diagnostic Tools, but freezes when run without debugging.

This issue is very hard to reproduce, because it happens randomly and only when no debugging is involved. It was impossible to reproduce using any test code, only with real-world client applications. When I tested it with .NET-based clients I wrote for testing purposes, it never froze.

So, why should I wait before reading the stream, and what am I waiting for?

BTW, the protocol I transmit uses TLV encoding, so I don't read greedily; I read messages. First I read the header telling how long the message will be, then the data, which can be incomplete. If I read less than specified in the header, I wait for more data to complete the message. I noticed that the freezes I described happen only if I get incomplete messages - the ones that need to be read in more than one read operation. But then again - if there's a bug in my message handling, why does it work perfectly when a 1 ms lag is introduced?
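For readers following along, the header-then-data reading described above can be sketched roughly like this. It is only an illustration, not the actual code from the tool: the class and method names are made up, and the framing (1 tag byte plus a 4-byte big-endian length) is a simplifying assumption, since real LDAP uses BER-encoded lengths.

```csharp
using System;
using System.IO;

static class TlvReader {
    // Reads exactly `count` bytes, looping because Stream.Read may return
    // fewer bytes than requested. Returns false if the stream ends first.
    public static bool ReadExact(Stream stream, byte[] buffer, int offset, int count) {
        while (count > 0) {
            int read = stream.Read(buffer, offset, count);
            if (read == 0) return false; // end of stream before the message was complete
            offset += read;
            count -= read;
        }
        return true;
    }

    // Hypothetical framing (an assumption for this sketch):
    // 1 tag byte, 4-byte big-endian length, then `length` bytes of value.
    public static bool TryReadMessage(Stream stream, out byte tag, out byte[] value) {
        tag = 0;
        value = Array.Empty<byte>();
        var header = new byte[5];
        if (!ReadExact(stream, header, 0, header.Length)) return false;
        tag = header[0];
        int length = (header[1] << 24) | (header[2] << 16) | (header[3] << 8) | header[4];
        if (length < 0) throw new InvalidDataException("Negative length in header.");
        value = new byte[length];
        return ReadExact(stream, value, 0, length);
    }
}
```

The point of the sketch is that an incomplete message is handled by calling Read again for the remaining bytes only, never by reading past the end of the current message.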

I understand this problem is very esoteric and probably vaguely described; some would miss the whole source code. But it is what it is. It is not a simple case and cannot be reduced to one. I already tried to test individual parts of the system and found out that this problem exists only in this particular case: TLV message encoding, non-greedy reads, and a real live LDAP client reading a directory tree. If I change anything, for example make the reads greedy (always the maximum available number of bytes) - the issue won't occur. If I try to read the directory with my own .NET application - the issue won't occur. If I insert any debug instructions before reading the stream - the issue won't occur.

The simplest case I tested is using my code as an LDAP proxy and reading a directory tree with LdapAdmin. I know it's not this particular client's fault, because LDP freezes too, but not always.

I hate to say it, but waiting (blocking) on a socket to become available is a terrible, terrible idea. Did you know that you can do a zero-length async read, and it will invoke the callback for you when data is available, without you having to give it a buffer in advance? (Or at least you can share and re-use a static readonly byte[] ZeroLengthBuffer = new byte[0]; ) Then, when data is available, you get a sensible buffer and do some actual reading (sync or async, up to you) - perhaps keeping an eye on .Available on the socket. This zero-length read trick works for either of the Socket async read methods, IIRC - so Socket.BeginReceive or Socket.ReceiveAsync (which, despite the name, is not async in the async / await sense, and may complete synchronously).
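A rough sketch of the zero-length read trick with Socket.BeginReceive might look like the following. Treat it as an illustration under assumptions rather than a drop-in replacement: the class name, the BeginWaitForData helper and its two callbacks are invented for this example, and it assumes that a completed zero-length receive with Available == 0 means the peer closed the connection.

```csharp
using System;
using System.Net.Sockets;

static class ZeroLengthReadSketch {
    // Shared empty buffer, as suggested above; nothing is ever written into it.
    static readonly byte[] ZeroLengthBuffer = new byte[0];

    // Hypothetical helper: posts a zero-length receive and, once it completes,
    // allocates a real buffer and reads whatever is currently available.
    public static void BeginWaitForData(Socket socket, Action<byte[], int> onData, Action onClosed) {
        socket.BeginReceive(ZeroLengthBuffer, 0, 0, SocketFlags.None, ar => {
            try {
                socket.EndReceive(ar); // finish the zero-length receive
                int available = socket.Available;
                if (available == 0) {
                    // Assumption: readable with no bytes pending means the peer closed.
                    onClosed();
                    return;
                }
                var buffer = new byte[available];
                int read = socket.Receive(buffer, 0, buffer.Length, SocketFlags.None);
                onData(buffer, read);
            } catch (SocketException) {
                onClosed(); // connection reset or a similar failure
            } catch (ObjectDisposedException) {
                onClosed(); // socket was closed while the receive was pending
            }
        }, null);
    }
}
```

A caller would invoke BeginWaitForData again from onData to keep listening. No thread is parked in Poll while the connection is idle, and no real buffer is held until there is something to read.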
