
Odd TCP deadlock under Windows

We are moving large amounts of data across a LAN, and it has to happen very rapidly and reliably. Currently we use Windows TCP from C++. Using large (synchronous) sends moves the data much faster than many smaller (synchronous) sends, but the transfer frequently stalls for long stretches (about .15 seconds), causing the overall transfer rate to plummet. This deadlock happens under very particular circumstances, which makes me believe it should be preventable altogether. More importantly, if we don't know the cause, we can't be sure it won't eventually happen with smaller sends anyway. Can anyone explain this deadlock?
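For context, here is a minimal sketch (not the poster's actual code; the name is illustrative) of what a large synchronous send loop like the one described might look like with Winsock:

```cpp
#include <winsock2.h>
#pragma comment(lib, "ws2_32.lib")

// Loop until the whole message has been handed to TCP; send() may accept
// fewer bytes than requested, so one large "synchronous send" is really
// this loop around the blocking send() call.
bool sendAll(SOCKET s, const char* data, int len)
{
    int sent = 0;
    while (sent < len)
    {
        int n = send(s, data + sent, len - sent, 0);
        if (n == SOCKET_ERROR)
            return false;   // caller can inspect WSAGetLastError()
        sent += n;
    }
    return true;
}
```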

Deadlock description (OK, zombie-locked; it isn't dead, but for about .15 seconds it stops, then starts again):

  1. The receiving side sends an ACK.
  2. The sending side sends a packet containing the end of a message (the push flag is set).
  3. The call to socket.recv takes about .15 seconds(!) to return.
  4. About the time that call returns, an ACK is sent by the receiving side.
  5. The next packet from the sender is finally sent (why was it waiting? The TCP window is plenty big).

The odd thing about (3) is that typically that call doesn't take much time at all and receives exactly the same amount of data. On a 2 GHz machine, that's 300 million instructions' worth of time. I am assuming the call doesn't (heaven forbid) wait for the received data to be ACKed before it returns, so the ACK must be waiting for the call to return, or both must be delayed by something else.
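As a point of reference, here is one way those socket.recv return times could be captured (the poster logged them to a text file; the std::chrono instrumentation shown here is an assumption, not their code):

```cpp
#include <winsock2.h>
#include <chrono>
#include <cstdio>

// Wrap the blocking recv() call and log any call that stalls, so the
// ~.15 second pauses stand out in the log file.
int timedRecv(SOCKET s, char* buf, int len, std::FILE* log)
{
    auto t0 = std::chrono::steady_clock::now();
    int n = recv(s, buf, len, 0);
    auto t1 = std::chrono::steady_clock::now();
    double ms = std::chrono::duration<double, std::milli>(t1 - t0).count();
    if (ms > 100.0)
        std::fprintf(log, "recv returned %d bytes after %.1f ms\n", n, ms);
    return n;
}
```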

The problem NEVER happens when a second packet of data (part of the same message) arrives between 1 and 2. That makes it sound very much like it has to do with the fact that Windows TCP will not send back a data-less ACK until either a second packet arrives or a 200 ms timer expires. However, the delay is less than 200 ms (it's more like 150 ms).
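If the receiver's delayed-ACK timer is the suspect, one experiment (not something tried in the question) is to ask Windows to ACK every segment on that socket. The SIO_TCP_SET_ACK_FREQUENCY control code exists in mstcpip.h but is thinly documented and version-dependent, so treat this as a hedged suggestion:

```cpp
#include <winsock2.h>
#include <mstcpip.h>

// Request an ACK for every segment (frequency 1), which effectively
// disables delayed ACK on this socket. Check the return value; not all
// Windows versions support this control code.
bool disableDelayedAck(SOCKET s)
{
    DWORD freq = 1;
    DWORD bytesReturned = 0;
    return WSAIoctl(s, SIO_TCP_SET_ACK_FREQUENCY,
                    &freq, sizeof(freq),
                    nullptr, 0, &bytesReturned,
                    nullptr, nullptr) == 0;
}
```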

The third unseemly character (and to my mind the real culprit) is (5). Send is definitely being called well before that .15 seconds is up, but the data NEVER hits the wire before that ACK comes back. That is the most bizarre part of this deadlock to me. It's not a TCP blockage, because the TCP window is plenty big: we set SO_RCVBUF to something like 500 * 1460 (which is still under a meg). The data is coming in very fast (basically there is a loop spinning out data via send), so the buffer should fill almost immediately. MSDN mentions that various "heuristics" are used in deciding when a send hits the wire, and that an already-pending send plus a full buffer will cause send to block until the data hits the wire (otherwise send apparently just copies data into the TCP send buffer and returns).
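For reference, the receive-buffer sizing described above might be set like this (a sketch; note that SO_RCVBUF generally needs to be set before the connection is established if TCP window scaling is to be negotiated):

```cpp
#include <winsock2.h>

// Size the receive buffer (and hence the advertised TCP window) to
// roughly 730,000 bytes, as in the question.
void setReceiveWindow(SOCKET s)
{
    int rcvbuf = 500 * 1460;
    setsockopt(s, SOL_SOCKET, SO_RCVBUF,
               reinterpret_cast<const char*>(&rcvbuf), sizeof(rcvbuf));
}
```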

Anyway, why the sender doesn't actually send more data during that .15 second pause is the most bizarre part to me. The information above was captured on the receiving side via Wireshark (except of course the socket.recv return times, which were logged to a text file). We tried changing the send buffer to zero and turning off Nagle on the sender (yes, I know Nagle is about not sending small packets, but we tried turning it off in case it was part of the unstated "heuristics" affecting whether the message would be posted to the wire; technically Microsoft's Nagle means a small packet isn't sent if the buffer is full and there is an outstanding ACK, so it seemed like a possibility).
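The sender-side experiments just described might look like this (a sketch, with error handling omitted):

```cpp
#include <winsock2.h>

void applySenderExperiments(SOCKET s)
{
    // A zero-byte send buffer makes winsock transmit directly from the
    // caller's buffer instead of copying into an intermediate buffer.
    int sndbuf = 0;
    setsockopt(s, SOL_SOCKET, SO_SNDBUF,
               reinterpret_cast<const char*>(&sndbuf), sizeof(sndbuf));

    // Disable the Nagle algorithm.
    BOOL nodelay = TRUE;
    setsockopt(s, IPPROTO_TCP, TCP_NODELAY,
               reinterpret_cast<const char*>(&nodelay), sizeof(nodelay));
}
```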

The send blocking until the previous ACK is received almost certainly indicates that the TCP receive window is full (you can check this by using Wireshark to analyse the network traffic).

No matter how big your TCP window is, if the receiving application isn't processing data as fast as it's arriving, then the TCP window will eventually fill up. How fast are we talking here? What is the receiving side doing with the data? (If you're writing the received data to disk, then it's quite possible that your disk simply can't keep up with a gigabit network at full bore.)


OK, so you have a 730,000-byte receive window and you're streaming data at 480 Mbps. That means it takes only about 12 ms to entirely fill your window, so when the 150 ms delay on the receive side occurs, the receive window fills almost instantly and causes the sender to stall.
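Spelling out the arithmetic: 500 × 1460 = 730,000 bytes = 5,840,000 bits, and 5,840,000 bits ÷ 480,000,000 bits/s ≈ 0.012 s, i.e. roughly 12 ms to fill the window from empty.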

So your root cause is this 150 ms delay in scheduling your receive process. Any number of things could cause that (it could be as simple as the kernel needing to flush dirty pages to disk to create more free pages for your application); you could try increasing your process's scheduling priority, but there's no guarantee that will help.
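If you do want to try the priority route, the relevant Win32 calls are sketched below; whether it helps depends entirely on what is actually stealing those 150 ms:

```cpp
#include <windows.h>

// Raise the priority class of the process and the priority of the
// receiving thread; less aggressive values than these are also possible.
void raiseReceivePriority()
{
    SetPriorityClass(GetCurrentProcess(), HIGH_PRIORITY_CLASS);
    SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_HIGHEST);
}
```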
