
TCP connections are stopped in ASP.NET + Azure

I have a real head-scratcher here (for me).

I have the following setup:

  • Kubernetes cluster in Azure (Linux VMs)
  • ASP.NET Docker image with a TCP server
  • Software simulating TCP clients
  • RabbitMQ for notifying about incoming messages

Peer behaviour:

  • The client sends its heartbeat every 10 minutes
  • The server sends a keep-alive every 5 minutes (nginx-ingress kills connections after they have been idle for ~10 minutes)

I am testing the performance of my new TCP server. The previous one, written in Java, could easily handle the load I am about to describe. For some reason, the new TCP server, written in C#, loses its connections after about 10-15 minutes.

Here is what I do:

  • Use the simulator to start 500 clients with a ramp-up of 300 s
  • All connections are established correctly
  • Most of the time, the first heartbeats and keep-alives are sent and received
  • After 10+ minutes, I receive 0 bytes from Stream.EndRead() on BOTH ends of the connection

This is the piece of code that is triggering the error.

var numberOfBytesRead = Stream.EndRead(result);
if (numberOfBytesRead == 0)
{
    This.Close("no bytes read").Sync(); //this is where I end up
    return;
}
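
For context on the check above: a read that completes with 0 bytes is how a .NET stream reports that the other side closed the connection in an orderly way. Seeing it on both ends at once would be consistent with something in between (nginx-ingress or the load balancer, for instance) closing both legs, though that is only a guess. The following is a minimal loopback sketch of the zero-byte read itself. The class and method names are hypothetical, and it uses ReadAsync rather than BeginRead/EndRead purely for brevity.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Threading.Tasks;

// Hypothetical demo class, not part of the original server code.
public static class ZeroByteReadDemo
{
    // Returns what the server-side read reports after the client
    // closes its end of the connection without sending any data.
    public static async Task<int> ReadAfterPeerCloseAsync()
    {
        var listener = new TcpListener(IPAddress.Loopback, 0); // port 0: let the OS pick a free port
        listener.Start();
        try
        {
            var port = ((IPEndPoint)listener.LocalEndpoint).Port;

            using var client = new TcpClient();
            await client.ConnectAsync(IPAddress.Loopback, port);
            using var serverSide = await listener.AcceptTcpClientAsync();

            client.Close(); // orderly close of the client end

            var buffer = new byte[1024];
            // With no pending data and the peer closed, the read completes with 0.
            return await serverSide.GetStream().ReadAsync(buffer, 0, buffer.Length);
        }
        finally
        {
            listener.Stop();
        }
    }

    public static async Task Main()
    {
        var read = await ReadAfterPeerCloseAsync();
        Console.WriteLine($"bytes read after peer close: {read}");
    }
}
```

So the 'no bytes read' branch is a symptom of a close that happened elsewhere, not the cause of the disconnect.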

In the server-side logs, I see lots of disconnected ('no bytes read') lines and a lot of exceptions that suggest RabbitMQ is too busy: None of the specified endpoints were reachable.

My guess would be that the Azure Load Balancer just bounces the connections, but that does not happen with the Java TCP server. Alternatively, the ASP.NET environment may be missing some configuration.

Does anyone know why this is happening and, more importantly, how to fix it?

--UPDATE #1--

I just used 250 devices and that worked perfectly.
I halved the ramp-up and the problem came back. So this seems to be a performance issue: a component in my chain is too busy.

--UPDATE #2--

I disabled the publishing to RabbitMQ and it kept working. Now I have to fix the RabbitMQ performance.

I ended up processing the incoming data in a new Task. This is my code now:

public void ReceiveAsyncLoop(IAsyncResult? result = null)
{
    try
    {
        if (result != null)
        {
            var numberOfBytesRead = Stream.EndRead(result);
            if (numberOfBytesRead == 0)
            {
                This.Close("no bytes read").Sync();
                return;
            }

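            // Copy the received bytes: the shared Buffer is handed to BeginRead again below,
            // so the next read could overwrite it before the task has run.
            var received = new ArraySegment<byte>(Buffer.Array!, Buffer.Offset, numberOfBytesRead).ToArray();
            var newSegment = new ArraySegment<byte>(received);

            // This.OnDataReceived(newSegment); <-- previously this
            Task.Run(() => This.OnDataReceived(newSegment));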
        }

        Stream.BeginRead(Buffer.Array!, Buffer.Offset, Buffer.Count, ReadingClient.ReceiveAsyncLoop, null);
    }
    catch (ObjectDisposedException) { /*ILB*/ } 
    catch (Exception ex)
    {
        Log.Exception(ex, $"000001: {ex.Message}");
    }
}

Now, everything is super fast.
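
One thing to watch with handing each chunk to Task.Run: the tasks are not guaranteed to run in the order the data arrived, and a segment that points into the shared read buffer can be overwritten by the next read before its task runs. If ordering matters for the protocol, one option is a single consumer fed through a queue. The following is a minimal sketch of that idea using System.Threading.Channels. OrderedReceiveQueue, Enqueue, CompleteAsync, and the demo are hypothetical names for illustration, not part of the code above.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical helper: keeps the read loop fast while processing chunks in arrival order.
public sealed class OrderedReceiveQueue
{
    private readonly Channel<byte[]> _channel = Channel.CreateUnbounded<byte[]>(
        new UnboundedChannelOptions { SingleReader = true });
    private readonly Task _consumer;

    public OrderedReceiveQueue(Action<byte[]> onDataReceived)
    {
        // A single consumer drains the channel, so chunks are handled one at a time, in order.
        // Note: an exception thrown by onDataReceived ends this loop and surfaces from CompleteAsync.
        _consumer = Task.Run(async () =>
        {
            await foreach (var chunk in _channel.Reader.ReadAllAsync())
            {
                onDataReceived(chunk);
            }
        });
    }

    // Intended to be called from the read callback. Copies the bytes so the
    // caller can reuse its buffer for the next read immediately.
    public void Enqueue(ArraySegment<byte> received)
    {
        _channel.Writer.TryWrite(received.ToArray());
    }

    // Signals that no more data will arrive and waits for the consumer to drain the queue.
    public Task CompleteAsync()
    {
        _channel.Writer.Complete();
        return _consumer;
    }
}

public static class OrderedReceiveQueueDemo
{
    public static async Task<List<string>> RunAsync()
    {
        var seen = new List<string>();
        var queue = new OrderedReceiveQueue(chunk => seen.Add(Encoding.ASCII.GetString(chunk)));

        var shared = new byte[4]; // stands in for the reused read buffer
        foreach (var message in new[] { "ping", "pong" })
        {
            Encoding.ASCII.GetBytes(message, 0, message.Length, shared, 0);
            queue.Enqueue(new ArraySegment<byte>(shared, 0, message.Length));
        }

        await queue.CompleteAsync();
        return seen;
    }

    public static async Task Main()
    {
        Console.WriteLine(string.Join(", ", await RunAsync())); // prints "ping, pong"
    }
}
```

The channel here is unbounded, so a slow consumer (a busy RabbitMQ publisher, say) would show up as growing memory rather than dropped connections. A bounded channel would apply back-pressure instead, at the cost of slowing the read loop again.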
