
Performance issues when using the raw TCP API of lwIP

I use lwIP to add networking functionality to my system. On my platform I built a buffer that I want to send every time it is full, which can happen quite quickly. The system is directly connected to a switch in a private LAN. Initially the sending of data had a very large time gap of 2 seconds between packets. Additionally, the packets had a size of 720 bytes, if my memory serves me correctly. The buffer currently has a capacity of about 20000 bytes, and I might decide to increase this in the future. The network runs at 100 Mbit and I would like to come close to that speed on my platform.

While searching for the cause of the slow speeds I ended up at lwIP's configuration. Prior to that I altered my sending mechanism. I use the raw lwIP API, and at present I write the data as follows:

tcp_write(<pcb>, (const void*) data, <bytes>, TCP_WRITE_FLAG_COPY);
//<bytes> is at most tcp_sndbuf(<pcb>)
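
For context, a minimal sketch of how the surrounding send logic might look; send_chunk, send_buffer and remaining are names I introduce for illustration, not from the original code:

#include "lwip/tcp.h"

// Hedged sketch: clamp the write to what lwIP can accept, then flush.
static err_t send_chunk(struct tcp_pcb *pcb, const u8_t *send_buffer, u16_t remaining)
{
    u16_t avail = tcp_sndbuf(pcb);                   // space left in lwIP's send buffer
    u16_t bytes = (remaining < avail) ? remaining : avail;
    err_t err = tcp_write(pcb, send_buffer, bytes, TCP_WRITE_FLAG_COPY);
    if (err == ERR_OK) {
        tcp_output(pcb);                             // push the data out now rather than waiting
    }
    return err;
}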

I know the copy flag creates a performance hit, but it is there because I don't want to overwrite data before it's actually sent. (The flag isn't the main problem, just something to polish once everything works properly.) In a prior solution I omitted the flag and simply waited, using the sent callback, for all bytes to be ACKed after forcing the data out with tcp_output() right after writing. (This might be worse performance-wise, and I don't think it's related.)
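
A minimal sketch of that earlier zero-copy variant, assuming a hypothetical bytes_outstanding counter (the callback signature is the raw API's tcp_sent_fn):

#include "lwip/tcp.h"

// Hedged sketch: only reuse the application buffer once everything
// written so far has been ACKed. bytes_outstanding and send_no_copy
// are hypothetical names, not from the original post.
static volatile u16_t bytes_outstanding;

static err_t on_sent(void *arg, struct tcp_pcb *tpcb, u16_t len)
{
    bytes_outstanding -= len;                        // 'len' bytes were just ACKed by the peer
    return ERR_OK;
}

static void send_no_copy(struct tcp_pcb *pcb, const u8_t *data, u16_t bytes)
{
    tcp_sent(pcb, on_sent);                          // register the ACK callback
    if (tcp_write(pcb, data, bytes, 0) == ERR_OK) {  // no TCP_WRITE_FLAG_COPY
        bytes_outstanding += bytes;
        tcp_output(pcb);                             // force transmission, as described above
    }
    // the caller must leave 'data' untouched until bytes_outstanding reaches 0
}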

I played around a little with lwIP's settings and that seemed to make some difference. I think the window size especially made a difference, although I'm not quite sure. At the moment I have increased the window size significantly, and even though I now get a burst of packets with about 2 ms between them (instead of 2 s!), this is followed by a long period of "nothing" and then a burst again. I want it to send continuously at the speed it should be capable of, which is 100 Mbit at most, but at least 10 Mbit is not unreasonable to expect, right?

I loaded up Wireshark to see what was happening.

192.168.1.26 is my desktop computer running Windows. 192.168.1.192 is the embedded system that's using lwIP.

Initially I send a start request from the desktop to the lwIP system, letting the system know it should start sending the buffer each time it is full. In case it is relevant, this is the corresponding part of the trace:

5    2.007754    192.168.1.26    192.168.1.192    TCP    61    [TCP segment of a reassembled PDU]
6    2.008196    192.168.1.192    192.168.1.26    TCP    60    patrolview > afs3-fileserver [SYN] Seq=0 Win=65535 Len=0 MSS=1400
7    2.226238    192.168.1.192    192.168.1.26    TCP    60    afs3-fileserver > 50015 [ACK] Seq=1 Ack=8 Win=65528 Len=0
13    4.976858    192.168.1.192    192.168.1.26    TCP    60    patrolview > afs3-fileserver [ACK] Seq=1 Ack=1 Win=65535 Len=0
22    6.976572    192.168.1.192    192.168.1.26    TCP    60    [TCP segment of a reassembled PDU]
23    7.177903    192.168.1.26    192.168.1.192    TCP    54    50015 > afs3-fileserver [ACK] Seq=8 Ack=2 Win=64399 Len=0

I believe this is alright, although I'm not certain. Anyway, after this the actual sending happens. The relevant trace looks as follows. The start time is 207.992115, which should be considered the starting point; the difference between that and the 7.177903 above is expected:

2578    207.992115    192.168.1.192    192.168.1.26    Gryphon    1422    - Invalid -
2581    208.194336    192.168.1.26    192.168.1.192    TCP    54    afs3-fileserver > patrolview [ACK] Seq=1 Ack=1369 Win=64400 Len=0
2582    208.195880    192.168.1.192    192.168.1.26    TCP    1422    [TCP segment of a reassembled PDU]
2583    208.197035    192.168.1.192    192.168.1.26    TCP    1422    [TCP segment of a reassembled PDU]
2584    208.197134    192.168.1.26    192.168.1.192    TCP    54    afs3-fileserver > patrolview [ACK] Seq=1 Ack=4105 Win=64400 Len=0
2585    208.198712    192.168.1.192    192.168.1.26    TCP    1422    [TCP segment of a reassembled PDU]
2586    208.199867    192.168.1.192    192.168.1.26    TCP    1422    [TCP segment of a reassembled PDU]
2587    208.199965    192.168.1.26    192.168.1.192    TCP    54    afs3-fileserver > patrolview [ACK] Seq=1 Ack=6841 Win=64400 Len=0
2588    208.200927    192.168.1.192    192.168.1.26    TCP    1314    [TCP segment of a reassembled PDU]
2590    208.397469    192.168.1.26    192.168.1.192    TCP    54    afs3-fileserver > patrolview [ACK] Seq=1 Ack=8101 Win=63140 Len=0

It seems that I currently send data quicker than the desktop is ACKing it. The traffic after the trace above shows up as black bars and looks like this:

2591    208.399051    192.168.1.192    192.168.1.26    TCP    1422    [TCP Previous segment lost] [TCP segment of a reassembled PDU]
2592    208.399136    192.168.1.26    192.168.1.192    TCP    54    [TCP Dup ACK 2590#1] afs3-fileserver > patrolview [ACK] Seq=1 Ack=8101 Win=63140 Len=0
2593    208.400208    192.168.1.192    192.168.1.26    Gryphon    1422   
2594    208.400285    192.168.1.26    192.168.1.192    TCP    54    [TCP Dup ACK 2590#2] afs3-fileserver > patrolview [ACK] Seq=1 Ack=8101 Win=63140 Len=0
2595    208.401361    192.168.1.192    192.168.1.26    Gryphon    1422    - Invalid -
2596    208.401445    192.168.1.26    192.168.1.192    TCP    54    [TCP Dup ACK 2590#3] afs3-fileserver > patrolview [ACK] Seq=1 Ack=8101 Win=63140 Len=0
2597    208.402425    192.168.1.192    192.168.1.26    Gryphon    1314   
2598    208.402516    192.168.1.26    192.168.1.192    TCP    54    [TCP Dup ACK 2590#4] afs3-fileserver > patrolview [ACK] Seq=1 Ack=8101 Win=63140 Len=0
2599    208.403588    192.168.1.192    192.168.1.26    Gryphon    1422    [TCP Fast Retransmission] Command response
2600    208.403685    192.168.1.26    192.168.1.192    TCP    54    afs3-fileserver > patrolview [ACK] Seq=1 Ack=14833 Win=64400 Len=0
2605    209.992237    192.168.1.192    192.168.1.26    Gryphon    1422    - Invalid -
2607    210.200219    192.168.1.26    192.168.1.192    TCP    54    afs3-fileserver > patrolview [ACK] Seq=1 Ack=16201 Win=63032 Len=0
2608    210.201819    192.168.1.192    192.168.1.26    Gryphon    1422    [TCP Previous segment lost] - Invalid -
2609    210.201903    192.168.1.26    192.168.1.192    TCP    54    [TCP Dup ACK 2607#1] afs3-fileserver > patrolview [ACK] Seq=1 Ack=16201 Win=63032 Len=0
2611    210.203070    192.168.1.26    192.168.1.192    TCP    54    [TCP Dup ACK 2607#2] afs3-fileserver > patrolview [ACK] Seq=1 Ack=16201 Win=63032 Len=0
2955    345.001223    192.168.1.192    192.168.1.26    Gryphon    1422    [TCP Retransmission]

Now after this point there is a huge delay that I cannot explain. The next packets arrive at 345 seconds, a 135-second difference. (In most cases it was a bit less, but still way too high.) It starts as follows:

2955    345.001223    192.168.1.192    192.168.1.26    Gryphon    1422    [TCP Retransmission]
2958    345.001707    192.168.1.26    192.168.1.192    TCP    54    afs3-fileserver > patrolview [ACK] Seq=1 Ack=20305 Win=64400 Len=0
2959    345.003336    192.168.1.192    192.168.1.26    TCP    1422    [TCP segment of a reassembled PDU]
2960    345.004395    192.168.1.192    192.168.1.26    TCP    1314    [TCP segment of a reassembled PDU]
2961    345.004494    192.168.1.26    192.168.1.192    TCP    54    afs3-fileserver > patrolview [ACK] Seq=1 Ack=22933 Win=64400 Len=0

etc.

Later on, similar problems occur, although the delays mentioned are shorter. My question is: how can I fix the slow sending from my platform, and how should I configure my lwIP settings to get decent/good results? I want to send the data at high speed (my network is capable of 100 Mbps; the closer to that, the better). I think I have currently messed up my settings entirely, but I am not sure how to fine-tune them for my needs. Here are some (hopefully) relevant settings from my lwipopts.h file:

#define MEM_SIZE                        65000
#define PBUF_POOL_SIZE                  1024
#define IP_FRAG_USES_STATIC_BUF         0
#define TCP_WND                         65535
#define TCP_MSS                         1400
#define TCP_SND_BUF                     65535
#define PBUF_POOL_BUFSIZE               LWIP_MEM_ALIGN_SIZE(1528)
#define LWIP_TCP_KEEPALIVE              1
#define LWIP_SO_RCVTIMEO                1
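
One relation worth double-checking here (my addition, not part of the original question): lwIP's opt.h documents consistency constraints between these options, and a send buffer that isn't backed by enough queue entries or segments makes tcp_write fail with ERR_MEM in bursts once the queue fills. A hedged sketch of the usual constraints; the exact formulas vary per lwIP version:

// Hedged sketch of the relations documented in lwIP's opt.h.
#define TCP_WND           65535                           // <= 0xFFFF unless LWIP_WND_SCALE is enabled
#define TCP_SND_BUF       65535                           // should be at least 2 * TCP_MSS
#define TCP_SND_QUEUELEN  ((4 * TCP_SND_BUF) / TCP_MSS)   // must be >= 2 * TCP_SND_BUF / TCP_MSS
#define MEMP_NUM_TCP_SEG  TCP_SND_QUEUELEN                // at least as large as TCP_SND_QUEUELEN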

I ran into similar problems using a BeagleBone and the following settings:

#define MEM_SIZE                        (1024 * 1024) /* 1MiB */
#define MEMP_NUM_PBUF                   1024
#define MEMP_NUM_TCP_PCB                32
#define PBUF_POOL_SIZE                  1024
#define TCP_MSS                         1460
#define TCP_WND                         (4*TCP_MSS)
#define TCP_SND_BUF                     65535
#define TCP_OVERSIZE                    TCP_MSS
#define TCP_SND_QUEUELEN                512
#define MEMP_NUM_TCP_SEG                512

Using a tcp_sent polled function, I basically checked how many bytes I had in the buffer and immediately filled it again with more samples from within the callback itself. This was to check the throughput.
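
A minimal sketch of that measurement loop, assuming a hypothetical fill_with_samples() data source standing in for the real sample buffer:

#include "lwip/tcp.h"

// Hedged sketch of the throughput test described above: every time the
// stack reports ACKed bytes, immediately top the send buffer up again.
extern u16_t fill_with_samples(u8_t *dst, u16_t max_len);   // hypothetical data source

static err_t refill_on_sent(void *arg, struct tcp_pcb *tpcb, u16_t len)
{
    static u8_t chunk[TCP_MSS];
    while (tcp_sndbuf(tpcb) >= sizeof(chunk)) {      // keep lwIP's send buffer topped up
        u16_t n = fill_with_samples(chunk, sizeof(chunk));
        if (n == 0 || tcp_write(tpcb, chunk, n, TCP_WRITE_FLAG_COPY) != ERR_OK) {
            break;
        }
    }
    tcp_output(tpcb);
    return ERR_OK;
}

// registered once after the connection is up: tcp_sent(pcb, refill_on_sent);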

I was quite surprised that Wireshark showed bursts of packets over roughly a few milliseconds, followed by 700 ms of nothing.

Digging deep into the stack, I found that this happens at exactly the moment 65535 bytes have been sent (or roughly around that).

All of that was solved by DISABLING THE MEMORY SANITY CHECK.

Look in your lwipopts.h and check whether by chance you define somewhere:

#define MEMP_SANITY_CHECK 1

If so, remove that line or set it to zero. My raw packet sending performance did not increase as such (I'm still at around 11 Mbit when firing the data at maximum speed), but the total throughput increased considerably, as the time between two sent packets is now constant at roughly 100 us.
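
In lwipopts.h, setting it to zero is simply the inverse of the define above:

#define MEMP_SANITY_CHECK 0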

It needs to be said that this still does not resolve the issue of getting only 11 Mbit on a 100 Mbit line completely dedicated to this equipment.
