In Linux, how do I know whether an ACK has been received for a certain TCP packet?

Long story short: in Linux, how do I make sure that an ACK has been received for a certain TCP packet?

Full story:

I'm debugging an Asterisk/OpenH323 <-> Panasonic IP-GW16 issue.

An H323 connection involves two sessions: H225.0 and H245. These are just two TCP sessions over which some data is transmitted.

Let's call them Session 1 (for H225.0) and Session 2 (for H245).

Session 1 uses the well-known TCP port 1720, while the port for Session 2 is chosen at runtime.

Control flow goes as follows:

  1. Panasonic calls Asterisk: it opens Session 1 (TCP/1720) to Asterisk and sends a SETUP message over Session 1, which contains the port (call it port 2) that Panasonic will listen on for Session 2.
  2. Asterisk sends Panasonic a CALL PROCEEDING message over Session 1.
  3. Panasonic starts listening on port 2.
  4. Panasonic sends a TCP ACK over Session 1.
  5. Asterisk opens TCP Session 2 on port 2.

The order of steps 2 and 3 is important: Panasonic will not listen on port 2 until it has received the CALL PROCEEDING message from step 2.

But in the OpenH323 code, step 2 and step 5 are only several lines apart.

That's why the connection sometimes works in debug mode and almost never works in release.

This is clearly visible in a packet dump. I ran a series of experiments, and in 52 cases out of 52, if step 5 happened before step 4, the connection failed; otherwise, it succeeded.

Panasonic sends no other messages except the ACK in step 4, so it seems that the only way Asterisk can know that port 2 is being listened on is by receiving that ACK.

Of course I could implement a timed wait, but I want a cleaner solution.

So the question again: after sending a message over the TCP connection in step 2, how do I know whether an ACK has been received for the packet containing the message?

In this specific case, I'd say you would find that your tcp_info structure holds a nonzero tcp_info.tcpi_unacked while the ACK is still outstanding. You can get it via getsockopt(TCP_INFO).

Note: apparently this is an unstable interface.
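
A minimal sketch of that check in C, assuming Linux with glibc's <netinet/tcp.h>. The helper name tcp_unacked_segments is hypothetical and not part of OpenH323 or Asterisk; it only wraps the getsockopt(TCP_INFO) call mentioned above and reports tcpi_unacked for the socket.

```c
#define _GNU_SOURCE          /* makes glibc expose struct tcp_info */
#include <assert.h>
#include <netinet/in.h>      /* IPPROTO_TCP */
#include <netinet/tcp.h>     /* TCP_INFO, struct tcp_info */
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: stores tcpi_unacked for a TCP socket in
 * *unacked_out. Returns 0 on success, -1 on error (errno is set by
 * getsockopt). */
int tcp_unacked_segments(int fd, unsigned *unacked_out)
{
    struct tcp_info info;
    socklen_t len = sizeof(info);

    assert(unacked_out != NULL);
    memset(&info, 0, sizeof(info));

    if (getsockopt(fd, IPPROTO_TCP, TCP_INFO, &info, &len) != 0)
        return -1;

    *unacked_out = info.tcpi_unacked;
    return 0;
}
```

A caller would invoke this on the Session 1 socket after the write in step 2 and treat a value of zero as "nothing outstanding". Since the kernel gives no notification when the value changes, any use of it still amounts to polling.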

A TCP-level ACK is sent by the operating system and can be sent before the data is read by the process. So if you receive the ACK, it doesn't mean that the remote application acted on the message or was even notified of its existence.

Just imagine: if TCP ACKs were a message confirmation, then the application would have to read() the message, process it (which can take a while) and then call a "read_ok" syscall. As far as I know this is impossible with the standard socket API.

You might be able to check whether there's any unacked data with the SIOCOUTQ ioctl (man 7 tcp). But it's not a reliable solution to your problem.

Are you sure this is how H323 is supposed to work? What if port 2 is invalid or taken by another connection? A confirmation or error should be sent back.

The timing sequence seems odd, although it might be explained if the Panasonic is running a proprietary O/S.

To clarify (AIUI): if the Panasonic were running a "normal" O/S, the ACK it sends in step 4 would happen immediately after the Panasonic's software has read() the data from the control TCP socket.

Similarly, the OpenH323 call to write() (in step 2) shouldn't return (assuming that it's not a non-blocking socket!) until the ACK from the Panasonic has been received by the Asterisk server. That is how you're supposed to know that the ACK has been received.

Essentially it seems that the Panasonic isn't doing the equivalent of listen() on the second socket until after it has read() the CALL PROCEEDING message. It looks like a race condition: sometimes OpenH323 will try to connect() before the other end is ready.

When this happens, do you get ECONNREFUSED at the OpenH323 end?
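
One way to answer that is to log the errno from the connect() in step 5. The sketch below is illustrative only: try_connect is a hypothetical helper, not an OpenH323 function, and it assumes an IPv4 address and a blocking socket.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: makes one blocking connect() attempt to ip:port.
 * On success stores the connected socket in *fd_out and returns 0.
 * On failure stores -1 in *fd_out and returns the errno value
 * (for example ECONNREFUSED when nothing is listening yet). */
int try_connect(const char *ip, uint16_t port, int *fd_out)
{
    struct sockaddr_in addr;
    int fd, err;

    assert(ip != NULL && fd_out != NULL);
    *fd_out = -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
        return EINVAL;               /* not a valid IPv4 address */

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return errno;

    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) != 0) {
        err = errno;                 /* save before close() can change it */
        close(fd);
        return err;
    }

    *fd_out = fd;
    return 0;
}
```

If the return value is ECONNREFUSED in the failing runs, that would support the race-condition reading: the connect() reached the Panasonic before it had started listening on port 2.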
