Handling messages over TCP

I'm trying to send and receive messages over TCP, with the size of each message prepended to it.

Say the first three bytes are the length and the message follows.

As a small example:

005Hello003Hey002Hi
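
For illustration, the sending side of this framing could look like the sketch below. It assumes the length prefix is three ASCII decimal digits counting payload bytes, as in the example above; the function name `frame` is hypothetical. Note that a three-digit prefix caps a payload at 999 bytes, so larger messages would need a wider length field.

```python
def frame(message: str) -> bytes:
    """Prefix the UTF-8 payload with its byte length as three ASCII digits."""
    payload = message.encode("utf-8")
    if len(payload) > 999:
        # Assumption: three decimal digits cannot describe a longer payload.
        raise ValueError("payload does not fit in a three-digit length prefix")
    return f"{len(payload):03d}".encode("ascii") + payload


# Concatenating frames yields the byte stream from the example.
stream = b"".join(frame(m) for m in ["Hello", "Hey", "Hi"])
print(stream)  # b'005Hello003Hey002Hi'
```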

I'll be using this method for large messages, but the receive buffer will have a constant size, say 200 bytes. So there is a chance that a complete message is not received (e.g. instead of 005Hello I get 005He), or that even the length is incomplete (e.g. I get only 2 bytes of the length).

So, to get over this problem, I'll need to wait for the next piece of data and append it to the incomplete message, and so on.

My question is: am I the only one who has to append partial messages and partial lengths to each other to make them complete, or is this really how individual messages usually have to be handled over TCP? Or is there a better way?

What you're seeing is 100% normal TCP behavior. It is completely expected that you'll loop receiving bytes until you get a "message" (whatever that means in your context). It's part of the work of going from a low-level TCP byte stream to a higher-level concept like "message".
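
A minimal sketch of such a receive loop, assuming a blocking socket and the three-digit ASCII length prefix from the question. The names `recv_exact` and `recv_message` are hypothetical helpers, not library functions.

```python
import socket


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Keep calling recv() until exactly n bytes have arrived."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            # recv() returning b"" means the peer closed the connection.
            raise ConnectionError("connection closed in the middle of a message")
        buf += chunk
    return bytes(buf)


def recv_message(sock: socket.socket) -> bytes:
    """Read one message: a three-digit ASCII length, then that many bytes."""
    length = int(recv_exact(sock, 3).decode("ascii"))
    return recv_exact(sock, length)
```

Because both the prefix and the body go through the same loop, a split length and a split payload are handled identically.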

And "usr" is right above. There are higher-level abstractions that you may have available. If they're appropriate, use them to avoid reinventing the wheel.

So, there is a chance that a complete message may not be received eg instead of 005Hello I get 005He nor a complete length may be received eg I get 2 bytes of length in message.

Yes. A successful read on an open TCP connection gives you at least one byte, and that is the only guarantee.

Or is this really usually how we need to handle the individual messages on TCP? Or, if there is a better way?

Try using higher-level primitives. For example, BinaryReader allows you to read exactly N bytes (it will internally loop). StreamReader lets you forget this peculiarity of TCP as well.
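
The answer names .NET classes. As a rough analogue in Python (a swapped-in illustration, not the BinaryReader API), a blocking socket can be wrapped with `socket.makefile("rb")`, whose buffered `read(n)` keeps reading until it has n bytes or reaches end of stream. The sketch assumes the three-digit length prefix from the question; `read_message` is a hypothetical helper.

```python
import socket


def read_message(reader) -> bytes:
    """Read one length-prefixed message from a buffered binary reader."""
    header = reader.read(3)          # returns fewer than 3 bytes only at EOF
    if len(header) < 3:
        raise EOFError("stream ended inside the length prefix")
    length = int(header.decode("ascii"))
    body = reader.read(length)       # the buffered reader loops internally
    if len(body) < length:
        raise EOFError("stream ended inside the message body")
    return body


# Demo over a local socket pair: the sender splits the data mid-message.
a, b = socket.socketpair()
reader = b.makefile("rb")
a.sendall(b"005He")
a.sendall(b"llo002Hi")
print(read_message(reader), read_message(reader))  # b'Hello' b'Hi'
```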

Even better is using still higher-level abstractions such as HTTP (request/response pattern - very common), protobuf as a serialization format, or web services, which automate pretty much all transport layer concerns.

Don't work with raw TCP if you can avoid it.

So, to get over this problem, I'll need to wait for next message and append it to the incomplete message etc.

Yep, this is how things are done in socket-level code. For each socket you want to allocate a buffer at least as large as the kernel socket receive buffer, so that you can read the entire kernel buffer in one read/recv/recvmsg call. Reading from one socket in a loop may starve other sockets in your application (this is one reason level-triggered is the default mode of epoll: edge-triggered mode forces application writers to read in a loop).

The first incomplete message is always kept at the beginning of the buffer, and reading from the socket continues at the next free byte in the buffer, so new data is automatically appended to the incomplete message.

Once reading is done, a higher-level callback is normally called with pointers to all the data read into the buffer. That callback should consume all complete messages in the buffer and return how many bytes it has consumed (this may be 0 if there is only an incomplete message). The buffer management code should memmove the remaining unconsumed bytes (if any) to the beginning of the buffer. Alternatively, a ring buffer can be used to avoid moving those unconsumed bytes, but in this case the higher-level code must be able to cope with ring-buffer iterators, which it may not be ready to do. Hence keeping the buffer linear may be the most convenient option.
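
The linear-buffer scheme described above can be sketched as follows. This is an illustration under assumptions, not the answerer's code: it uses the three-digit ASCII length prefix from the question, the names `consume_messages` and `FrameBuffer` are hypothetical, and deleting the front of a `bytearray` stands in for the memmove step.

```python
def consume_messages(buf: bytearray, on_message) -> int:
    """Deliver every complete message in buf; return the bytes consumed."""
    pos = 0
    while len(buf) - pos >= 3:                      # a full length prefix is present
        length = int(bytes(buf[pos:pos + 3]).decode("ascii"))
        end = pos + 3 + length
        if end > len(buf):                          # body still incomplete
            break
        on_message(bytes(buf[pos + 3:end]))
        pos = end
    return pos                                      # 0 if nothing was complete


class FrameBuffer:
    """Linear buffer that keeps any incomplete message at its start."""

    def __init__(self, on_message):
        self._buf = bytearray()
        self._on_message = on_message

    def feed(self, chunk: bytes) -> None:
        """Append freshly received bytes, then hand complete messages upward."""
        self._buf += chunk
        consumed = consume_messages(self._buf, self._on_message)
        # Drop consumed bytes; the leftover moves to the front (the memmove step).
        del self._buf[:consumed]


received = []
fb = FrameBuffer(received.append)
for chunk in (b"005He", b"llo00", b"3Hey002Hi"):    # arbitrary split points
    fb.feed(chunk)
print(received)  # [b'Hello', b'Hey', b'Hi']
```

In a real program each `feed` call would receive the result of one `recv` on the socket, so no inner read loop is needed and other sockets are not starved.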

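Notice: the technical posts on this site are licensed under CC BY-SA 4.0. If you repost them, please cite this site's URL or the original address. For any questions, contact yoyou2525@163.com.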

 