How many bytes should I read from or write to a socket?

Question

I have some doubts about how many bytes I should write or read through a socket in C on Unix. I'm used to sending 1024 bytes, but that is often far too much when I send short strings.

I read a string from a file, and I don't know how many bytes long it is. It varies every time: it can be 10, 20 or 1000 bytes. I only know for sure that it's < 1024. So when I write the code, I don't know how many bytes to read on the client side (on the server I can use strlen()). Is the only solution to always read the maximum number of bytes (1024 in this case), regardless of the length of the string I read from the file?

For instance, with this code:

read(socket,stringBuff,SIZE);

wouldn't it be better for SIZE to be 10 instead of 1024 if I want to read a 10-byte string?

Answer 1

In the code in your question, if there are only 10 bytes to be read, it makes no difference whether SIZE is 10, 1,024, or 1,000,024: read() will still return just 10 bytes. The only difference is how much memory you set aside for the buffer, and if you might receive a string of up to 1,024 bytes, you have to set aside that much memory anyway.
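
The point about SIZE can be checked with a small sketch. It assumes a local socketpair() standing in for the real client/server connection, and demo_read is a hypothetical helper name, not something from the question. It sends a 10-byte string and then asks read() for up to `request` bytes.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Sends a 10-byte string over a local socket pair (an assumption standing in
 * for the real connection) and reads it back, asking read() for at most
 * `request` bytes. Returns whatever read() returned, or -1 on failure. */
ssize_t demo_read(size_t request)
{
    int sv[2];
    char buf[1024];
    const char msg[] = "0123456789";        /* 10 bytes of payload */
    const size_t msg_len = sizeof msg - 1;  /* do not send the terminator */
    ssize_t n = -1;

    assert(request <= sizeof buf);          /* never ask for more than fits */

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    /* Only 10 bytes are available, however many we ask for below. */
    if (write(sv[0], msg, msg_len) == (ssize_t)msg_len)
        n = read(sv[1], buf, request);

    close(sv[0]);
    close(sv[1]);
    return n;
}
```

In this setup demo_read(10) and demo_read(1024) are both expected to return 10, since only 10 bytes are waiting in the socket. Over a real network the count could also be smaller than what was sent.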

However, regardless of how many bytes you ask for, you always have to be prepared for read() to return a different number. This is particularly true on a network, where transmission can be delayed: even if your server sends a 1,024-byte string, fewer bytes may have arrived by the time your client calls read(), in which case you'll read fewer than 1,024.

So you always have to be prepared to collect your input over more than one read() call. This means you need a way to tell when you're done reading input: the mere fact that read() has returned does not tell you that the message is complete. And if your server might send more than one message before you've read the first one, a single read() may return data from several messages, so you obviously can't rely on it then either.

You have three main options:

1. Always send messages of the same size, padding shorter strings with zeros if necessary, and read until you've received exactly that number of bytes. This is usually suboptimal for a TCP stream.
2. Have some kind of sentinel that tells you when a message is over. This might be a newline character, a CRLF, a blank line, a single dot on a line followed by a blank line, or whatever works for your protocol. Keep reading until you have received the sentinel. To avoid making an inefficient system call for each character, you need some kind of buffering to make this work well. If you can be sure that your server sends lines terminated with a single '\n' character, then using fdopen() and the standard C I/O library may be an option.
3. Have your server tell you how big the message is (either in an initial fixed-length field, or using the same kind of sentinel mechanism as in point 2), and then keep reading until you've got that number of bytes.

Answer 2

On a blocking socket, the read() system call blocks until it can read one or more bytes, until the peer closes the connection (in which case it returns 0), or until an error occurs.

It DOESN'T guarantee that it will read the number of bytes you request! With TCP sockets, it's very common for read() to return fewer bytes than you requested, because it can't return bytes that are still propagating through the network.

So you'll have to check the return value of read() and, if you didn't get everything you wanted, call it again to get more data, and again, and again, until you have everything.
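
A minimal sketch of such a loop follows. To know when "everything" has arrived, it assumes each message is preceded by a 4-byte big-endian length, which is one possible framing and not something this answer prescribes. The names read_full, recv_message and demo_fragmented_message are hypothetical, and a local socketpair() stands in for the real connection.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Keeps calling read() until exactly len bytes are stored in buf.
 * Returns 0 on success, -1 on error or if the peer closes early. */
int read_full(int fd, void *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    char *p = buf;

    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n > 0) {                /* got some bytes, maybe not all */
            p += n;
            len -= (size_t)n;
        } else if (n == 0) {
            return -1;              /* connection closed too early */
        } else if (errno != EINTR) {
            return -1;              /* real error; EINTR just retries */
        }
    }
    return 0;
}

/* Receives one message framed as [4-byte big-endian length][payload]
 * (assumed framing). Stores the payload NUL-terminated in out.
 * Returns the payload length, or -1 on failure. */
long recv_message(int fd, char *out, size_t cap)
{
    uint32_t netlen;

    if (read_full(fd, &netlen, sizeof netlen) == -1)
        return -1;

    uint32_t len = ntohl(netlen);
    if (len >= cap)
        return -1;                  /* no room for payload plus terminator */
    if (read_full(fd, out, len) == -1)
        return -1;

    out[len] = '\0';
    return (long)len;
}

/* Demo: the sender emits the header and the payload in three separate
 * write() calls; the receiver still gets one whole message. */
long demo_fragmented_message(char *out, size_t cap)
{
    int sv[2];
    const char *text = "short string";
    size_t text_len = strlen(text);
    uint32_t netlen = htonl((uint32_t)text_len);

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    int ok = write(sv[0], &netlen, sizeof netlen) == (ssize_t)sizeof netlen
          && write(sv[0], text, 5) == 5
          && write(sv[0], text + 5, text_len - 5) == (ssize_t)(text_len - 5);

    long n = ok ? recv_message(sv[1], out, cap) : -1;
    close(sv[0]);
    close(sv[1]);
    return n;
}
```

The demo does not force short reads, since the kernel may hand over several queued pieces at once. The loop gives the same result either way, which is the reason for writing it.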

2 answers

Answer 1: 4, accepted, 2014-11-03 03:04:46

Answer 2: 2, 2014-11-03 03:04:35
