
Is it good practice to write a GB of bytes into a TCP socket in one go?

Original 2021-05-04 11:07:29 4 2 tcp/ tcpsocket

I am maintaining some mature production code that sends data over TCP sockets. It always breaks a large chunk of data into many packets of 1000 bytes each. I wonder why it was done this way. Why can't I just write a GB-sized byte array into the socket in one go? What are the downsides of doing that?

2 answers

Depending on the underlying implementation, attempting to send 1 GB in one call may result in the whole 1 GB being copied into some internal buffer and held there for some time. That can be a problem if not enough memory is available (and even if there is enough memory, it may not be the most efficient way to use it).

That said, "manually" splitting it into segments of 1000 bytes each does sound like overkill to me.

There are many reasons not to throw a huge chunk in at once.

First of all: even on very fast networks, sending a GB of data takes a non-trivial amount of time. On a 10 Gbps network it would take a little under 1 second (about 0.8 s before any protocol overhead), which is a long time in computer terms. And that assumes this one operation has all the bandwidth of the network available to it and doesn't have to share with anything else.
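As a rough check of that estimate, here is a minimal Python sketch of the arithmetic. The function name transfer_seconds is made up for illustration, and the calculation assumes an ideal link with no protocol overhead, no congestion, and a decimal gigabyte (10^9 bytes).

```python
def transfer_seconds(size_bytes: int, link_bits_per_second: float) -> float:
    """Ideal time to push size_bytes over a link, ignoring all overhead."""
    return size_bytes * 8 / link_bits_per_second


GB = 10**9  # decimal gigabyte (assumption); use 2**30 for a GiB

print(transfer_seconds(GB, 10e9))   # 10 Gbps  -> 0.8
print(transfer_seconds(GB, 1e9))    # 1 Gbps   -> 8.0
print(transfer_seconds(GB, 100e6))  # 100 Mbps -> 80.0
```

On slower or shared links the same payload occupies memory for proportionally longer.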

This means that if you successfully make a 1 GB write call to a TCP socket, it will be some time until the later parts of the data are actually sent.

And in the meantime you'll have to hold all that data in memory. That means you'll need to allocate and hold on to 1 GB of data for the whole transfer.
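The gap between handing data to the socket and the data actually leaving can be observed with a small experiment. The following is only a sketch: it uses a local socket.socketpair() as a stand-in for a real TCP connection, switches it to non-blocking mode, and offers 8 MiB in a single send call. The helper name try_send_once is hypothetical. How much the kernel accepts depends on the OS and its buffer settings, but it is typically far less than what was offered, and the remainder has to stay in the application's memory until it can be sent.

```python
import socket


def try_send_once(sock: socket.socket, payload: bytes) -> int:
    """Attempt one non-blocking send; return how many bytes the kernel accepted."""
    sock.setblocking(False)
    try:
        return sock.send(payload)
    except BlockingIOError:
        return 0  # the kernel buffer had no room at all


a, b = socket.socketpair()        # local stand-in for a TCP connection
payload = bytes(8 * 1024 * 1024)  # 8 MiB of zeros; nobody reads from b
accepted = try_send_once(a, payload)
print(f"kernel accepted {accepted} of {len(payload)} bytes")
a.close()
b.close()
```

A blocking socket hides this by making the caller wait instead, but the memory is still tied up for the duration.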

If instead you fill a smallish buffer and read from your source (or generate the data, depending on where it comes from) before each write, then you'll need only a little memory (the size of the buffer).
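The buffered approach described above could look roughly like this in Python. This is a sketch, not the code from the question: the function name send_stream and the 64 KiB default are assumptions made for illustration, and the source is any binary stream that supports readinto (an open file or io.BytesIO, for example).

```python
import io
import socket

CHUNK_SIZE = 64 * 1024  # assumed default; tune for your workload


def send_stream(sock: socket.socket, source: io.BufferedIOBase,
                chunk_size: int = CHUNK_SIZE) -> int:
    """Copy source to sock one chunk at a time; return the total bytes sent."""
    buf = bytearray(chunk_size)  # the only buffer held for the whole transfer
    view = memoryview(buf)       # lets us slice the buffer without copying
    total = 0
    while True:
        n = source.readinto(buf)  # refill the buffer from the source
        if not n:                 # 0 (or None) means no more data
            break
        sock.sendall(view[:n])    # blocks until this chunk is handed to the kernel
        total += n
    return total
```

Memory use stays at chunk_size bytes regardless of how large the source is, which is the point of the technique. A caller would pass a connected socket and, say, a file opened with open(path, "rb").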

All of that might not sound like a big deal with today's machines, but consider that many servers serve hundreds of clients/requests at once, and if each one requires a 1 GB buffer, that can get out of hand quickly.

Is 1000 a good size for that buffer? I'm no networking expert, but I suspect that's a little low. Maybe something on the order of 64 KB would be appropriate, but others can give better details here. Finding a good buffer size can sometimes be a bit tricky.
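One data point that may help when picking a size is the kernel's own send buffer for the socket, which can be read through the standard SO_SNDBUF socket option. The sketch below only reports that value. Treating it as a guide for the application buffer is an assumption on my part rather than an established rule, the helper name send_buffer_size is hypothetical, and the number reported varies by operating system and configuration.

```python
import socket


def send_buffer_size(sock: socket.socket) -> int:
    """Return the kernel send-buffer size, in bytes, reported for this socket."""
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)


with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    print(f"SO_SNDBUF: {send_buffer_size(s)} bytes")
```

Measuring throughput with a few candidate sizes on the real workload is likely a more reliable way to settle the question.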