
Why Should I Set a Conservative Max Byte Size on a Socket's 'recv' Method?

I am building a client using Python's socket.socket class that receives data which varies in size (usually between 500 and 5,000 bytes but it is theoretically possible for the client socket to receive 500,000 bytes). I am also writing the server that will communicate with this client socket.

What is the risk of passing a maximum byte size that I can be confident I will never exceed, such as:

socket.recv(1000000)

even though I know this is far larger than 99% of the socket's actual usage?

All you're doing is wasting memory on an epic scale.

  1. If you're reading as fast as the data arrives, each recv() will typically return no more than one packet's worth of data, which is bounded by the path MTU. That is usually under 1500 bytes, and certainly measured in kilobytes, not megabytes.

  2. If you're not reading as fast as the data arrives, the data waits in the socket receive buffer inside the kernel, which is typically sized somewhere in the range of 8 to 64 KB depending on your platform. By the operation of TCP, recv() can never deliver more data than is in that buffer.
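To see what that receive buffer looks like on your own machine, you can ask the kernel directly. This is only a small sketch: the helper name receive_buffer_size is made up for illustration, and the number it reports depends on your platform and settings (some systems also resize the buffer automatically while a connection is in use).

```python
import socket


def receive_buffer_size() -> int:
    """Return SO_RCVBUF, the kernel receive buffer size in bytes, for a new TCP socket.

    The helper name is hypothetical; getsockopt() and SO_RCVBUF are standard.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)


if __name__ == "__main__":
    # The value is platform dependent, so no expected output is shown here.
    print("SO_RCVBUF:", receive_buffer_size(), "bytes")
```

Whatever number this prints, it gives a rough idea of the most a single recv() could plausibly hand back on that socket.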

Sockets don't work the way you think they do. socket.recv(N) does not mean you will get back N bytes. It means you will get back at most N bytes, regardless of how many bytes the sender tried to send you. TCP is stream oriented. This means you will get the bytes that the sender sent you, in the order they sent them, but you will not get the same "message" boundaries that they used when sending the data.
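Here is a small sketch of that behaviour. It uses socket.socketpair() as a stand-in for a real client/server connection, and the function name demo_partial_reads is made up for illustration. Two separate sends go in, and the reader asks for at most 4 bytes at a time:

```python
import socket


def demo_partial_reads() -> list:
    """Send two 'messages', then read the stream back with recv(4)."""
    sender, receiver = socket.socketpair()  # stand-in for a real connection
    with sender, receiver:
        sender.sendall(b"hello")
        sender.sendall(b"world")
        sender.shutdown(socket.SHUT_WR)  # tell the reader no more data is coming

        chunks = []
        while True:
            chunk = receiver.recv(4)  # at most 4 bytes, possibly fewer
            if not chunk:             # b"" means the peer finished sending
                break
            chunks.append(chunk)
    return chunks


if __name__ == "__main__":
    chunks = demo_partial_reads()
    print(chunks)
    print(b"".join(chunks))  # the bytes arrive in order, without the send boundaries
```

How the bytes are split across the chunks is up to the operating system. The only guarantees are that no chunk is longer than what you asked for and that joining the chunks gives back the bytes in the order they were sent.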

You have to write your code to be able to call recv multiple times because, for all you know, socket.recv(1000000) will return one byte to you. And once you're calling it multiple times, you no longer have to think about the size of the argument as compared to the size of the messages you're receiving. As other posters have said, you want to pass a value that's comparable to the size of the largest buffer at some other level of the stack. One of those limits (the path MTU) is probably around 1500 bytes, though it can be larger or smaller. The local receive buffer in your kernel's TCP/IP stack is larger, probably around 64k or 128k. Those are probably close to reasonable values to use.
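Since you control both the server and the client, one common way to structure this is to prefix every message with its length and loop on recv until that many bytes have arrived. The sketch below assumes a 4-byte big-endian length header. That framing is my assumption, not something from your protocol, and the names recv_exact, send_message, recv_message and CHUNK_SIZE are made up for illustration:

```python
import socket
import struct

HEADER = struct.Struct("!I")  # assumed framing: 4-byte big-endian payload length
CHUNK_SIZE = 65536            # roughly the size of a typical kernel receive buffer


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Call recv() repeatedly until exactly n bytes have been read."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(min(n - len(buf), CHUNK_SIZE))
        if not chunk:  # b"" means the peer closed the connection
            raise ConnectionError("connection closed before the message was complete")
        buf += chunk
    return bytes(buf)


def send_message(sock: socket.socket, payload: bytes) -> None:
    """Send one message as a length header followed by the payload."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_message(sock: socket.socket) -> bytes:
    """Read one length-prefixed message, however many recv() calls it takes."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)
```

With this in place, the argument to recv stops being tied to your message sizes: a 500-byte message and a 500,000-byte message both go through the same loop, and the per-call request never exceeds CHUNK_SIZE.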

That said, I recommend not actually writing network code at this level. It's been done, more or less to death. You'd probably be a whole lot better off focusing on the novel part of your application and re-using some existing library that deals with these details for you. I recommend Twisted.


 