
Reading wrong data from TCP socket

I'm trying to send data blockwise over a TCP socket. The server code does the following:

#define CHECK(n) if((r=n) <= 0) { perror("Socket error\n"); exit(-1); }
int r;

//send the number of blocks
CHECK(write(sockfd, &(storage->length), 8)); //storage->length is uint64_t

for(p=storage->first; p!=NULL; p=p->next) {
  //send the size of this block
  CHECK(write(sockfd, &(p->blocksize), 8)); //p->blocksize is uint64_t

  //send data
  CHECK(write(sockfd, &(p->data), p->blocksize));
}

On the client side, I read the size and then the data (same CHECK macro):

CHECK(read(sockfd, &block_count, 8));
for(i=0; i<block_count; i++) {
  uint64_t block_size;
  CHECK(read(sockfd, &block_size, 8));

  uint64_t read_in=0;
  while(read_in < block_size) {
    r = read(sockfd, data+read_in, block_size-read_in); //assume data was previously allocated as char*
    read_in += r;
  }
}

This works perfectly fine as long as both client and server run on the same machine, but as soon as I try this over the network, it fails at some point. In particular, the first 300-400 blocks or so (of ~587 bytes each) work fine, but then I get an incorrect block_size reading:

received block #372 size : 586
read_in: 586 of 586
received block #373 size : 2526107515908

And then it crashes, obviously. I was under the impression that the TCP protocol ensures no data is lost and everything is received in correct order, but then how is this possible and what's my mistake here, considering that it already works locally?

There is no guarantee that when you read block_count and block_size, you will get all 8 bytes in a single read.

I was under the impression that the TCP protocol ensures no data is lost and everything is received in correct order

Yes, but that's all that TCP guarantees. It does not guarantee that the data is sent and received in a single packet, so a single read() may return fewer bytes than you asked for. You need to gather the data and piece it together in a buffer until you have the full block size you want before copying the data out.
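The gathering step described above could be sketched roughly as follows. This is only a minimal illustration, not the asker's code: read_full is a hypothetical helper name, and the choice to treat an early end-of-stream as an error is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of the original code): keep calling
 * read() until exactly len bytes have arrived in buf.
 * Returns 0 on success, -1 on a read error or if the peer closed the
 * connection before len bytes were received. */
int read_full(int fd, void *buf, size_t len)
{
    unsigned char *p = buf;
    size_t done = 0;

    assert(buf != NULL || len == 0);

    while (done < len) {
        ssize_t r = read(fd, p + done, len - done);
        if (r > 0) {
            done += (size_t)r;   /* short read: loop again for the rest */
        } else if (r == 0) {
            return -1;           /* end of stream before len bytes */
        } else if (errno != EINTR) {
            return -1;           /* real error; errno is left set */
        }
        /* r < 0 with errno == EINTR: interrupted, simply retry */
    }
    return 0;
}
```

With a helper like this, the client would call read_full(sockfd, &block_count, 8) and read_full(sockfd, &block_size, 8) instead of a bare read(), so the 8-byte length fields get the same retry treatment the data loop already has. The same reasoning applies to write() on the sending side, which may also transfer fewer bytes than requested.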

Perhaps the read calls are returning without reading the full 8 bytes. I'd check what length they report they've read.

You might also find valgrind or strace informative for better understanding why your code is behaving this way. If you're getting short reads, strace will tell you what the syscalls returned, and valgrind will tell you that you're reading uninitialized bytes in your length variables.

The reason it works on the same machine is that block_size and block_count are sent as raw binary values, and when the client receives and interprets them, they have the same values.

However, if the two communicating machines use a different byte order for representing integers, e.g. x86 versus SPARC, or sizeof(int) differs, e.g. 64-bit versus 32-bit, then the code will not work correctly.

You need to verify that sizeof(int) and the byte order of both machines are identical. On the server side, print out sizeof(int) and the values of storage->length and p->blocksize. On the client side, print out sizeof(int) and the values of block_count and block_size.

When it doesn't work correctly, I think you will find that they are not the same. If this is true, then the contents of data are also going to be misinterpreted if they contain any binary data.
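If byte order does turn out to differ between the two hosts, one common way to sidestep it is to put the 64-bit lengths on the wire in a fixed order rather than writing the in-memory representation. The sketch below assumes a big-endian wire format; put_u64_be and get_u64_be are hypothetical helper names, not something from the original code.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");

/* Hypothetical helper: store v in buf as 8 bytes, most significant
 * byte first, regardless of the host's own byte order. */
void put_u64_be(unsigned char buf[8], uint64_t v)
{
    for (int i = 7; i >= 0; i--) {
        buf[i] = (unsigned char)(v & 0xff);
        v >>= 8;
    }
}

/* Hypothetical helper: rebuild the value from 8 big-endian bytes. */
uint64_t get_u64_be(const unsigned char buf[8])
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v = (v << 8) | buf[i];
    return v;
}
```

The server would fill an 8-byte buffer with put_u64_be and send that buffer, and the client would receive 8 bytes and pass them to get_u64_be. Because the helpers only use shifts and masks, they give the same result on little-endian and big-endian machines.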

