Unexpected behavior when reading from socket

I've written the following function, which reads an HTTP response from the server through a socket. I have no problems reading text pages, but when I try to read an image, the reading goes on without adding data to the buffer, even though read returns the correct number of bytes.

The function:

unsigned char *read_unknown_size(int fd) {
    int available_buf_size = 1000, tot_read = 0, curr_read_size;
    unsigned char *buf = calloc(available_buf_size, 1), *tmp_ptr;
    if (buf) {
        while ((curr_read_size = (int) read(fd, buf + tot_read, available_buf_size - tot_read)) != 0) {
            if (curr_read_size == -1) {
                perror("failed to read\n");
                //todo free mem
                exit(EXIT_FAILURE);
            } else {
                tot_read += curr_read_size;
                if (tot_read >= available_buf_size) { //the buffer is full
                    available_buf_size *= 2;
                    tmp_ptr = realloc(buf, available_buf_size + tot_read);
                    if (tmp_ptr) {
                        buf = tmp_ptr;
                        memset(buf+tot_read, 0, available_buf_size - tot_read);
                    }
                    else {
                        fprintf(stderr,"realloc failed\n");
                        exit(EXIT_FAILURE);
                    }
                }
            }
        }
    } else {
        fprintf(stderr,"calloc failed\n");
        exit(EXIT_FAILURE);
    }
    return buf;
}

The buffer after one read of size 1000:

0x563a819da130 "HTTP/1.1 200 OK\r\nDate: Tue, 23 Nov 2021 19:32:01 GMT\r\nServer: Apache\r\nUpgrade: h2,h2c\r\nConnection: Upgrade, close\r\nLast-Modified: Sat, 11 Jan 2014 01:32:55 GMT\r\nAccept-Ranges: bytes\r\nContent-Length: 3900\r\nCache-Control: max-age=2592000\r\nExpires: Thu, 23 Dec 2021 19:32:01 GMT\r\nContent-Type: image/jpeg\r\n\r\nGIF89", <incomplete sequence \375>

A total of 379 characters.

Edit: After reading the data, I write it to a new file. Text pages work fine, but I can't open the images.

I believe the program is working, but you are simply printing out the buffer up to the first NUL character by using printf("%s", buf) or similar. Keep in mind that image data is binary and contains NUL bytes; in the dump above, the output stops only a few bytes into the image.
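To make the difference concrete, here is a minimal sketch. The helper names visible_as_c_string and dump_bytes are made up for illustration and are not part of the question's code. The first reports how many bytes a C-string function such as strlen or printf("%s") would see, and the second writes the whole buffer using an explicit length.

```c
#include <stdio.h>
#include <string.h>

// Length that printf("%s") or strlen would see: the number of bytes
// before the first NUL, capped at the real size of the buffer.
size_t visible_as_c_string(const unsigned char *buf, size_t size) {
    const unsigned char *nul = memchr(buf, 0, size);
    return nul ? (size_t)(nul - buf) : size;
}

// Writes all `size` bytes to `out`, NUL bytes included.
// Returns 0 on success, -1 if fewer bytes were written.
int dump_bytes(FILE *out, const unsigned char *buf, size_t size) {
    return fwrite(buf, 1, size, out) == size ? 0 : -1;
}
```

For binary data, visible_as_c_string will usually return far less than the number of bytes read, which matches what the debugger shows. Anything that handles the response has to carry the byte count alongside the pointer.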

The problem is that the caller can't do anything useful because it has no way to know how much data is in the returned buffer. So, for the result of the function to be usable, it needs to return not just the buffer but also the number of bytes it read.

#include <stdlib.h>
#include <unistd.h>

// Reads until EOF is encountered.
// Returns 0 on success. The caller owns *buf_ptr (free it when done),
// and *total_read_ptr holds the number of bytes read.
// Returns -1 and sets errno on error.
int read_rest(int fd, unsigned char **buf_ptr, size_t *total_read_ptr) {
   unsigned char *buf        = NULL;
   size_t         buf_size   = 0;
   size_t         total_read = 0;

   while (1) {
      if ( total_read == buf_size ) {
         // Start from a non-zero size, then double. (Doubling 0 would never grow.)
         size_t new_size = buf_size ? buf_size * 2 : 1024;
         unsigned char *tmp = realloc(buf, new_size);
         if (!tmp)
            goto ERROR;

         buf      = tmp;
         buf_size = new_size;
      }

      ssize_t chunk_size = read(fd, buf + total_read, buf_size - total_read);
      if ( chunk_size < 0 )
         goto ERROR;

      if ( chunk_size == 0 ) {
         // EOF: shrink the buffer to the number of bytes actually read.
         if ( total_read > 0 ) {
            unsigned char *tmp = realloc(buf, total_read);
            if (tmp)
               buf = tmp;
         }

         *buf_ptr        = buf;
         *total_read_ptr = total_read;
         return 0;
      }

      total_read += chunk_size;
   }

ERROR:
   free(buf);
   *buf_ptr        = NULL;
   *total_read_ptr = 0;
   return -1;
}

Sample caller:

unsigned char *buf;
size_t         size;

if ( read_rest(fd, &buf, &size) == -1 ) {
   perror("Can't read from socket");
   exit(EXIT_FAILURE);
}

Now you have enough information to print out the contents of the buffer (e.g. using write).
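Since write may write fewer bytes than requested, a small loop is usually needed. The following is a sketch only: write_all is a hypothetical helper name, and retrying on EINTR is one reasonable policy rather than the only one.

```c
#include <errno.h>
#include <unistd.h>

// Writes exactly `size` bytes from `buf` to `fd`, retrying on short writes.
// Returns 0 on success. Returns -1 and sets errno on error.
int write_all(int fd, const unsigned char *buf, size_t size) {
    size_t total_written = 0;

    while (total_written < size) {
        ssize_t chunk_size = write(fd, buf + total_written, size - total_written);
        if (chunk_size < 0) {
            if (errno == EINTR)
                continue;   // Interrupted by a signal before any byte was written; retry.
            return -1;
        }
        total_written += (size_t)chunk_size;
    }

    return 0;
}
```

With the sample caller above, write_all(STDOUT_FILENO, buf, size) would emit the whole response, NUL bytes included, followed by free(buf) once the data is no longer needed.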


Comments on the original code:

  • Think very hard before using casts. Using (int)read(...) makes no sense: read returns a ssize_t, so store the result in a ssize_t variable instead of casting it to int.
  • It's best to include the actual error (as perror does) when an error occurs.
  • Printing out error messages is best done outside of the I/O function.
