Why do some servers not send CRLF after the final zero-length chunk?

Question

I'm working on an HTTP request tool (similar to cURL) and having an issue with the server response. Either that, or with my understanding of the HTTP/1.1 RFC and chunked data.

As I understand it, chunked data should be in this format:

4\r\n
Wiki\r\n
5\r\n
pedia\r\n
e\r\n
 in\r\n\r\nchunks.\r\n
0\r\n
\r\n
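For reference, here is a minimal sketch of an encoder that produces exactly this wire format. The name encode_chunked is hypothetical and not part of any library; it only illustrates the framing described above.

```ruby
# Encode a list of byte strings as an HTTP/1.1 chunked body.
# encode_chunked is a hypothetical helper name used for illustration only.
def encode_chunked(pieces)
  body = String.new
  pieces.each do |piece|
    next if piece.empty?                      # a zero-size chunk would end the body early
    body << piece.bytesize.to_s(16) << "\r\n" # chunk size in hex, then CRLF
    body << piece << "\r\n"                   # chunk data, then CRLF
  end
  body << "0\r\n"                             # last chunk (size zero)
  body << "\r\n"                              # blank line ending the empty trailer section
end

puts encode_chunked(["Wiki", "pedia", " in\r\n\r\nchunks."]).inspect
```

Note that the third piece is 14 bytes long, which is why its size line is e.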

What I'm actually seeing is the following:

4\r\n
Wiki\r\n
5\r\n
pedia\r\n
e\r\n
 in\r\n\r\nchunks.\r\n
0

In other words, the few servers I've tested with send no more data after the 0: not a CRLF, much less CRLFCRLF.

How are we supposed to know it's the end of the chunked data without the proper chunk framing? Timeouts happen while waiting for the CRLFs after the 0, and that's not sufficient.

Answer 1

Yes, it violates the standard. But we want to be compatible with as many HTTP servers and clients as possible, so we have to understand how the standard can end up being violated.

Chunked encoding is often used to stream content over HTTP/1.1. The standard requires the body to end with an additional CRLF. So you can run into pseudocode like the following:

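def stream(endpoint)
  # Pseudocode: more_data stands for whatever yields data read from the input socket
  Socket.open(endpoint) do |socket|
    sleep 10

    more_data do |data|
      print data.bytesize.to_s(16)  # chunk size in hex
      print "\r\n"
      print data
      print "\r\n"
    end
  end

  # Never reached if anything above raises
  print "\r\n"
end

But the right code is the following:

def stream(endpoint)
  # Pseudocode: more_data stands for whatever yields data read from the input socket
  Socket.open(endpoint) do |socket|
    sleep 10

    more_data do |data|
      print data.bytesize.to_s(16)  # chunk size in hex
      print "\r\n"
      print data
      print "\r\n"
    end
  end

ensure
  # Runs whether or not an exception was raised above
  print "\r\n"
end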

This means that if the input socket is interrupted, or any other exception is raised, the wrong version of the method never gets to print the additional CRLF to the output socket.
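To make the difference concrete, here is a small runnable sketch of the ensure variant. The names write_chunks and failing_source are made up for this example, and a StringIO stands in for the output socket; it is an illustration under those assumptions, not how any particular server does it.

```ruby
require "stringio"

# write_chunks is a hypothetical helper: it writes each piece as a chunk and
# always emits the terminating "0\r\n\r\n", even if the source raises.
def write_chunks(out, source)
  source.each do |piece|
    out << piece.bytesize.to_s(16) << "\r\n" << piece << "\r\n"
  end
ensure
  out << "0\r\n\r\n"
end

# A source that fails after yielding one piece, standing in for a broken input socket.
failing_source = Enumerator.new do |y|
  y << "Wiki"
  raise IOError, "input socket interrupted"
end

out = StringIO.new
begin
  write_chunks(out, failing_source)
rescue IOError
  # The error still propagates to the caller; only the terminator is guaranteed.
end
puts out.string.inspect
```

Without the ensure, the output would stop right after the first chunk and the client would be left waiting for more data.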

"How are we supposed to know it's the end of the chunked data without the proper chunk framing? Timeouts happen while waiting for the CRLFs after the 0, and that's not sufficient."

Many implementations ignore this violation because they don't need to know the size of the content. They just try to receive as much data as possible before the socket is closed.
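A lenient reader along those lines might look like the sketch below. The name decode_chunked_lenient is hypothetical, and the approach assumes the server closes the connection after the 0, so that the reader sees EOF; on a connection that stays open you would still need a timeout.

```ruby
require "stringio"

# decode_chunked_lenient is a hypothetical helper. It decodes a chunked body
# from an IO-like object and treats EOF right after the "0" size line as the
# end of the body instead of insisting on the missing CRLFs.
def decode_chunked_lenient(io)
  body = String.new
  loop do
    size_line = io.gets("\r\n")                 # nil at EOF
    raise EOFError, "stream ended before the last chunk" if size_line.nil?
    hex = size_line[/\A\h+/]                    # ignores any chunk extension
    raise IOError, "bad chunk size line" if hex.nil?
    size = hex.to_i(16)
    if size.zero?
      # Skip trailer lines up to the blank line; stop quietly at EOF.
      while (line = io.gets("\r\n"))
        break if line == "\r\n"
      end
      return body
    end
    data = io.read(size)
    raise EOFError, "truncated chunk" if data.nil? || data.bytesize < size
    body << data
    io.read(2)                                  # CRLF after the chunk data
  end
end

truncated = "4\r\nWiki\r\n5\r\npedia\r\ne\r\n in\r\n\r\nchunks.\r\n0"
puts decode_chunked_lenient(StringIO.new(truncated)).inspect
```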

Answer 2

Use Content-Length whenever the size is known; for a file download, checking the file size costs next to nothing in terms of resources. For chunked transfer, a client does not scan the message body for a CRLF pair. It first reads the specified number of bytes, and then reads two more bytes to confirm that they are CR and LF. If they're not, the message body is ill-formed: either the size was specified improperly or the data was otherwise corrupted.
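As a rough sketch of that read sequence (read_chunk is a hypothetical helper name, and a StringIO stands in for the socket):

```ruby
require "stringio"

# read_chunk is a hypothetical helper: it reads one chunk from an IO-like
# object and returns its data, or nil for the zero-size last chunk.
def read_chunk(io)
  size_line = io.gets("\r\n")
  unless size_line && size_line.end_with?("\r\n") && size_line.match?(/\A\h+/)
    raise IOError, "malformed chunk size line"
  end
  size = size_line[/\A\h+/].to_i(16)
  return nil if size.zero?                  # last chunk; any trailers follow

  data = io.read(size)                      # exactly `size` bytes, never scanned for CRLF
  raise IOError, "truncated chunk" if data.nil? || data.bytesize != size
  raise IOError, "chunk not followed by CRLF" unless io.read(2) == "\r\n"
  data
end

io = StringIO.new("e\r\n in\r\n\r\nchunks.\r\n0\r\n\r\n")
puts read_chunk(io).inspect
```

Because the data is read by length, the CRLF pairs inside the chunk itself are passed through untouched.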

For more information read the RFC, which says:

A server using chunked transfer-coding in a response MUST NOT use the trailer for any header fields unless at least one of the following is true:

a) the request included a TE header field that indicates "trailers" is acceptable in the transfer-coding of the response, as described in section 14.39; or,

b) the server is the origin server for the response, the trailer fields consist entirely of optional metadata, and the recipient could use the message (in a manner acceptable to the origin server) without receiving this metadata. In other words, the origin server is willing to accept the possibility that the trailer fields might be silently discarded along the path to the client.

How to determine the message body length:

If the message has a Transfer-Encoding header and chunked is the final encoding, the message body length is determined by reading and decoding the chunked data until the transfer coding indicates the data is complete.

If a response has a Transfer-Encoding header and chunked is not the final encoding, the message body length is determined by reading the connection until it is closed by the server.

If a request has a Transfer-Encoding header and chunked is not the final encoding, the message body length cannot be determined reliably; the server MUST respond with the 400 (Bad Request) status code and then close the connection.

If a message is received with both a Transfer-Encoding and a Content-Length header field, the Transfer-Encoding overrides the Content-Length. Such a message might indicate an attempt to perform request smuggling or response splitting and ought to be handled as an error. A sender MUST remove the received Content-Length field prior to forwarding such a message downstream.
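These rules can be sketched as a small decision function. The name body_framing, the lowercase-keyed headers hash, and the returned symbols are all assumptions made for this example. The Content-Length and no-header branches at the end go beyond the rules listed above and follow my reading of the same section of the RFC.

```ruby
# body_framing is a hypothetical helper; headers is assumed to be a Hash with
# lowercase field names. It returns a symbol, or [:content_length, n].
def body_framing(headers, response:)
  te = headers["transfer-encoding"]
  if te
    # Transfer-Encoding wins even if Content-Length is also present.
    codings = te.split(",").map { |c| c.strip.downcase }
    return :chunked if codings.last == "chunked"
    return response ? :until_close : :bad_request
  end

  # Not covered by the rules quoted above: plain Content-Length, or no framing header at all.
  length = headers["content-length"]
  return [:content_length, Integer(length, 10)] if length
  response ? :until_close : :no_body
end

p body_framing({ "transfer-encoding" => "gzip, chunked", "content-length" => "10" }, response: true)
p body_framing({ "transfer-encoding" => "gzip" }, response: false)
```

A caller that gets :bad_request for a request would answer with 400 and close the connection, as described above.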

Answer 1
1 2020-05-11 14:23:13

Answer 2
-1 2015-11-23 18:59:28