
Why is content_length in Net::HTTP.get_response sometimes nil even on good results?

I have the following Ruby code (I was trying to write a simple HTTP ping):

require 'net/http'
res1 = Net::HTTP.get_response 'www.google.com' , '/'
res2 = Net::HTTP.get_response 'www.google.com' , '/search?q=abc'

res1.code #200
res2.code #200
res1.content_length #5213
res2.content_length #nil **<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< WHY**
res2.body[0..60]
=> "<!doctype html><html itemscope=\"\" itemtype=\"http://schema.org"

Why does res2's content_length not show through? Is it in some other attribute of res2 (and how does one see those)?

I am a newcomer to Ruby, using irb 0.9.6 on AWS Linux.

Thanks a lot.

The value returned by content_length is not a measurement of the body. It is the value of the Content-Length header, which is only present when the server knows the length in advance and sends it. When the header is absent, content_length returns nil.

See the source for the implementation of HTTPHeader#content_length (taken from http://ruby-doc.org/stdlib-2.3.1/libdoc/net/http/rdoc/Net/HTTPHeader.html ):

# File net/http/header.rb, line 262
def content_length
  return nil unless key?('Content-Length')
  len = self['Content-Length'].slice(/\d+/) or
      raise Net::HTTPHeaderSyntaxError, 'wrong Content-Length format'
  len.to_i
end
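To see that behavior in isolation, here is a minimal sketch that needs no network access. It builds a response object by hand, which is done here only for illustration (normally Net::HTTP creates these objects for you), and reads content_length before and after the header is set.

```ruby
require 'net/http'

# Build a response by hand purely for illustration; its headers start out empty.
res = Net::HTTPOK.new('1.1', '200', 'OK')

# No Content-Length header yet, so the method has nothing to parse.
without_header = res.content_length

# Once the header exists, content_length parses its digits into an Integer.
res['Content-Length'] = '5213'
with_header = res.content_length

p without_header  # nil
p with_header     # 5213
```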

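What this most likely means here is that the response was sent with chunked transfer encoding (Transfer-Encoding: chunked). The server streams the body in pieces without knowing the total size up front, so it omits the Content-Length header and content_length returns nil.

What you most likely want in this case is body.length, since measuring the body you actually received is the only way to get the length of a response that carries no Content-Length header.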
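As for how to see the other attributes of a response: the headers can be read with res['Header-Name'] or iterated with each_header, and chunked? reports whether the Transfer-Encoding header says chunked, which is a common reason for a missing Content-Length. The sketch below wraps those calls in a helper; framing_summary is a hypothetical name, not part of Net::HTTP.

```ruby
require 'net/http'

# Hypothetical helper (not part of Net::HTTP): collect the headers that
# describe how the body length is communicated.
def framing_summary(res)
  {
    content_length: res.content_length,          # nil when the header is absent
    chunked: res.chunked?,                       # true for Transfer-Encoding: chunked
    transfer_encoding: res['Transfer-Encoding']  # raw header value, or nil
  }
end

# Usage against a live server (requires network access):
#   res = Net::HTTP.get_response('www.google.com', '/search?q=abc')
#   res.each_header { |name, value| puts "#{name}: #{value}" }
#   p framing_summary(res)
```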

Note that there may be performance implications in always using body.length to find the content length; you may choose to try the content_length approach first and, if it's nil, fall back to body.length.

Here's an example modification to your code:

require 'net/http'
res1 = Net::HTTP.get_response 'www.google.com' , '/'
res2 = Net::HTTP.get_response 'www.google.com' , '/search?q=abc'

res1.code #200
res2.code #200
res1.content_length #5213
res2.content_length.nil? ? res2.body.length : res2.content_length #57315  **<<<<<<<<<<<<<<< Works now **
res2.body[0..60]
=> "<!doctype html><html itemscope=\"\" itemtype=\"http://schema.org"

Or, better yet, store the result in a variable so that content_length is only called once:

res2_content_length = res2.content_length

if res2_content_length.nil?
  res2_content_length = res2.body.length
end
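The same fallback can be folded into a small helper. This is only a sketch: response_size is a hypothetical name, and it uses bytesize rather than length on the assumption that you want a byte count, which is what Content-Length measures (String#length counts characters, so the two can differ for multibyte text).

```ruby
require 'net/http'

# Hypothetical helper (not part of Net::HTTP): size of a response in bytes.
# Prefers the Content-Length header and falls back to measuring the body.
def response_size(res)
  # A nil body is treated as empty via to_s.
  res.content_length || res.body.to_s.bytesize
end

# Usage (requires network access):
#   res = Net::HTTP.get_response('www.google.com', '/search?q=abc')
#   puts response_size(res)
```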

Personally, I'd just stick with always checking body.length and deal with any potential performance issue if and when it arises.

This should reliably retrieve the actual length of the content for you, regardless of whether or not the server sent a Content-Length header.
