Why do Ruby Net::HTTP.get_response and Net::HTTP.new(uri.host).request return different things?
These two requests should have the same result, but the first one returns a 200 (OK) and the second one returns a 404 (Not Found). Why is that?
require 'net/http'
url = "http://readwrite.com/2013/12/04/google-compute-engine"
uri = URI(url)
Net::HTTP.get_response(uri)
#=> #<Net::HTTPOK 200 OK readbody=true>
Net::HTTP.new(uri.host).request(Net::HTTP::Get.new(url))
#=> #<Net::HTTPNotFound 404 Not Found readbody=true>
It happens only with some URLs, and I couldn't figure out the pattern. Here's another example: http://davidduchemin.com/2014/01/towards-mastery-again/
First, let's compare the two by viewing their actual HTTP requests with tcpdump, so we can get an idea of what may be happening:
tcpdump -vvASs 0 port 80 and host www.readwrite.com
# Net::HTTP.get_response(uri)
GET /2013/12/04/google-compute-engine HTTP/1.1
Accept-Encoding: gzip;q=1.0,deflate;q=0.6,identity;q=0.3
Accept: */*
User-Agent: Ruby
Host: readwrite.com

# Net::HTTP.new(uri.host).request(Net::HTTP::Get.new(url))
GET http://readwrite.com/2013/12/04/google-compute-engine HTTP/1.1
Accept-Encoding: gzip;q=1.0,deflate;q=0.6,identity;q=0.3
Accept: */*
User-Agent: Ruby
Connection: close
Host: readwrite.com
We can see that the second request is incorrectly requesting the full URL (with hostname) as the path. This is because you're passing the string url to Net::HTTP::Get.new, which causes Net::HTTP::Get.new(url).path to be just what we see above: the full URL with hostname. Servers differ in how they treat an absolute URL on the request line, which is presumably why only some sites answer with a 404. Instead, pass the URI instance (uri) to Net::HTTP::Get.new:
Net::HTTP.new(uri.host).request(Net::HTTP::Get.new(uri))
#=> #<Net::HTTPOK 200 OK readbody=true>
And its tcpdump is now effectively the same as the first one's:
GET /2013/12/04/google-compute-engine HTTP/1.1
Accept-Encoding: gzip;q=1.0,deflate;q=0.6,identity;q=0.3
Accept: */*
User-Agent: Ruby
Host: readwrite.com
Connection: close
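The difference can also be checked without tcpdump or any network traffic by inspecting the request object's path directly. The following is a minimal sketch, not a definitive implementation: request_path is a hypothetical helper introduced only for illustration, and it assumes a Ruby version (2.0 or later) in which Net::HTTP::Get.new accepts a URI instance.

```ruby
require 'net/http'

# Hypothetical helper (not part of Net::HTTP): returns the request target
# that Net::HTTP would put on the request line for the given argument.
def request_path(target)
  Net::HTTP::Get.new(target).path
end

url = "http://readwrite.com/2013/12/04/google-compute-engine"
uri = URI(url)

puts request_path(url)              # String: used verbatim, so the full URL
puts request_path(uri)              # URI: only the path (plus any query string)
puts request_path(uri.request_uri)  # String holding just the path; same as above
```

The last line shows an alternative fix: passing uri.request_uri keeps the argument a String while still sending only the path, which may be useful on older Ruby versions where Net::HTTP::Get.new does not accept a URI.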