Getting HTML and headers from a single request with Python

Question

I'm investigating whether a single HTTP request made from Python can retrieve both the HTML and the HTTP header information, instead of having to make two separate calls.

Does anyone know of a good way to do this?

Also, what are the performance differences between the various ways of making these requests, e.g. urllib2, HTTPConnection, etc.?

Answer 1

Just use urllib2.urlopen(). The HTML can be retrieved by calling the read() method of the returned object, and the headers are available in its headers attribute.

>>> import urllib2
>>> f = urllib2.urlopen('http://www.google.com')  # one request; f carries both headers and body

>>> print f.headers
Date: Fri, 08 Jun 2012 12:57:25 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
Server: gws
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Connection: close

>>> print f.read()
<!doctype html><html itemscope itemtype="http://schema.org/WebPage"><head><meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">
... etc ...
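
The snippet above is Python 2. For readers on Python 3, where urllib2 was folded into urllib.request, a rough equivalent might look like the sketch below. The function name fetch and the timeout default are illustrative choices, not part of the original answer.

```python
from urllib.request import urlopen


def fetch(url, timeout=10):
    """Return (status, headers, body) obtained from one GET request.

    Hypothetical helper for illustration; the name and the timeout
    default are assumptions, not part of the original answer.
    """
    # urlopen() sends a single request. The returned response object
    # exposes the parsed headers and the body of that same response.
    with urlopen(url, timeout=timeout) as response:
        status = response.status
        # dict() keeps only one value per header name, so repeated
        # headers such as Set-Cookie are collapsed here.
        headers = dict(response.headers.items())
        body = response.read()  # raw bytes; decode using the charset header
    return status, headers, body
```

Calling fetch('http://www.google.com') would then return the status code, a header dictionary, and the page bytes, all taken from one round trip.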

Answer 2

If you use HTTPResponse, you can get the headers and the content with two function calls, but that does not make two trips to the server.
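
As a minimal sketch of that idea, assuming Python 3 (where the httplib module is named http.client): one request is sent, and the headers and body are then read from the same HTTPResponse object. The function name fetch_with_http_client and its parameter defaults are hypothetical.

```python
from http.client import HTTPConnection


def fetch_with_http_client(host, port=80, path="/", timeout=10):
    """Return (status, headers, body) using one request on one connection.

    Hypothetical helper for illustration; the name and defaults are
    assumptions, not part of the original answer.
    """
    conn = HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)        # the only request that is sent
        response = conn.getresponse()    # an http.client.HTTPResponse
        headers = response.getheaders()  # list of (name, value) tuples
        body = response.read()           # body of the same response, as bytes
        return response.status, headers, body
    finally:
        conn.close()
```

Here getheaders() and read() are the two function calls. Both operate on the response that has already been received, so no second request goes out.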

2 answers

Answer 1
3, accepted, 2012-06-08 13:05:30

Answer 2
1, 2012-06-08 12:32:40
