What is the quickest way to HTTP GET in Python?

Question

If I know the content will be a string, what is the quickest way to HTTP GET in Python? I am searching the documentation for a quick one-liner like:

contents = url.get("http://example.com/foo/bar")

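But all I can find using Google are httplib and urllib, and I am unable to find a shortcut in those libraries.

Does standard Python 2.5 have a shortcut in some form like the above, or should I write a function url_get?

I would prefer not to capture the output of shelling out to wget or curl.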

Answer 1

Python 3:

import urllib.request
contents = urllib.request.urlopen("http://example.com/foo/bar").read()

Python 2:

import urllib2
contents = urllib2.urlopen("http://example.com/foo/bar").read()

See the documentation for urllib.request and read.
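
On Python 3, read() returns bytes rather than str, so if you want a string you still have to decode. Here is a minimal sketch of the url_get helper mentioned in the question. The default_charset and timeout parameters are my own assumptions, not part of any library API:

```python
import urllib.request


def url_get(url, default_charset="utf-8", timeout=10):
    """Fetch url and return the response body decoded to str."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        raw = resp.read()  # bytes on Python 3
        # Use the charset from the Content-Type header when one is sent;
        # falling back to default_charset is an assumption of this sketch.
        charset = resp.headers.get_content_charset() or default_charset
    return raw.decode(charset)
```

Usage would then be contents = url_get("http://example.com/foo/bar"), which is about as close to the one-liner from the question as the standard library gets.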

Answer 2

You can use a library called requests.

import requests
r = requests.get("http://example.com/foo/bar")

It is that easy. You can then do:

>>> print(r.status_code)
>>> print(r.headers)
>>> print(r.content)

Answer 3

If you want the httplib2 solution to be a one-liner, consider instantiating an anonymous Http object:

import httplib2
resp, content = httplib2.Http().request("http://example.com/foo/bar")

Answer 4

Have a look at httplib2, which, next to a lot of very useful features, provides exactly what you want.

import httplib2

resp, content = httplib2.Http().request("http://example.com/foo/bar")

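Here content is the response body (as a string), and resp contains the status and response headers.

It is not included in a standard Python install (though it only requires standard Python), but it is definitely worth checking out.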

Answer 5

It is simple enough with the powerful urllib3 library.

Import it like this:

import urllib3

http = urllib3.PoolManager()

And make a request like this:

response = http.request('GET', 'https://example.com')

print(response.data) # Raw data.
print(response.data.decode('utf-8')) # Text.
print(response.status) # Status code.
print(response.headers['Content-Type']) # Content type.

You can add headers too:

response = http.request('GET', 'https://example.com', headers={
    'key1': 'value1',
    'key2': 'value2'
})

More information can be found in the urllib3 documentation.

urllib3 is much safer and easier to use than the built-in urllib.request or http modules, and it is stable.

Answer 6

With no further imports needed, this solution works (for me), including over https:

try:
    import urllib2 as urlreq # Python 2.x
except ImportError:
    import urllib.request as urlreq # Python 3.x
req = urlreq.Request("http://example.com/foo/bar")
req.add_header('User-Agent', 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36')
urlreq.urlopen(req).read()

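I often have trouble fetching content when no "User-Agent" is specified in the request headers. The request is then usually rejected with something like: urllib2.HTTPError: HTTP Error 403: Forbidden or urllib.error.HTTPError: HTTP Error 403: Forbidden.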
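
To make the 403 behaviour concrete, here is a rough Python 3 sketch that reproduces it against a throwaway local server. PickyHandler, fetch, demo and the "my-script/0.1" agent string are all hypothetical names for illustration. The server only mimics sites that reject the default Python-urllib agent; real sites may apply different rules.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class PickyHandler(BaseHTTPRequestHandler):
    """Hypothetical test server that rejects urllib's default User-Agent."""

    def do_GET(self):
        agent = self.headers.get("User-Agent", "")
        if agent.startswith("Python-urllib"):
            self.send_error(403)
            return
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch(url, user_agent=None):
    """Return (status, body); on an HTTP error return (error code, None)."""
    req = urllib.request.Request(url)
    if user_agent is not None:
        req.add_header("User-Agent", user_agent)
    # An empty ProxyHandler makes the opener ignore proxy environment variables.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(req, timeout=10) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        return err.code, None


def demo():
    """Fetch once with the default agent and once with a custom one."""
    server = HTTPServer(("127.0.0.1", 0), PickyHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = "http://127.0.0.1:%d/" % server.server_port
    try:
        return fetch(url), fetch(url, user_agent="my-script/0.1")
    finally:
        server.shutdown()
        server.server_close()
```

Catching urllib.error.HTTPError lets the script decide what to do with a 403 instead of dying with a traceback.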

Answer 7

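theller's wget solution is really useful; however, I found that it does not print the progress throughout the download. It is perfect if you add one line after the print statement in reporthook.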

import sys, urllib

def reporthook(a, b, c):
    print "% 3.1f%% of %d bytes\r" % (min(100, float(a * b) / c * 100), c),
    sys.stdout.flush()
for url in sys.argv[1:]:
    i = url.rfind("/")
    file = url[i+1:]
    print url, "->", file
    urllib.urlretrieve(url, file, reporthook)
print
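
For Python 3 readers, here is a hedged sketch of the same hook. urlretrieve calls it with the block count, the block size and the total size, and the total is -1 when the server reports no size, which the script above does not handle. format_progress is a helper name I made up for this sketch:

```python
import sys


def format_progress(block_count, block_size, total_size):
    """Build the progress text from urlretrieve's three hook arguments."""
    done = block_count * block_size
    if total_size <= 0:
        # urlretrieve passes -1 when the server does not report a size.
        return "%d bytes" % done
    percent = min(100.0, done * 100.0 / total_size)
    return "%5.1f%% of %d bytes" % (percent, total_size)


def reporthook(block_count, block_size, total_size):
    # "\r" moves back to the start of the line; flush() forces the
    # partial line out so the progress is visible during the download.
    sys.stdout.write("\r" + format_progress(block_count, block_size, total_size))
    sys.stdout.flush()
```

It would be passed as the third argument, as in urllib.request.urlretrieve(url, file, reporthook).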

Answer 8

How to also send headers

Python 3:

import urllib.request
contents = urllib.request.urlopen(urllib.request.Request(
    "https://api.github.com/repos/cirosantilli/linux-kernel-module-cheat/releases/latest",
    headers={"Accept": "application/vnd.github.full+json"}
)).read()
print(contents)

Python 2:

import urllib2
contents = urllib2.urlopen(urllib2.Request(
    "https://api.github.com",
    headers={"Accept": "application/vnd.github.full+json"}
)).read()
print(contents)
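
If you want to check what will be sent before making the request, the Request object can be inspected without touching the network. This is a small Python 3 sketch. The header values are placeholders I picked, not anything the GitHub API requires:

```python
import urllib.request

req = urllib.request.Request(
    "https://api.github.com",
    headers={"Accept": "application/json", "User-Agent": "my-script/0.1"},
)

accept = req.get_header("Accept")
# urllib stores header names via str.capitalize(), so the key
# "User-Agent" is kept as "User-agent" and must be looked up that way.
agent = req.get_header("User-agent")
method = req.get_method()  # "GET", because no request body was supplied
```

The capitalization quirk only affects lookups on the Request object. HTTP header names are case-insensitive on the wire.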

Answer 9

Here is a wget script in Python:

# From python cookbook, 2nd edition, page 487
import sys, urllib

def reporthook(a, b, c):
    print "% 3.1f%% of %d bytes\r" % (min(100, float(a * b) / c * 100), c),
for url in sys.argv[1:]:
    i = url.rfind("/")
    file = url[i+1:]
    print url, "->", file
    urllib.urlretrieve(url, file, reporthook)
print

Answer 10

Actually, in Python we can read HTTP responses like files. Here is an example that reads JSON from an API.

import json
from urllib.request import urlopen

def fetch_some_key(url):
    # urlopen returns a file-like object, so json.load can read it directly
    with urlopen(url) as f:
        resp = json.load(f)
    return resp['some_key']
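
Because json.load only needs an object with a read() method, the parsing step can be tried without any network access. In this sketch io.BytesIO stands in for the object returned by urlopen. read_key and the sample payload are made up for illustration:

```python
import io
import json


def read_key(fileobj, key):
    """Parse JSON from any object with a read() method and return one key."""
    return json.load(fileobj)[key]


# io.BytesIO stands in for the response object returned by urlopen().
fake_response = io.BytesIO(b'{"some_key": 42}')
value = read_key(fake_response, "some_key")
```

With a real request, the same function would be called as read_key(f, "some_key") inside the with urlopen(url) as f: block.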

Answer 11

If you are working with HTTP APIs specifically, there are also more convenient choices such as Nap.

For example, here is how to get gists from GitHub since May 1st, 2014:

from nap.url import Url
api = Url('https://api.github.com')

gists = api.join('gists')
response = gists.get(params={'since': '2014-05-01T00:00:00Z'})
print(response.json())

More examples: https://github.com/kimmobrunfeldt/nap#examples

Answer 12

Excellent solutions, Xuan and theller.

To make it work with Python 3, make the following changes:

import sys, urllib.request

def reporthook(a, b, c):
    # end="" keeps the cursor on the same line so "\r" can overwrite it
    print("% 3.1f%% of %d bytes\r" % (min(100, float(a * b) / c * 100), c), end="")
    sys.stdout.flush()
for url in sys.argv[1:]:
    i = url.rfind("/")
    file = url[i+1:]
    print(url, "->", file)
    urllib.request.urlretrieve(url, file, reporthook)
print()

Also, the URL you enter should start with "http://"; otherwise it raises an "unknown url type" error.
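
The error can be seen without any network access, because urllib.request.Request already rejects a URL without a scheme when the object is built. Here is a small sketch. with_scheme and check are hypothetical helpers, and defaulting to "http" is just an assumption:

```python
import re
import urllib.request


def with_scheme(url, default_scheme="http"):
    """Prepend default_scheme:// when url does not start with a scheme."""
    if re.match(r"^[A-Za-z][A-Za-z0-9+.-]*://", url):
        return url
    return "%s://%s" % (default_scheme, url)


def check(url):
    """Return None if urllib accepts the URL, else the error message."""
    try:
        urllib.request.Request(url)  # no request is sent here
    except ValueError as exc:
        return str(exc)
    return None
```

Calling check("example.com/foo") gives back the "unknown url type" message, while check(with_scheme("example.com/foo")) returns None.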

Answer 13

For Python >= 3.6, you can use dload:

import dload
t = dload.text(url)

For JSON:

j = dload.json(url)

To install:
pip install dload

Answer 14

If you want a lower-level API:

import http.client

conn = http.client.HTTPSConnection('example.com')
conn.request('GET', '/')

resp = conn.getresponse()
content = resp.read()

conn.close()

text = content.decode('utf-8')

print(text)
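
At this level it is easy to forget to close the connection when something raises, so here is a sketch that wraps the same calls in try/finally and also returns the status. http_get, HelloHandler and demo are names I made up. The throwaway local server is only there so the sketch can run without internet access, and it uses plain HTTPConnection rather than HTTPSConnection for that reason:

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def http_get(host, port, path="/", timeout=10):
    """Return (status, reason, text) for a plain-HTTP GET request."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        body = resp.read()
        # Assumption: fall back to UTF-8 when the server names no charset.
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.status, resp.reason, body.decode(charset)
    finally:
        conn.close()  # runs even if the request or the decode fails


class HelloHandler(BaseHTTPRequestHandler):
    """Throwaway local server used only to exercise http_get."""

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def demo():
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return http_get("127.0.0.1", server.server_port)
    finally:
        server.shutdown()
        server.server_close()
```

Unlike urlopen, http.client does not raise on 4xx or 5xx responses, so the caller has to check the returned status.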

14 answers

Answer 1: 931, accepted, 2009-03-14 03:48:24
Answer 2: 443
Answer 3: 29, 2009-03-14 16:40:06
Answer 4: 19
Answer 5: 10, 2019-02-24 21:18:23
Answer 6: 6, 2018-01-01 15:11:48
Answer 7: 5, 2010-01-05 01:21:33
Answer 8: 4, 2018-09-16 06:22:04
Answer 9: 4, 2009-03-14 16:47:32
Answer 10: 3, 2019-12-10 12:13:36
Answer 11: 2, 2014-05-22 17:08:22
Answer 12: 2, 2015-06-24 14:18:05
Answer 13: 1, 2020-02-29 23:02:00
Answer 14: 1, 2020-03-06 16:26:21