Asynchronous HTTP calls in Python

Question

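I need callback-style functionality in Python: I send many requests to a web service, changing the parameters each time. I want these requests to run concurrently rather than sequentially, so I want to call the function asynchronously.

It looks like asyncore is what I might want to use, but the examples I have seen of how it works all look like overkill, so I am wondering whether there is another path I should go down. Any suggestions on modules or process? Ideally I would like to use these in a procedural fashion rather than creating classes, but I may not be able to get around that.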

Answer 1

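Starting in Python 3.2, you can use concurrent.futures to launch parallel tasks.

Check out this ThreadPoolExecutor example:

http://docs.python.org/dev/library/concurrent.futures.html#threadpoolexecutor-example

It spawns threads to retrieve the HTML and acts on each response as it is received.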

import concurrent.futures
import urllib.request

URLS = ['http://www.foxnews.com/',
        'http://www.cnn.com/',
        'http://europe.wsj.com/',
        'http://www.bbc.co.uk/',
        'http://some-made-up-domain.com/']

# Retrieve a single page and return its contents as bytes
def load_url(url, timeout):
    # read() with no argument returns the entire response body
    with urllib.request.urlopen(url, timeout=timeout) as conn:
        return conn.read()

# We can use a with statement to ensure threads are cleaned up promptly
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    # Start the load operations and mark each future with its URL
    future_to_url = {executor.submit(load_url, url, 60): url for url in URLS}
    for future in concurrent.futures.as_completed(future_to_url):
        url = future_to_url[future]
        try:
            data = future.result()
        except Exception as exc:
            print('%r generated an exception: %s' % (url, exc))
        else:
            print('%r page is %d bytes' % (url, len(data)))

The example above uses threads. There is also a similar ProcessPoolExecutor that uses a pool of processes rather than threads:

http://docs.python.org/dev/library/concurrent.futures.html#processpoolexecutor-example

import concurrent.futures
import urllib.request

URLS = ['http://www.foxnews.com/',
        'http://www.cnn.com/',
        'http://europe.wsj.com/',
        'http://www.bbc.co.uk/',
        'http://some-made-up-domain.com/']

# Retrieve a single page and return its contents as bytes
def load_url(url, timeout):
    # read() with no argument returns the entire response body
    with urllib.request.urlopen(url, timeout=timeout) as conn:
        return conn.read()

def main():
    # The with statement ensures the worker processes are cleaned up promptly
    with concurrent.futures.ProcessPoolExecutor(max_workers=5) as executor:
        # Start the load operations and mark each future with its URL
        future_to_url = {executor.submit(load_url, url, 60): url for url in URLS}
        for future in concurrent.futures.as_completed(future_to_url):
            url = future_to_url[future]
            try:
                data = future.result()
            except Exception as exc:
                print('%r generated an exception: %s' % (url, exc))
            else:
                print('%r page is %d bytes' % (url, len(data)))

# The guard keeps worker processes from re-running main() when they import this module
if __name__ == '__main__':
    main()
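The question asks for callback-style handling with a different parameter on each request. As a minimal sketch on top of the thread pool shown earlier in this answer, Future.add_done_callback can invoke a function as each request finishes. The names fetch_all, _notify, fetch and callback are hypothetical, and fetch stands in for whatever function actually performs the web service request.

```python
import concurrent.futures
import functools


def _notify(future, param, callback):
    # Runs once the future has finished, normally on the worker thread that
    # completed it, so callback should be quick and thread-safe.
    error = future.exception()
    if error is not None:
        callback(param, None, error)
    else:
        callback(param, future.result(), None)


def fetch_all(fetch, params, callback, max_workers=5):
    """Call fetch(param) for each param on a thread pool.

    callback(param, result, error) is invoked as each call finishes;
    exactly one of result and error is meaningful. fetch and callback
    are hypothetical names supplied by the caller.
    """
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        for param in params:
            future = executor.submit(fetch, param)
            # partial binds this request's parameter to its own callback
            future.add_done_callback(
                functools.partial(_notify, param=param, callback=callback))
    # Leaving the with block waits for every call and its callback to finish.
```

For real use, fetch could be a small function that builds the request URL from param and returns urllib.request.urlopen(...).read(); everything stays procedural, with no classes to define.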

Answer 2

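Do you know about eventlet? It lets you write code that looks synchronous but runs asynchronously over the network.

Here is an example of a super minimal crawler: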

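urls = ["http://www.google.com/intl/en_ALL/images/logo.gif",
     "https://wiki.secondlife.com/w/images/secondlife.jpg",
     "http://us.i1.yimg.com/us.yimg.com/i/ww/beta/y3.gif"]

import eventlet
# Python 3 location of the green urlopen (the Python 2 original used eventlet.green.urllib2)
from eventlet.green.urllib.request import urlopen

def fetch(url):
  # Looks like a blocking call, but yields to other green threads while waiting
  return urlopen(url).read()

pool = eventlet.GreenPool()

# imap fetches concurrently and yields the bodies in the order of urls
for body in pool.imap(fetch, urls):
  print("got body", len(body))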

Answer 3

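The Twisted framework is just the ticket for that. But if you don't want to take that on, you might also use pycurl, a wrapper for libcurl, which has its own async event loop and supports callbacks.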

Answer 4

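(Although this thread is about server-side Python, the question was asked a while back, and others may stumble upon it while looking for a similar answer on the client side.)

For a client-side solution, you may want to take a look at the Async.js library, especially the "Control-Flow" section.

https://github.com/caolan/async#control-flow

By combining "parallel" with "waterfall" you can achieve the result you want.

waterfall(parallel(taskA, taskB, taskC) -> PostParallelTask)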
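For readers staying in Python, the same shape (run several tasks in parallel, then hand their results to a follow-up task) can be sketched with concurrent.futures. This is only an analogy to the Async.js pattern above, not its API; parallel_then, tasks and post_task are hypothetical names.

```python
import concurrent.futures


def parallel_then(tasks, post_task, max_workers=None):
    """Run zero-argument callables in parallel, then pass their results
    (in the same order as tasks) to post_task and return its result."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(task) for task in tasks]
        # result() blocks until that task is done and re-raises its exception
        results = [future.result() for future in futures]
    return post_task(results)
```

For example, parallel_then([task_a, task_b, task_c], combine) would call combine with a three-element list once all three tasks have finished.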

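If you look at the example under Control-Flow - "Auto", it gives you an instance of the above: https://github.com/caolan/async#autotasks-callback where "write-file" depends on "get_data" and "make_folder", and "email_link" depends on "write-file".

Note that all of this happens on the client side (unless you are running Node.js on the server side).

For server-side Python, check out PyCURL @ https://github.com/pycurl/pycurl/blob/master/examples/basicfirst.py

By combining the example below with PyCURL, you can achieve non-blocking multi-threaded functionality.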

4 answers

Answer 1
18 2011-02-10 23:32:20

Answer 2
16 2011-02-11 18:54:03

Answer 3
8 accepted 2011-02-10 21:55:31

Answer 4
-1 2014-02-06 20:57:21