Tornado memory leak on dropped connections

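I have a setup where Tornado is used as a kind of pass-through to workers. Tornado receives a request, forwards it to N workers, aggregates the results, and sends them back to the client. This works fine, except when a timeout occurs for some reason; then I get a memory leak.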

My setup is similar to this pseudocode:

import json

import tornado.httpclient
import tornado.ioloop
import tornado.web

workers = ["http://worker1.example.com:1234/",
           "http://worker2.example.com:1234/",
           "http://worker3.example.com:1234/"]  # ... and so on

class MyHandler(tornado.web.RequestHandler):
    @tornado.web.asynchronous
    def post(self):
        responses = []

        def __callback(response):
            responses.append(response)
            if len(responses) == len(workers):
                self._finish_req(responses)

        for url in workers:
            async_client = tornado.httpclient.AsyncHTTPClient()
            # Forward the incoming request body to each worker.
            request = tornado.httpclient.HTTPRequest(url, method=self.request.method, body=self.request.body)
            async_client.fetch(request, __callback)

    def _finish_req(self, responses):
        good_responses = [r for r in responses if not r.error]
        if not good_responses:
            raise tornado.web.HTTPError(500, "\n".join(str(r.error) for r in responses))
        results = aggregate_results(good_responses)
        self.set_header("Content-Type", "application/json")
        self.write(json.dumps(results))
        self.finish()

application = tornado.web.Application([
    (r"/", MyHandler),
])

if __name__ == "__main__":
    ##.. some locking code 
    application.listen()
    tornado.ioloop.IOLoop.instance().start()

What am I doing wrong? Where does the memory leak come from?

I don't know the source of the problem, and it seems the gc should take care of this, but there are two things you can try.

The first is to simplify some of the references (it looks like there are still references to responses when the RequestHandler finishes):

class MyHandler(tornado.web.RequestHandler):
    @tornado.web.asynchronous
    def post(self):
        self.responses = []

        for url in workers:
            async_client = tornado.httpclient.AsyncHTTPClient()
            # Forward the incoming request body to each worker.
            request = tornado.httpclient.HTTPRequest(url, method=self.request.method, body=self.request.body)
            async_client.fetch(request, self._handle_worker_response)

    def _handle_worker_response(self, response):
        self.responses.append(response)
        if len(self.responses) == len(workers):
            self._finish_req()

    def _finish_req(self):
        ....
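
To make the point about lingering references concrete, here is a minimal standard-library sketch. It is not Tornado code: FakeClient and Handler are hypothetical stand-ins, and the assumption is only that a long-lived client keeps the callbacks of requests that never completed. While such a callback is held, everything its closure captures (including the handler) stays reachable.

```python
import gc
import weakref


class FakeClient:
    """Hypothetical long-lived client that stores callbacks of pending requests."""

    def __init__(self):
        self.pending = []

    def fetch(self, url, callback):
        # The request never completes here, so the callback is simply retained.
        self.pending.append(callback)


class Handler:
    """Hypothetical handler whose nested callback closes over self."""

    def post(self, client, urls):
        responses = []

        def callback(response):
            responses.append(response)
            if len(responses) == len(urls):
                self.done = True  # the closure captures the handler

        for url in urls:
            client.fetch(url, callback)


client = FakeClient()
handler = Handler()
handler.post(client, ["http://worker1.example.com:1234/"])
handler_ref = weakref.ref(handler)

# Dropping our own reference is not enough while the callback is pending.
del handler
gc.collect()
alive_while_pending = handler_ref() is not None

# Once the client releases the callback, the handler can be freed.
client.pending.clear()
gc.collect()
alive_after_release = handler_ref() is not None
```

If something like this is happening, gc.collect() alone will not help, because the objects are still reachable; the pending callbacks themselves have to be released.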

If that doesn't work, you can always invoke garbage collection manually:

import gc
class MyHandler(tornado.web.RequestHandler):
    @tornado.web.asynchronous
    def post(self):
        ....

    def _finish_req(self):
        ....

    def on_connection_close(self):
        gc.collect()
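
Before forcing collection, it may help to check whether handler objects are really piling up. The sketch below uses only the standard gc module; LeakyHandler and count_instances are hypothetical names introduced for illustration. It counts live instances of a class before and after gc.collect(), using a deliberate reference cycle as the test case.

```python
import gc


class LeakyHandler:
    """Hypothetical stand-in for a request handler (not a Tornado class)."""


def count_instances(cls):
    """Count objects of exactly this class that the collector is tracking."""
    return sum(1 for obj in gc.get_objects() if type(obj) is cls)


gc.disable()  # keep automatic collection from running mid-demo
try:
    handler = LeakyHandler()
    handler.me = handler  # reference cycle: refcounting alone cannot free it
    del handler
    before = count_instances(LeakyHandler)
    gc.collect()
    after = count_instances(LeakyHandler)
finally:
    gc.enable()
```

If the count keeps growing even after gc.collect(), the objects are still referenced from somewhere, and manual collection will not fix the leak.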

The code looks fine. The leak is probably inside Tornado.

I just stumbled over this line:

async_client = tornado.httpclient.AsyncHTTPClient()

Are you aware of the instantiation magic in this constructor? From the docs:

"""
The constructor for this class is magic in several respects:  It actually
creates an instance of an implementation-specific subclass, and instances
are reused as a kind of pseudo-singleton (one per IOLoop).  The keyword
argument force_instance=True can be used to suppress this singleton
behavior.  Constructor arguments other than io_loop and force_instance
are deprecated.  The implementation subclass as well as arguments to
its constructor can be set with the static method configure()
"""
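
The pseudo-singleton behaviour described in that docstring can be pictured with a small standard-library sketch. This is an illustration of the pattern only, not Tornado's implementation: PerLoopClient and Loop are hypothetical names, and only the force_instance keyword is borrowed from the docstring.

```python
import weakref


class Loop:
    """Hypothetical stand-in for an event loop object."""


class PerLoopClient:
    """Hypothetical client that hands out one shared instance per loop."""

    # Weak keys, so a cached client does not keep its loop alive.
    _instances = weakref.WeakKeyDictionary()

    def __new__(cls, loop, force_instance=False):
        if force_instance:
            return super().__new__(cls)  # bypass the cache entirely
        try:
            return cls._instances[loop]
        except KeyError:
            instance = super().__new__(cls)
            cls._instances[loop] = instance
            return instance

    def __init__(self, loop, force_instance=False):
        # Nothing to set up; defined so the constructor arguments are accepted.
        pass


loop = Loop()
a = PerLoopClient(loop)
b = PerLoopClient(loop)
c = PerLoopClient(loop, force_instance=True)
same_instance = a is b
fresh_instance = c is not a
```

With a constructor like this, calling it repeatedly inside a loop returns the same object each time.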

So you don't actually need to do this inside the loop. (On the other hand, it shouldn't do any harm.) But which implementation are you using, CurlAsyncHTTPClient or SimpleAsyncHTTPClient?

If it is SimpleAsyncHTTPClient, note this comment in the code:

"""
This class has not been tested extensively in production and
should be considered somewhat experimental as of the release of
tornado 1.2. 
"""

You could try switching to CurlAsyncHTTPClient. Or follow Nikolay Fominyh's suggestion and trace the calls to __callback().
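
One way to trace the callback, sketched with the standard library only: wrap it so that every invocation is recorded, then compare the number of recorded calls with the number of workers contacted. The names traced, on_response, and calls are hypothetical, and the strings stand in for real response objects.

```python
import functools


def traced(callback, log):
    """Wrap callback so that each invocation is appended to log (a list)."""

    @functools.wraps(callback)
    def wrapper(*args, **kwargs):
        log.append((callback.__name__, args))
        return callback(*args, **kwargs)

    return wrapper


calls = []
responses = []


def on_response(response):
    responses.append(response)


wrapped = traced(on_response, calls)

expected = 3      # number of workers contacted
wrapped("r1")     # only two of the three callbacks ever fire
wrapped("r2")
missing = expected - len(calls)
```

If missing stays above zero after a timeout, some callbacks never ran, so the handler is never finished and whatever those callbacks reference stays alive.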
