Twisted: Does using connectProtocol to connect an endpoint cause a memory leak?

I am trying to build a server. Besides accepting connections from clients as a normal server does, my server also connects to another server as a client.

I've set up the protocol and endpoint as below:

p = FooProtocol()
client = TCP4ClientEndpoint(reactor, '127.0.0.1' , 8080) # without ClientFactory

Then, after calling reactor.run(), the server listens for and accepts new socket connections. When a new connection is made (in connectionMade), the server calls connectProtocol(client, p), as in the pseudocode below:

while server accepts a new socket:
    connectProtocol(client, p)
    # client.connect(foo_client_factory)    --> connecting this way doesn't
    #                                            cause a memory leak

As these client connections are made, memory is gradually consumed (explicitly calling gc doesn't help).

Am I using Twisted the wrong way?

-----UPDATE-----

My test program: the server waits for clients to connect. When a client connects, the server creates 50 connections to another server.

Here is the code:

#! /usr/bin/env python

import sys
import gc

from twisted.internet import protocol, reactor, defer, endpoints
from twisted.internet.endpoints import TCP4ClientEndpoint, connectProtocol

class MyClientProtocol(protocol.Protocol):
    def connectionMade(self):
        self.transport.loseConnection()

class MyClientFactory(protocol.ClientFactory):
    def buildProtocol(self, addr):
        p = MyClientProtocol()
        return p

class ServerFactory(protocol.Factory):
    def buildProtocol(self, addr):
        p = ServerProtocol()
        return p

client_factory = MyClientFactory() # global
client_endpoint = TCP4ClientEndpoint(reactor, '127.0.0.1' , 8080) # global

times = 0

class ServerProtocol(protocol.Protocol):
    def connectionMade(self):
        global client_factory
        global client_endpoint
        global times

        for i in range(50):
            # 1)
            p = MyClientProtocol()
            connectProtocol(client_endpoint, p) # cause memleak

            # 2)
            #client_endpoint.connect(client_factory) # no memleak

        times += 1
        if times % 10 == 9:
            print('gc')
            gc.collect() # doesn't work

        self.transport.loseConnection()

if __name__ == '__main__':
    server_factory = ServerFactory()
    serverEndpoint = endpoints.serverFromString(reactor, "tcp:8888")
    serverEndpoint.listen(server_factory)
    reactor.run()

-----ANSWER-----

This program doesn't do any Twisted log initialization. This means it runs with the "log beginner" for its entire run. The log beginner records all log events it observes in a LimitedHistoryLogObserver (up to a configurable maximum).

The log beginner keeps 2 ** 16 ( _DEFAULT_BUFFER_MAXIMUM ) events and then begins throwing out old ones, presumably to avoid consuming all available memory if a program never configures another observer.
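To make the "keep the newest N events, drop the oldest" behavior concrete, here is a minimal sketch using only the standard library. It is not Twisted's code: BoundedHistory and DEFAULT_BUFFER_MAXIMUM are hypothetical names, and the small size of 10 is chosen only so the effect is easy to see.

```python
from collections import deque

# Assumed stand-in for the limit described above (2 ** 16 events).
DEFAULT_BUFFER_MAXIMUM = 2 ** 16


class BoundedHistory(object):
    """Hypothetical stand-in for a size-limited log observer."""

    def __init__(self, size=DEFAULT_BUFFER_MAXIMUM):
        # deque(maxlen=...) silently discards the oldest item once full.
        self._events = deque(maxlen=size)

    def __call__(self, event):
        # Observers are callables that receive one event dict at a time.
        self._events.append(event)

    def __len__(self):
        return len(self._events)


# Feed far more events than the buffer can hold.
history = BoundedHistory(size=10)
for n in range(1000):
    history({"log_format": "Starting factory {n}", "n": n})
```

Memory use grows until the buffer is full and then levels off, which matches a bounded object "leak" rather than unbounded growth.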

If you hack the Twisted source to set _DEFAULT_BUFFER_MAXIMUM to a smaller value (e.g., 10), then the program no longer "leaks". Of course, it's really just an object leak and not a memory leak, and it's bounded by the 2 ** 16 limit Twisted imposes.

However, connectProtocol creates a new factory each time it is called. When each new factory is created, it logs a message. And the application code generates a new Logger for each log message. And the logging code puts the new Logger into the log message. This means the memory cost of keeping those log messages around is quite noticeable (compared to just leaking a short blob of text or even a dict with a few simple objects in it).

I'd say the code in Twisted is behaving just as intended ... but perhaps someone didn't think through the consequences of that behavior completely.
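The cost comes from references: as long as a buffered event is retained, every object it points to stays alive too. The following sketch illustrates that with plain Python objects. FakeFactory and FakeLogger are hypothetical stand-ins, and the event keys are only illustrative, not a claim about the exact contents of Twisted's events.

```python
import gc
import weakref
from collections import deque


class FakeLogger(object):
    """Hypothetical stand-in for a per-message logger object."""


class FakeFactory(object):
    """Hypothetical stand-in for the factory created per connection."""


history = deque(maxlen=3)   # tiny bound so evictions happen quickly
refs = []

for _ in range(5):
    factory = FakeFactory()
    logger = FakeLogger()
    refs.append(weakref.ref(factory))
    # The event dict holds strong references to both objects.
    history.append({"log_logger": logger, "factory": factory})

# Drop our own references; only the buffered events keep objects alive now.
del factory, logger
gc.collect()

# True where the factory is still reachable through a buffered event.
alive = [r() is not None for r in refs]
```

Only the factories referenced by events still in the buffer survive; the ones whose events were evicted become collectable. With a limit of 2 ** 16, that is a lot of otherwise short-lived objects held at once.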

And, of course, if you configure your own log observer then the "log beginner" is taken out of the picture and there is no problem. It does seem reasonable to expect that all serious programs will enable logging rather quickly and avoid this issue. However, lots of short throw-away or example programs often don't ever initialize logging and rely on print instead, making them subject to this behavior.

Note: This problem was reported in #8164 and fixed in 4acde626, so Twisted 17 will not have this behavior.
