
OSError: [Errno 24] Too many open files when using reactor.run() in Twisted

I am having a weird issue: I am running a large number of utils.getProcessOutputAndValue('cmd', [args]) commands, and the result depends on whether I start the reactor using task.react() or reactor.run():

#!/usr/bin/env python
# -*- coding: utf-8 -*-

from progress.bar import IncrementalBar
from twisted.internet import defer
from twisted.internet import task
from twisted.internet import utils
from twisted.python import usage


class Options(usage.Options):
    optFlags = [['reactor', 'r', 'Use reactor.run().'],
                ['task', 't', 'Use task.react().'],
                ['cwr', 'w', 'Use callWhenRunning().']]
    optParameters = [['limit', 'l', 255, 'Number of file descriptors to open.'],
                     ['cmd', 'c', 'echo Testing {i}...', 'Command to run.']]


def run(opt):
    limit = int(opt['limit'])
    cmd, args = opt['cmd'].split(' ', 1)
    bar = IncrementalBar('Running {cmd}'.format(cmd=opt['cmd']), max=limit)
    requests = []
    for i in range(0, limit):
        try:
            _args = args.format(i=i)
            args = _args
        except KeyError:
            pass
        requests.append(utils.getProcessOutputAndValue('echo', [args]))
        bar.next()
    bar.finish()
    return defer.gatherResults(requests)


@defer.inlineCallbacks
def main(reactor, opt):
    d = defer.Deferred()
    limit = int(opt['limit'])
    cmd, args = opt['cmd'].split(' ', 1)
    bar = IncrementalBar('Running {cmd}'.format(cmd=opt['cmd']), max=limit)
    for i in range(0, limit):
        try:
            _args = args.format(i=i)
            args = _args
        except KeyError:
            pass
        yield utils.getProcessOutputAndValue('echo', [args])
        bar.next()
    bar.finish()
    defer.returnValue(d.callback(True))


if __name__ == '__main__':
    opt = Options()
    opt.parseOptions()

    if opt['reactor']:
        from twisted.internet import reactor
        task.deferLater(reactor, 0, run, opt)
        reactor.run()

    elif opt['task']:
        from twisted.internet.task import react
        react(main, [opt])

    elif opt['cwr']:
        from twisted.internet import reactor
        reactor.callWhenRunning(run, opt)
        reactor.run()

When using a limit above 400 (in my case), I get the following error:

Upon execvpe echo ['echo', 'Testing 0...'] in environment id 42131264
:Traceback (most recent call last):
  File "/home/vagrant/.env/sm/lib/python2.7/site-packages/Twisted-15.5.0-py2.7-linux-x86_64.egg/twisted/internet/process.py", line 428, in _fork
    self._setupChild(**kwargs)
  File "/home/vagrant/.env/sm/lib/python2.7/site-packages/Twisted-15.5.0-py2.7-linux-x86_64.egg/twisted/internet/process.py", line 803, in _setupChild
    for fd in _listOpenFDs():
  File "/home/vagrant/.env/sm/lib/python2.7/site-packages/Twisted-15.5.0-py2.7-linux-x86_64.egg/twisted/internet/process.py", line 638, in _listOpenFDs
    return detector._listOpenFDs()
  File "/home/vagrant/.env/sm/lib/python2.7/site-packages/Twisted-15.5.0-py2.7-linux-x86_64.egg/twisted/internet/process.py", line 553, in _listOpenFDs
    self._listOpenFDs = self._getImplementation()
  File "/home/vagrant/.env/sm/lib/python2.7/site-packages/Twisted-15.5.0-py2.7-linux-x86_64.egg/twisted/internet/process.py", line 576, in _getImplementation
    after = impl()
  File "/home/vagrant/.env/sm/lib/python2.7/site-packages/Twisted-15.5.0-py2.7-linux-x86_64.egg/twisted/internet/process.py", line 606, in _procFDImplementation
    return [int(fd) for fd in self.listdir(dname)]
OSError: [Errno 24] Too many open files: '/proc/23421/fd'
Unhandled error in Deferred:

This does not occur if I use task.react().

In summary:

  • python pyerr.py -l100 -r : OK
  • python pyerr.py -l100 -t : OK
  • python pyerr.py -l100 -w : OK
  • python pyerr.py -l400 -r : OSERR
  • python pyerr.py -l400 -t : OK
  • python pyerr.py -l400 -w : OSERR

The problem is that I have a big application that uses the reactor, because it is an application responding to SMTP connections (so I cannot use task.react; I do not want to stop the reactor).

I always thought that task.react only stopped the reactor once the Deferred is done, but I guess it is doing more than this...


edit: Here is a pstree comparison for task.react vs reactor.run

reactor.run (python pyerr.py -l400 -r):

init-+-VBoxService---7*[{VBoxService}]
     |-acpid
     |-atd
     |-cron
     |-dbus-daemon
     |-dhclient
     |-6*[getty]
     |-master-+-pickup
     |        `-qmgr
     |-mysqld---18*[{mysqld}]
     |-nginx---4*[nginx]
     |-php5-fpm---2*[php5-fpm]
     |-puppet---{puppet}
     |-rpc.idmapd
     |-rpc.statd
     |-rpcbind
     |-rsyslogd---3*[{rsyslogd}]
     |-ruby---{ruby}
     |-sshd-+-3*[sshd---sshd---sftp-server]
     |      |-sshd---sshd---2*[sftp-server]
     |      |-sshd---sshd---bash---pstree
     |      `-sshd---sshd---bash---python-+-323*[echo]
     |                                    `-5*[python]
     |-systemd-logind
     |-systemd-udevd
     |-upstart-file-br
     |-upstart-socket-
     `-upstart-udev-br

task.react (python pyerr.py -l400 -t):

init-+-VBoxService---7*[{VBoxService}]
     |-acpid
     |-atd
     |-cron
     |-dbus-daemon
     |-dhclient
     |-6*[getty]
     |-master-+-pickup
     |        `-qmgr
     |-mysqld---18*[{mysqld}]
     |-nginx---4*[nginx]
     |-php5-fpm---2*[php5-fpm]
     |-puppet---{puppet}
     |-rpc.idmapd
     |-rpc.statd
     |-rpcbind
     |-rsyslogd---3*[{rsyslogd}]
     |-ruby---{ruby}
     |-sshd-+-3*[sshd---sshd---sftp-server]
     |      |-sshd---sshd---2*[sftp-server]
     |      |-sshd---sshd---bash---pstree
     |      `-sshd---sshd---bash---python---echo
     |-systemd-logind
     |-systemd-udevd
     |-upstart-file-br
     |-upstart-socket-
     `-upstart-udev-br

Notice the difference between this

 |      `-sshd---sshd---bash---python-+-323*[echo]
 |                                    `-5*[python]

and this

 |      `-sshd---sshd---bash---python---echo

In one case, it seems that processes are not closed as soon as they complete.

I have tested this issue on 4 different machines:

  • Ubuntu 14.04
  • CentOS 6
  • CentOS 7

The issue is exactly the same.

To give it a shot, run watch -n 0.1 "pstree" to see how the processes evolve.


edit: I now understand why this is happening, thanks to Glyph's answer, but how do I adapt this to my real-life case?

The application I am developing with Twisted is an SMTP filter based on Milter. Here is how it works (assume we want to check the email signature):

  • a connection opens on port 25
  • the milter protocol gets all the email's details
  • the milter calls a remote "module" server that handles the signature check with a /usr/bin/openssl mime call
  • the module returns an answer indicating whether or not the signature is valid

In this case, my problem is that if I get 150 simultaneous connections, there will be 150 calls to the module (over TCP), and this module will invoke the openssl command once per connection.

The module is completely agnostic and therefore does not know whether other calls are running. Where should I put the DeferredSemaphore, in your opinion?

My problem here is that SMTP connections are also agnostic and don't know about other possible open connections.

What is the correct way of handling this parallelism, in your opinion?

The problem here has nothing to do with the distinction between task.react and reactor.run, but rather with the subtle but significant difference between the implementations of your run and main functions.

The difference is that run spawns limit processes in parallel, racking up thousands of simultaneously open file descriptors and easily blowing through your system's limits. main, however, waits for every process to completely finish executing before even starting the next one, which means it never uses more than 4 or 5 file descriptors at a time.
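For a rough sense of where the ceiling sits, here is a small sketch. It assumes that each child spawned by getProcessOutputAndValue keeps three pipe descriptors (stdin, stdout, stderr) open in the parent until it exits, and that the soft RLIMIT_NOFILE is the limit being hit. The helper name max_parallel_children and the reserved figure are illustrative guesses, not part of Twisted.

```python
import resource

FDS_PER_CHILD = 3  # assumption: one pipe each for stdin, stdout and stderr


def max_parallel_children(soft_limit, reserved=16, fds_per_child=FDS_PER_CHILD):
    """Estimate how many children fit under a descriptor limit.

    `reserved` is a guess at the descriptors the process already holds
    (its own stdio, sockets, the reactor's internal pipes, and so on).
    """
    return max(0, (soft_limit - reserved) // fds_per_child)


soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
if soft == resource.RLIM_INFINITY:
    print("no soft limit on open files")
else:
    print("soft limit %d -> roughly %d parallel children"
          % (soft, max_parallel_children(soft)))
```

Under these assumptions a soft limit of 1024 allows roughly 336 children, which is in the same range as the 323 echo processes visible in the pstree output from the question.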

The reason is that main is decorated with inlineCallbacks and yields every Deferred returned by getProcessOutputAndValue, which suspends execution of main until that Deferred has completed.

In real applications, neither of these approaches is ideal. You want some parallelism, but not unlimited parallelism. Twisted comes with some utilities, such as DeferredSemaphore, to facilitate limited parallelism without restricting everything to run only one task at a time. Jean-Paul Calderone wrote an article, 10 years ago, that explains how to use it.

However, just to demonstrate that the issue has nothing to do with task.react, here's a modified version of your example which eliminates the run function and makes an apples-to-apples comparison using main:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

from progress.bar import IncrementalBar
from twisted.internet import defer
from twisted.internet import task
from twisted.internet import utils
from twisted.python import usage


class Options(usage.Options):
    optFlags = [['reactor', 'r', 'Use reactor.run().'],
                ['task', 't', 'Use task.react().'],
                ['cwr', 'w', 'Use callWhenRunning().']]
    optParameters = [['limit', 'l', 255, 'Number of file descriptors to open.'],
                     ['cmd', 'c', 'echo Testing {i}...', 'Command to run.']]


@defer.inlineCallbacks
def main(reactor, opt):
    d = defer.Deferred()
    limit = int(opt['limit'])
    cmd, args = opt['cmd'].split(' ', 1)
    bar = IncrementalBar('Running {cmd}'.format(cmd=opt['cmd']), max=limit)
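    for i in range(0, limit):
        try:
            # Format a fresh copy each time so the {i} placeholder survives.
            _args = args.format(i=i)
        except KeyError:
            _args = args
        # Use the parsed command rather than a hard-coded 'echo'.
        yield utils.getProcessOutputAndValue(cmd, [_args])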
        bar.next()
    bar.finish()
    defer.returnValue(d.callback(True))


if __name__ == '__main__':
    opt = Options()
    opt.parseOptions()

    if opt['reactor']:
        from twisted.internet import reactor
        task.deferLater(reactor, 0, main, reactor, opt)
        reactor.run()

    elif opt['task']:
        from twisted.internet.task import react
        react(main, [opt])

    elif opt['cwr']:
        from twisted.internet import reactor
        reactor.callWhenRunning(main, reactor, opt)
        reactor.run()

edit, responding to the edit in the question:

Since your real problem is with incoming connections, and not just a for loop, DeferredSemaphore may not be the right tool. You might instead maintain a counter and take advantage of the fact that the object returned from listenTCP (or the result of the Deferred that comes back from TCP4ServerEndpoint) implements IPushProducer: call pauseProducing() on it when too many concurrent connections are doing work, and resumeProducing() when that work is done.
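One possible shape for that counter, as a sketch only: ConnectionThrottle and its method names are hypothetical, and port stands for whatever listenTCP returned. The class relies on nothing beyond the pauseProducing() and resumeProducing() methods mentioned above.

```python
class ConnectionThrottle(object):
    """Pause a listening port while too many jobs are in flight."""

    def __init__(self, port, max_active):
        self.port = port              # object providing IPushProducer
        self.max_active = max_active
        self.active = 0
        self.paused = False

    def started(self):
        """Call when a connection begins its expensive work."""
        self.active += 1
        if self.active >= self.max_active and not self.paused:
            self.paused = True
            self.port.pauseProducing()

    def finished(self, result=None):
        """Call when the work is done; usable as a Deferred callback."""
        self.active -= 1
        if self.paused and self.active < self.max_active:
            self.paused = False
            self.port.resumeProducing()
        return result  # pass the result through unchanged
```

A protocol could call throttle.started() just before invoking the openssl command and attach throttle.finished with addBoth() on the resulting Deferred, so the count drops on failure as well as on success. A connection already accepted when the port is paused will still be served, so the limit is approximate rather than strict.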
