Resource temporarily unavailable error with subprocess module in Python

Question

In Python, I spawn a gnuplot process to generate gif images from a data set.

from subprocess import Popen, PIPE
def gnuplotter(...):
    p = Popen([GNUPLOT], shell=False, stdin=PIPE, stdout=PIPE)
    p.stdin.write(r'set terminal gif;')
    ...
    p.stdin.write(contents)
    p.stdout.close()

It works fine when I call gnuplotter() once, but when I launch the process multiple times, I get a "Resource temporarily unavailable" error.

for i in range(54):
    gnuplotter(i, ... 

  File "/Users/smcho/code/PycharmProjects/contextAggregator/aggregation_analyzer/aggregation_analyzer/gnuplotter.py", line 48, in gnuplotter
    p = Popen([GNUPLOT], shell=False, stdin=PIPE, stdout=PIPE)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 711, in __init__
    errread, errwrite)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 1205, in _execute_child
    self.pid = os.fork()
OSError: [Errno 35] Resource temporarily unavailable

What's wrong, and how can I close the gnuplot process before spawning another one?

Answer 1

PIDs, open file descriptors, and memory are limited resources.

The fork(2) manual page says when EAGAIN (errno 35 on OS X, as in your traceback) occurs:

[EAGAIN]  The system-imposed limit on the total number of processes under
          execution would be exceeded.  This limit is configuration-dependent.

[EAGAIN]  The system-imposed limit MAXUPRC (<sys/param.h>) on the total number of
          processes under execution by a single user would be exceeded.

To reproduce the error more easily, you could add at the start of your program:

import resource

resource.setrlimit(resource.RLIMIT_NPROC, (20, 20))
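Before lowering the limit, it can help to see what it currently is and what message the OS attaches to EAGAIN. A minimal sketch, assuming a Unix-like system where the resource module and RLIMIT_NPROC are available (the variable names are mine, not from the question):

```python
import errno
import os
import resource

# Soft and hard per-user process limits that apply to this process.
soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
print("RLIMIT_NPROC soft=%s hard=%s" % (soft, hard))

# The numeric value of EAGAIN on this platform and its OS error message.
eagain_message = os.strerror(errno.EAGAIN)
print(errno.EAGAIN, eagain_message)
```

The numeric value of EAGAIN differs between platforms, so compare against errno.EAGAIN rather than a hard-coded number.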

The issue might be that all the child processes are still alive: you haven't called p.stdin.close(), and gnuplot's stdin might be fully buffered when redirected to a pipe, i.e., the gnuplot processes might be stuck waiting for input. And/or your application uses too many file descriptors without releasing them (on Python 2.7, file descriptors are inherited by child processes by default).
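To illustrate the first point, here is a minimal sketch of closing stdin and reaping each child before starting the next one. It does not run gnuplot: the child is a stand-in Python one-liner that reads stdin until EOF, and run_once and CHILD are hypothetical names of mine. It is written to run under Python 3.

```python
import sys
from subprocess import Popen, PIPE

# Stand-in child for gnuplot (assumption): reads stdin until EOF, then exits.
CHILD = [sys.executable, "-c", "import sys; sys.stdin.read()"]

def run_once(script):
    """Feed one script to a fresh child, then make sure the child is gone."""
    p = Popen(CHILD, stdin=PIPE)
    p.stdin.write(script.encode())
    p.stdin.close()   # send EOF so the child stops waiting for more input
    return p.wait()   # reap the child, freeing its slot in the process table

exit_codes = [run_once("set terminal gif;\n") for _ in range(5)]
print(exit_codes)
```

Because each call waits for its child, at most one child is alive at any time, no matter how many times the loop runs.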

If the input doesn't depend on the output and the input is limited in size, then use .communicate():

from subprocess import Popen, PIPE

p = Popen("gnuplot", stdin=PIPE, stdout=PIPE, stderr=PIPE,
          close_fds=True, # to avoid running out of file descriptors
          bufsize=-1, # fully buffered (use zero (default) if no p.communicate())
          universal_newlines=True) # translate newlines, encode/decode text
out, err = p.communicate("\n".join(['set terminal gif;', contents]))

.communicate() writes all input and reads all output (concurrently, so there is no deadlock), then closes p.stdin, p.stdout, and p.stderr, and waits for the process to finish (no zombies). This works even if the input is small and gnuplot's side is fully buffered, because EOF flushes the buffer.

Popen calls _cleanup() in its constructor, which polls the exit status of all known subprocesses, i.e., even if you don't call p.wait(), there shouldn't be many zombie processes (dead but with unread exit status).
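The behaviour described above can be checked without gnuplot. This is a sketch only: the child is a stand-in Python one-liner (my assumption, not gnuplot) that copies stdin to stdout in upper case, and it is written for Python 3.

```python
import sys
from subprocess import Popen, PIPE

# Stand-in for gnuplot (assumption): copies stdin to stdout in upper case.
CHILD = [sys.executable, "-c",
         "import sys; sys.stdout.write(sys.stdin.read().upper())"]

p = Popen(CHILD, stdin=PIPE, stdout=PIPE, stderr=PIPE,
          universal_newlines=True)  # text mode: pass and receive str

# Writes the input, sends EOF, reads both pipes, and waits for the exit.
out, err = p.communicate("set terminal gif;\n")

print(repr(out))
print(p.returncode)
```

After communicate() returns, p.returncode is already set, so no separate p.wait() call is needed.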

Answer 2

You should call p.wait() to wait for the subprocess to finish and collect its exit status after you are done communicating with it.

If you have special situations (where you want to start N and wait for them later), p.poll() will let you check whether one has finished.
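A rough sketch of the "start N and wait for them later" case, using p.poll(). The children here are stand-in Python one-liners that sleep briefly (an assumption for illustration, not gnuplot), and the list names are mine:

```python
import sys
import time
from subprocess import Popen

# Stand-in children (assumption): each exits on its own after a short sleep.
CHILD = [sys.executable, "-c", "import time; time.sleep(0.2)"]

running = [Popen(CHILD) for _ in range(3)]
finished = []

while running:
    still_running = []
    for p in running:
        if p.poll() is None:      # None means the child has not exited yet
            still_running.append(p)
        else:                     # poll() reaped it and set p.returncode
            finished.append(p.returncode)
    running = still_running
    if running:
        time.sleep(0.05)          # avoid a busy loop between polls

print(finished)
```

Capping the length of the running list is one way to stay under the per-user process limit when there are many jobs to launch.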

Since you have pipes set up, you should be using p.communicate() to avoid deadlocks. See the subprocess documentation.

2 answers

Answer 1: score 4, accepted, 2014-03-29 10:03:15
Answer 2: score 1, 2014-03-29 00:04:16
