
Why does Python raise a SIGPIPE exception when closing a FIFO file?

TL;DR: Why does closing a fifo file (named pipe) that received a SIGPIPE exception generate another SIGPIPE exception?

My Python script writes bytes through a FIFO file to another process, which is a subprocess of my Python process. (Due to some restrictions, I must use a named pipe.)

I have to take into account the fact that the subprocess might terminate prematurely. If that happens, my Python script must reap the dead subprocess and start it again.

To detect whether the subprocess has died, I simply try to write to the FIFO first, and if I get a SIGPIPE exception (actually an IOError indicating a broken pipe), I know it is time to restart my subprocess.

The minimum example goes as follows:

#!/usr/bin/env python3
import os
import signal
import subprocess

# The FIFO file.
os.mkfifo('tmp.fifo')

# A subprocess to simply discard any input from the FIFO.
FNULL = open(os.devnull, 'w')
proc = subprocess.Popen(['/bin/cat', 'tmp.fifo'], stdout=FNULL, stderr=FNULL)
print('pid = %d' % proc.pid)

# Open the FIFO, and MUST BE BINARY MODE.
fifo = open('tmp.fifo', 'wb')

# Endlessly write to the FIFO.
while True:

    # Try to write to the FIFO, restart the subprocess on demand, until succeeded.
    while True:
        try:
            # Optimistically write to the FIFO.
            fifo.write(b'hello')
        except IOError as e:
            # The subprocess died. Close the FIFO and reap the subprocess.
            fifo.close()
            os.kill(proc.pid, signal.SIGKILL)
            proc.wait()

            # Start the subprocess again.
            proc = subprocess.Popen(['/bin/cat', 'tmp.fifo'], stdout=FNULL, stderr=FNULL)
            print('pid = %d' % proc.pid)
            fifo = open('tmp.fifo', 'wb')
        else:
            # The write goes on well.
            break

To reproduce the result, run the script and manually kill the subprocess with kill -9 <pid>. The traceback then shows:

Traceback (most recent call last):
  File "./test.py", line 24, in <module>
    fifo.write(b'hello')
BrokenPipeError: [Errno 32] Broken pipe

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./test.py", line 27, in <module>
    fifo.close()
BrokenPipeError: [Errno 32] Broken pipe

So why does closing the FIFO file generate another SIGPIPE exception?

I ran the test on the following platforms and the results are the same.

Python 3.7.6 @ Darwin Kernel Version 19.3.0 (MacOS 10.15.3)
Python 3.6.8 @ Linux 4.18.0-147.3.1.el8_1.x86_64 (Centos 8)

This happens because Python does not clear its write buffer when fifo.write fails. The data is still in the buffer, so fifo.close tries to write it to the broken pipe again, which causes the second SIGPIPE.
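The effect can be reproduced in a single process, without a subprocess or a manual kill. The following is a minimal sketch, not the script from the question: the helper name failing_steps_buffered and the file name demo.fifo are made up for illustration, and closing the read end stands in for the reader process dying. It assumes a POSIX system and CPython's default behavior of ignoring SIGPIPE, so that the failed write surfaces as BrokenPipeError.

```python
import os
import tempfile


def failing_steps_buffered():
    """Return the names of the calls that raise BrokenPipeError
    on a buffered FIFO writer whose reader has gone away."""
    failed = []
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, 'demo.fifo')  # hypothetical name
        os.mkfifo(path)

        # Open the read end non-blocking first, so that opening the
        # write end below does not block waiting for a reader.
        reader_fd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
        fifo = open(path, 'wb')   # default buffering: a BufferedWriter
        os.close(reader_fd)       # stand-in for the reader dying

        # This only copies the bytes into the user-space buffer.
        fifo.write(b'hello')

        # flush() hits the broken pipe; close() flushes the same
        # still-pending bytes once more before closing the descriptor.
        for name, call in (('flush', fifo.flush), ('close', fifo.close)):
            try:
                call()
            except BrokenPipeError:
                failed.append(name)
    return failed


print(failing_steps_buffered())
```

Under those assumptions, both flush and close are expected to appear in the printed list, which matches the two BrokenPipeError tracebacks in the question.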

I found the reason with the help of strace. Here are some details.

First, modify the Python code slightly, as follows,

#!/usr/bin/env python3
import os
import signal
import subprocess

# The FIFO file.
os.mkfifo('tmp.fifo')

# A subprocess to simply discard any input from the FIFO.
FNULL = open(os.devnull, 'w')
proc = subprocess.Popen(['/bin/cat', 'tmp.fifo'], stdout=FNULL, stderr=FNULL)
print('pid = %d' % proc.pid)

# Open the FIFO, and MUST BE BINARY MODE.
fifo = open('tmp.fifo', 'wb')

i = 0
# Endlessly write to the FIFO.
while True:

    # Try to write to the FIFO, restart the subprocess on demand, until succeeded.
    while True:
        try:
            # Optimistically write to the FIFO.
            fifo.write(f'hello{i}'.encode())
            fifo.flush()
        except IOError as e:
            # The subprocess died. Close the FIFO and reap the subprocess.
            print('IOError is occured.')
            fifo.close()
            os.kill(proc.pid, signal.SIGKILL)
            proc.wait()

            # Start the subprocess again.
            proc = subprocess.Popen(['/bin/cat', 'tmp.fifo'], stdout=FNULL, stderr=FNULL)
            print('pid = %d' % proc.pid)
            fifo = open('tmp.fifo', 'wb')
        else:
            # The write goes on well.
            break
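    # Kill the reader after every successful write, so that a later
    # write hits a broken pipe without any manual intervention.
    os.kill(proc.pid, signal.SIGKILL)
    i += 1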

and save it as test.py.

Then run strace -o strace.out python3 test.py in the shell. Check strace.out and you will find something like

openat(AT_FDCWD, "tmp.fifo", O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0666) = 4
fstat(4, {st_mode=S_IFIFO|0644, st_size=0, ...}) = 0
ioctl(4, TCGETS, 0x7ffcba5cd290)        = -1 ENOTTY (Inappropriate ioctl for device)
lseek(4, 0, SEEK_CUR)                   = -1 ESPIPE (Illegal seek)
write(4, "hello0", 6)                   = 6
kill(35626, SIGKILL)                    = 0
write(4, "hello1", 6)                   = 6
kill(35626, SIGKILL)                    = 0
write(4, "hello2", 6)                   = -1 EPIPE (Broken pipe)
--- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=35625, si_uid=1000} ---
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_KILLED, si_pid=35626, si_uid=1000, si_status=SIGKILL, si_utime=0, si_stime=0} ---
write(1, "IOError is occured.\n", 20)   = 20
write(4, "hello2", 6)                   = -1 EPIPE (Broken pipe)
--- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=35625, si_uid=1000} ---
close(4)                                = 0
write(2, "Traceback (most recent call last"..., 35) = 35
write(2, "  File \"test.py\", line 26, in <m"..., 39) = 39

Note that Python tried to write hello2 twice: once during fifo.flush and once during fifo.close. This explains why two SIGPIPE exceptions are generated.

To solve the problem, we can use open('tmp.fifo', 'wb', buffering=0) to disable the write buffer. Then only one SIGPIPE exception will be generated, because there is no pending data left for fifo.close to flush.
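As a rough check of the buffering=0 approach, here is a minimal single-process sketch. The helper name failing_steps_unbuffered and the file name demo.fifo are made up for illustration, and closing the read end stands in for the reader process dying. It assumes a POSIX system and CPython's default behavior of ignoring SIGPIPE.

```python
import os
import tempfile


def failing_steps_unbuffered():
    """Return the names of the calls that raise BrokenPipeError
    on an unbuffered FIFO writer whose reader has gone away."""
    failed = []
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, 'demo.fifo')  # hypothetical name
        os.mkfifo(path)

        # Open the read end non-blocking first, so that opening the
        # write end below does not block waiting for a reader.
        reader_fd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
        fifo = open(path, 'wb', buffering=0)  # raw file object, no buffer
        os.close(reader_fd)                   # stand-in for the reader dying

        steps = (
            ('write', lambda: fifo.write(b'hello')),  # goes straight to write(2)
            ('close', fifo.close),                    # nothing left to flush
        )
        for name, call in steps:
            try:
                call()
            except BrokenPipeError:
                failed.append(name)
    return failed


print(failing_steps_unbuffered())
```

Under those assumptions, only the write is expected to fail and the close succeeds quietly. One trade-off to keep in mind: an unbuffered binary file is a raw stream, so write returns the number of bytes actually written, which may be fewer than requested, and the caller has to handle short writes itself.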
