Debugging Open multiprocessing.Pipe()

I have a rather large program, which I'll call test_pipe, that uses multiprocessing.Pipe() in many places. The program sometimes hangs, and I suspect the cause is a multiprocessing.Pipe() that is not being closed properly. As shown below, I use 'ps aux' to get the PID of test_pipe while it is hung, and then 'lsof -p PID' to list the files the process still has open. The output shows two open pipes.

root@container:~# ps aux
USER        PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         49  1.8  0.2 1020008 40068 ?       Sl   Jun06  19:57 python3 -m test.test_pipe

root@container:~# lsof -p 49
COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF     NODE NAME
test_pipe  21 root    1w  FIFO   0,13      0t0 19391346 pipe
test_pipe  21 root    2w  FIFO   0,13      0t0 19391347 pipe

My question is, given the pipe entries in the lsof output, is there any way to match these entries to the multiprocessing.Pipe() instances in my test_pipe program? For example, if there was a node() function of a multiprocessing Connection, then I could just log each connection's node and that would provide me with enough info on which of the many pipes within the test_pipe program to focus on.

from multiprocessing import Pipe, Process
parent_conn, child_conn = Pipe()
print(f"Parent pipe has node: {parent_conn.node()}")    # node() is NOT a real function, but just used for demonstration purposes

node() is not a real function, but is there a way to derive this information from the Connection's fileno() method? I'm also happy to move away from this lsof-based approach if there are other suggestions. My main goal is to quickly narrow in on a problem Pipe in a large program that uses Pipes in many places.

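ls -l /proc/<pid>/fd will show all the file descriptors open in a process; on Linux each entry is a symlink whose target names the object it refers to, such as pipe:[inode]. The Connection's fileno() method returns the descriptor number, which is what you need to correlate a connection object within the process with one of those entries.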
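
A minimal sketch of that correlation, assuming a Unix system where fileno() returns an ordinary file descriptor. The helper name describe_connection is hypothetical, not part of multiprocessing. It logs each connection's descriptor number together with the inode from os.fstat(), which should be the same number that appears in the NODE column of lsof and in the pipe:[...] or socket:[...] targets under /proc/<pid>/fd.

```python
import os
import stat
from multiprocessing import Pipe


def describe_connection(name, conn):
    """Return one log line with the connection's fd, file type, and inode.

    describe_connection is a hypothetical helper for illustration only.
    """
    fd = conn.fileno()
    info = os.fstat(fd)
    if stat.S_ISFIFO(info.st_mode):
        kind = "FIFO"
    elif stat.S_ISSOCK(info.st_mode):
        kind = "socket"
    else:
        kind = "other"
    return f"{name}: pid={os.getpid()} fd={fd} type={kind} inode={info.st_ino}"


if __name__ == "__main__":
    # One-way pipe: on Unix this is backed by os.pipe().
    recv_conn, send_conn = Pipe(duplex=False)
    print(describe_connection("recv_conn", recv_conn))
    print(describe_connection("send_conn", send_conn))

    # Default duplex pipe: on Unix this is backed by a socket pair.
    parent_conn, child_conn = Pipe()
    print(describe_connection("parent_conn", parent_conn))
    print(describe_connection("child_conn", child_conn))
```

Logging a line like this wherever a Pipe is created gives you something to match against the lsof or /proc output taken while the program is hung. Two caveats: a default Pipe() is duplex and on Unix is built on a socket pair, so it may appear in lsof as a socket rather than a FIFO; and descriptors 1 and 2 are standard output and standard error, so the two FIFO entries in the listing above may be redirected output streams rather than multiprocessing.Pipe() objects.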
