简体   繁体   English

Python 多处理 - 构建导致脚本挂起的东西?

[英]Python multiprocessing - Build up of something causing script to hang?

I have a function that scrolls an image, and a subprocess within it that generates the next image behind the scenes to display, so that the transition between the displaying image and the new image is seamless.我有一个滚动图像的 function,其中有一个子进程在幕后生成下一个要显示的图像,以便显示图像和新图像之间的过渡是无缝的。

The script works fine at the start.该脚本在开始时运行良好。 However, after a few hours of using this script, the subprocess eventually gets stuck on a write call.但是,在使用此脚本几个小时后,子进程最终会卡在写入调用上。 Killing the subprocesses doesn't help, as the next subprocess spawned also gets stuck.杀死子进程没有帮助,因为产生的下一个子进程也会卡住。 So it definitely means something is building up and is not clearing up in the script, but I'm not sure what.所以这绝对意味着脚本中的某些东西正在建立并且没有清理,但我不确定是什么。

The only way to fix this is to kill the parent process, and run it again.解决此问题的唯一方法是终止父进程,然后再次运行它。

   def updateMultiple(self, options):
        # print('options', options)
        for option in options:
            
            
            if option not in ['Custom GIFs', 'Custom Images', 'Custom Messages']: # these images are already saved in user uploads, dodnt need to update them 
                img = self.functions[option]()
                img.save('./display_images/'+ option+ '.ppm')

def scrollFunctionsAnimated(self, options, animation = 'down', repeat = True):
        
        # scrolls trhough all functions with animation. Updates functions and remakes images when each function not being dispplayed 
        
        self.updateMultiple([options[0]])
 
        
        kill = False 
        i = 0 # keep track of which image we are displaying
        self.double_buffer = self.matrix.CreateFrameCanvas()
        
        while True:
            
            update_process = Process(target = self.updateMultiple, args = ([options[(i+1) % len(options)]],))
            update_process.start()
            update_process.join()
            
            
            settings = json.load(open(self.JSONs[options[(i) % len(options)]]))
        
            self.set_delay(settings['speed'])
            animation = settings['animation'].lower()
            
            if options[i % len(options)]  == 'Custom Images':
                images = self.getUserImages()
                
            elif options[i % len(options)]  == 'Custom GIFs':
                images = self.getUserGIFs()
                
            elif options[i % len(options)]  == 'Custom Messages':
                images = self.getUserMessages()
                
            
            else: #these options just have a single ppm image
                image = self.openImage('./display_images/' + options[i % len(options)] +'.ppm')
                
                images = [image]
                
            
            for image in images:
                img_width, img_height = image.size
                
                offset_x = 0
                if animation == 'continuous':
                    offset_x = 128
                elif animation in ['up', 'down']:
                    offset_x = max(0, 128-img_width)
                
                offset_y = 0
                #first scroll image in from bottom
                
                
                
                frame_skip = int((1/15)/self.delay) #controls how fast gifs run 
                self.frame = 0
                
                pause_frames = int(0.5/self.delay)
                if animation == 'up':
                    offset_y = 33
                    direction = -1
                    kill = self.scrollImageY(image, direction = direction, offset_x = offset_x, offset_y = offset_y, frame_skip = frame_skip, gif = options[i % len(options)] == 'Custom GIFs')
                elif animation == 'down':
                    direction = 1
                    offset_y = -33
                    kill = self.scrollImageY(image, direction = direction, offset_x = offset_x, offset_y = offset_y, frame_skip = frame_skip, gif = options[i % len(options)] == 'Custom GIFs')
                    
                if kill: break
                offset_y = 0
                
                
                if animation in ['up', 'down']:
                    while pause_frames > 0:
                        if pause_frames%frame_skip  == 0:
                            self.incrementGIF(image)
                        
                        pause_frames -=1
                        if options[i % len(options)] != 'Custom GIFs':
                            self.double_buffer.SetImage(image, offset_x, offset_y)
                        else:
                            self.double_buffer.SetImage(image.convert('RGB'), offset_x, offset_y)
                    
                        self.double_buffer = self.matrix.SwapOnVSync(self.double_buffer)
                        
                        time.sleep(self.delay)
                        kill = self.checkKilled()
                        if kill: break
                    
                    
                    if kill: break
                    
                           
                if kill: break
                try: 
                    pause = float(settings['pause'])
                except:
                    pause = 0
                pause_frames = int(float(pause)/self.delay) 
                kill = self.scrollImage(image, offset_x = offset_x, offset_y = offset_y, frame_skip = frame_skip, gif = options[i % len(options)] == 'Custom GIFs', pause_frames = pause_frames)
                
                if kill: break
            if kill:break
            if not repeat:
                break
            update_process.join()
            update_process.terminate()
            i+=1

File descriptors of the hanging subprocess:挂起子进程的文件描述符:

挂起子进程的文件描述符

在此处输入图像描述

I also have logs from strace below:我也有来自 strace 下面的日志:

This is what it looks like when the script is working normally.这是脚本正常运行时的样子。

strace: Process 1693 attached
[pid  1693] close(4)                    = 0 <0.000028>
[pid  1693] openat(AT_FDCWD, "/dev/null", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 4 <0.000077>
[pid  1693] write(1, "options ['Forex']\n", 18) = 18 <0.000048>
[pid  1693] openat(AT_FDCWD, "csv/forex_settings.json", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 6 <0.000064>
[pid  1693] read(6, "{\"feature\": \"Stocks\", \"speed\": \""..., 348) = 347 <0.000034>
[pid  1693] read(6, "", 1)              = 0 <0.000025>
[pid  1693] close(6)                    = 0 <0.000039>
[pid  1693] openat(AT_FDCWD, "feature_titles/forex.png", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 6 <0.000062>
[pid  1693] read(6, "\211PNG\r\n\32\n\0\0\0\rIHDR\0\0\0`\0\0\0 \4\3\0\0\0\37\356`"..., 4096) = 388 <0.000035>
[pid   412] close(5)                    = 0 <0.000064>
[pid   412] openat(AT_FDCWD, "csv/crypto_settings.json", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 5 <0.000143>
[pid   412] read(5, "{\"feature\": \"Stocks\", \"speed\": \""..., 349) = 348 <0.000060>
[pid   412] read(5, "", 1)              = 0 <0.000040>
[pid   412] close(5)                    = 0 <0.000097>
[pid   412] openat(AT_FDCWD, "./display_images/Crypto.ppm", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 5 <0.000125>
[pid  1693] close(6)                    = 0 <0.000027>
[pid   412] read(5, "P6\n569 32\n255\n\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096 <0.000052>
[pid  1693] openat(AT_FDCWD, "/home/pi/logos/stocks/down-1.png", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 6 <0.000154>
[pid   412] read(5, "P6\n569 32\n255\n\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096 <0.000073>
[pid   412] read(5,  <unfinished ...>
[pid  1693] read(6, "\211PNG\r\n\32\n\0\0\0\rIHDR\0\0\0\20\0\0\0\16\4\3\0\0\0\324\1\201"..., 4096) = 199 <0.000035>
[pid   412] <... read resumed> "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 61440) = 50542 <0.000554>
[pid   412] read(5, "", 8192)           = 0 <0.000047>
[pid   412] close(5)                    = 0 <0.000095>
[pid  1693] close(6)                    = 0 <0.000044>
[pid  1693] openat(AT_FDCWD, "/home/pi/logos/currencies/EUR.png", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 6 <0.000074>
[pid  1693] read(6, "\211PNG\r\n\32\n\0\0\0\rIHDR\0\0\0\26\0\0\0\26\10\3\0\0\0\363j\234"..., 4096) = 968 <0.000040>
[pid  1693] openat(AT_FDCWD, "/home/pi/logos/currencies/USD.png", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 7 <0.000146>
[pid  1693] read(7, "\211PNG\r\n\32\n\0\0\0\rIHDR\0\0\0\26\0\0\0\26\10\3\0\0\0\363j\234"..., 4096) = 984 <0.000038>
[pid  1693] close(7)                    = 0 <0.000056>
[pid  1693] close(6)                    = 0 <0.000042>
[pid  1693] openat(AT_FDCWD, "/home/pi/logos/stocks/up-1.png", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 6 <0.000068>
[pid  1693] read(6, "\211PNG\r\n\32\n\0\0\0\rIHDR\0\0\0\20\0\0\0\16\4\3\0\0\0\324\1\201"..., 4096) = 192 <0.000039>
[pid  1693] close(6)                    = 0 <0.000045>
[pid  1693] openat(AT_FDCWD, "/home/pi/logos/currencies/USD.png", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 6 <0.000112>
[pid  1693] read(6, "\211PNG\r\n\32\n\0\0\0\rIHDR\0\0\0\26\0\0\0\26\10\3\0\0\0\363j\234"..., 4096) = 984 <0.000039>
[pid  1693] openat(AT_FDCWD, "/home/pi/logos/currencies/JPY.png", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 7 <0.000151>
[pid  1693] read(7, "\211PNG\r\n\32\n\0\0\0\rIHDR\0\0\0\26\0\0\0\26\10\3\0\0\0\363j\234"..., 4096) = 769 <0.000047>
[pid  1693] close(7)                    = 0 <0.000060>
[pid  1693] close(6)                    = 0 <0.000043>
[pid  1693] openat(AT_FDCWD, "./display_images/Forex.ppm", O_RDWR|O_CREAT|O_TRUNC|O_LARGEFILE|O_CLOEXEC, 0666) = 6 <0.000359>
[pid  1693] write(6, "P6\n507 32\n255\n", 14) = 14 <0.000104>
[pid  1693] write(6, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 48672) = 48672 <0.000269>
[pid  1693] close(6)                    = 0 <0.000400>
[pid  1693] +++ exited with 0 +++
[pid   412] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=1693, si_uid=1, si_status=0, si_utime=5, si_stime=9} ---
[pid   412] close(4)                    = 0 <0.000064>

About 40-60 minutes later of running the Python script, here are the strace results of the hanging subprocess .运行 Python 脚本大约 40-60 分钟后,这里是挂起子进程的 strace 结果。 PID 2220 seems to be stuck with the write(1, "options ['Forex']\n", 18 <unfinished...> syscall? How can I check what is causing it to hang? PID 2220 似乎被 write(1, "options ['Forex']\n", 18 <unfinished...> 系统调用卡住了?如何检查导致它挂起的原因?

strace: Process 2220 attached
[pid   412] close(5)                    = 0 <0.000048>
[pid   412] openat(AT_FDCWD, "csv/crypto_settings.json", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 5 <0.000163>
[pid   412] read(5, "{\"feature\": \"Stocks\", \"speed\": \""..., 349) = 348 <0.000071>
[pid  2220] close(4 <unfinished ...>
[pid   412] read(5,  <unfinished ...>
[pid  2220] <... close resumed> )       = 0 <0.000083>
[pid   412] <... read resumed> "", 1)   = 0 <0.000105>
[pid  2220] openat(AT_FDCWD, "/dev/null", O_RDONLY|O_LARGEFILE|O_CLOEXEC <unfinished ...>
[pid   412] close(5 <unfinished ...>
[pid  2220] <... openat resumed> )      = 4 <0.000137>
[pid   412] <... close resumed> )       = 0 <0.000136>
[pid   412] openat(AT_FDCWD, "./display_images/Crypto.ppm", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 5 <0.000099>
[pid   412] read(5, "P6\n569 32\n255\n\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096 <0.000046>
[pid  2220] write(1, "options ['Forex']\n", 18 <unfinished ...>
[pid   412] read(5, "P6\n569 32\n255\n\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096 <0.000079>
[pid   412] read(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 61440) = 50542 <0.000322>
[pid   412] read(5, "", 8192)           = 0 <0.000067>
[pid   412] close(5)                    = 0 <0.000048>
[pid  2220] <... write resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set) <9.479316>
[pid  2220] --- SIGTERM {si_signo=SIGTERM, si_code=SI_USER, si_pid=412, si_uid=1} ---
[pid  2220] +++ killed by SIGTERM +++
[pid   412] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_KILLED, si_pid=2220, si_uid=1, si_status=SIGTERM, si_utime=0, si_stime=1} ---
[pid   412] close(4)                    = 0 <0.000218> 

It should be hang in the following line,它应该挂在下面一行,

[pid 2220] write(1, "options ['Forex']\n", 18 <unfinished ...>

you can lookup use cat /proc/<pid>/stack to it callstack, check it a few times to see if it has changed, if it doesn't change, it means it's hanging你可以使用cat /proc/<pid>/stack来查找它的callstack,检查几次看它是否有变化,如果没有变化,说明它挂了

fd = 1 -> /dev/pts/0, the stdout is the remote terminal, it is recommended that you redirect to the file when running python, eg: fd = 1 -> /dev/pts/0,stdout为远程终端,建议运行python时重定向到该文件,eg:

python test.py >test.log 2>&1 0>/dev/null
Example of running to the background and redirect运行到后台并重定向的示例
root@dev8:~/tmp# cat test.py
import time
import sys

while True:
    print(time.time())
    sys.stdout.flush()
    time.sleep(1)
    
root@dev8:~/tmp# python test.py >test.log 2>&1 0>/dev/null &
[1] 4203

root@dev8:~/tmp# ls -l /proc/4203/fd
total 0
l-wx------ 1 root root 64 Apr  6 19:16 0 -> /dev/null
l-wx------ 1 root root 64 Apr  6 19:16 1 -> /root/tmp/test.log
l-wx------ 1 root root 64 Apr  6 19:16 2 -> /root/tmp/test.log
root@dev8:~/tmp# head test.log
1649243765.043044
1649243766.0584888
1649243767.1079023

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM