Python watchdog for threads

I'm writing a simple app that reads about a million lines from a file and copies them into a list. Whenever the next line differs from the previous one, it starts a thread to do some work on that list. The thread's work is based on TCP sockets: it sends and receives commands via telnetlib.
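The batching described above can be sketched with itertools.groupby, which groups runs of consecutive items that share a key. This is only an illustration: the helper name batches_by_first_char is hypothetical, and it assumes (as the pseudocode in this question does) that two lines "differ" when their first characters differ.

```python
from itertools import groupby

def batches_by_first_char(lines):
    """Yield (key, batch) pairs for runs of consecutive lines
    that share the same first character."""
    # line[:1] (rather than line[0]) also tolerates empty strings
    for key, group in groupby(lines, key=lambda line: line[:1]):
        yield key, list(group)

# Small in-memory sample instead of the real input file
sample = ["a1\n", "a2\n", "b1\n", "a3\n"]
result = list(batches_by_first_char(sample))
```

Each yielded batch corresponds to one unit of work that would be handed to a thread. Note that a later run with the same key ("a3" here) forms its own batch, which matches the consecutive comparison in the question.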

Sometimes my application hangs and does nothing. I have wrapped all telnet operations in try-except statements, and reads from and writes to the sockets have timeouts.
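For a hang like this, one option on Python 3.3+ is the standard faulthandler module, which can dump the traceback of every thread after a timeout and so shows where each thread is stuck. The sketch below is an assumption about how it could be wired in, not part of the original program; the helper names arm_hang_dump and disarm_hang_dump are hypothetical.

```python
import faulthandler
import sys

def arm_hang_dump(timeout_s, file=sys.stderr, kill=False):
    """Dump every thread's traceback to `file` unless disarmed within timeout_s.

    With kill=True the process is also terminated after the dump.
    `file` must be a real file (it needs a file descriptor) and must stay open.
    """
    faulthandler.dump_traceback_later(timeout_s, repeat=False, file=file, exit=kill)

def disarm_hang_dump():
    """Cancel the pending dump, e.g. once all work has finished."""
    faulthandler.cancel_dump_traceback_later()
```

Calling arm_hang_dump(600, kill=True) at startup and disarm_hang_dump() after the last thread finishes would act as a crude whole-program watchdog that also records where the threads were blocked.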

I thought about writing a watchdog that would call sys.exit() or something similar when that hang condition occurs, but I still have no idea how to build it. If you can point me in the right direction, that would be great.

For that file I'm creating 40 threads. The pseudocode looks like this:

import sys
import threading
import time

MAX_THREADS = 40
lock = threading.Lock()
no_of_jobs = 0

class DoJob(threading.Thread):
    def __init__(self, cond, work):
        threading.Thread.__init__(self)
        self.cond = cond
        self.work = work

    def run(self):
        global no_of_jobs
        with lock:
            no_of_jobs += 1
        try:
            pass  # do some job (...)
        finally:
            # on error or when finished, decrement no_of_jobs under lock
            with lock:
                no_of_jobs -= 1

# main:
# starting conditions: take the first character of the first line
with open(sys.argv[1]) as targetsfile:
    prev_cond = next(targetsfile)[0]
work = []

with open(sys.argv[1], "r") as targetsfile:
    for line in targetsfile:
        cond = line[0]
        if prev_cond != cond:
            # wait until a thread slot is free
            while no_of_jobs >= MAX_THREADS:
                time.sleep(1)

            # work holds the lines collected for prev_cond
            DoJob(prev_cond, work).start()
            prev_cond = cond
            work = []
        work.append(line)

# last job:
DoJob(cond, work).start()

# wait for all worker threads to finish
while threading.active_count() > 1:
    time.sleep(1)

best regards J

I have successfully used code like the following in the past (from a Python 3 program I wrote):

import os
import threading

def die():
    # This runs in the timer thread. sys.exit() here would end only this
    # thread, and Python has no supported way to stop other threads,
    # so terminate the whole process immediately instead.
    print('ran for too long. quitting.', flush=True)
    os._exit(1)


if __name__ == '__main__':
    # bunch of app-specific code...

    # set up max runtime
    watchdog = threading.Timer(2.0, die)  # quit after 2 seconds
    watchdog.daemon = True
    watchdog.start()

    # after work is done
    watchdog.cancel()
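The timer above enforces a fixed maximum runtime. To react to a hang instead, a watchdog can be reset ("kicked") whenever progress is made and fire only when no kick arrives for a while. The following is a minimal sketch under that assumption; the Watchdog class and its kick/stop methods are hypothetical names, not an existing library API.

```python
import os
import threading
import time

class Watchdog:
    """Calls on_timeout if kick() is not called for timeout_s seconds."""

    def __init__(self, timeout_s, on_timeout=None):
        self.timeout_s = timeout_s
        # Default action: hard-exit the process, because sys.exit() from a
        # non-main thread would only end that thread.
        self.on_timeout = on_timeout or (lambda: os._exit(1))
        self._last_kick = time.monotonic()
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._watch, daemon=True)

    def start(self):
        self.kick()
        self._thread.start()

    def kick(self):
        """Record that progress was made just now."""
        with self._lock:
            self._last_kick = time.monotonic()

    def stop(self):
        """Disable the watchdog, e.g. after all work is done."""
        self._stop.set()

    def _watch(self):
        # Event.wait() returns True once stop() has been called,
        # which ends the loop without firing.
        while not self._stop.wait(self.timeout_s / 4):
            with self._lock:
                idle = time.monotonic() - self._last_kick
            if idle > self.timeout_s:
                self.on_timeout()
                return
```

In the program from the question, the main loop could create Watchdog(300) once, and each worker could call kick() after every successful telnet exchange. A single shared watchdog only detects the case where all threads stop making progress; detecting one stuck worker among many would need one timestamp per thread.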
