
Python MultiThreading Queue suddenly stops doing anything

I have a file that contains 600K+ lines of stuff I want to process.
So I use multithreading to speed up the process.
The problem is that if, for example, I use 50 as the number of threads, the script processes 50 lines and then does nothing else. It doesn't terminate, and it doesn't print anything further.

This is my code for reference:

#!/usr/bin/env python

from __future__ import print_function
import re
import sys
from Queue import *
from threading import Thread, Lock

#script parameters
if len(sys.argv) != 3:  # the program name and the two arguments
    # stop the program and print an error message
    sys.exit("Usage: python " + sys.argv[0] + " filename maxthreads")

accountlist = sys.argv[1]
maxthreads = int(sys.argv[2])

def dojob(email, password):
    #here is some job to process all my users data
    pass  #placeholder body so the script runs; the real per-user work goes here
    #end dojob

#this function will process the items in the queue, in serial
def processor():
    if queue.empty() == True:
        print ("the Queue is empty!")
        sys.exit(1)
    try:
        job = queue.get()
        job = job.strip('\r\n')

        newdata = job.split(':')

        email = newdata[0]
        password = newdata[1]

        #pass to dojob and process
        print("Processing:", email)

        dojob(email, password)

        queue.task_done()

    except:
        print ("Failed to operate on job")

#set variables
queue = Queue()
threads = maxthreads

#read the job items from a file; this could be more advanced, like reading from a database
jobs = open(accountlist)

#iterate over jobs and put each into the queue in sequence
for job in jobs:
    print ("inserting job into the queue:", job)
    queue.put(job)

#start some threads, each one will process one job from the queue
for i in range(threads):
    th = Thread(target=processor)
    th.setDaemon(True)
    th.start()

#wait until all jobs are processed before quitting
queue.join() 

Any ideas why it just stops processing?

Sample output:

 #for example, the thread count is 2
 inserting job into the queue: user@domain.com
 inserting job into the queue: user2@domain.com
 inserting job into the queue: another@domain.com
 (...until the end of the file...)
 #once everything has been added to the queue, it starts processing.
 processing: user@domain.com
 processing: user2@domain.com
 #then the problem occurs: it doesn't do anything else.
 #it doesn't continue to the next queued job.

It sounds like you need a loop inside processor():

def processor():
    while not queue.empty():
        try:
            job = queue.get()
            ...

Otherwise, every thread processes one job and stops.
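
For reference, here is a minimal sketch of the looped processor(), reusing the queue and dojob names from the question. The finally: queue.task_done() is an extra safeguard beyond the question's code, so that a job that raises an exception still gets marked done and queue.join() can return. The while not queue.empty() test is safe in this script because every job is put on the queue before the threads start (it can race when producers are still adding jobs while consumers run):

def processor():
    #loop until the pre-filled queue is drained
    while not queue.empty():
        try:
            job = queue.get()
            job = job.strip('\r\n')

            newdata = job.split(':')
            email = newdata[0]
            password = newdata[1]

            print("Processing:", email)
            dojob(email, password)
        except Exception:
            print("Failed to operate on job")
        finally:
            #mark the job done on every path, or queue.join() never returns
            queue.task_done()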

"I use multithreading to speed up the process."

Depending on the nature of the processing, you may or may not get any speedup from using multiple threads. This has to do with the Global Interpreter Lock (GIL). If you find that you're not getting any speedup due to the GIL, you might want to consider using the multiprocessing module.
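
If it comes to that, a minimal multiprocessing sketch under the same assumptions as the question (lines in email:password form; dojob, accountlist, and maxthreads defined at module level as in your script) could look like this, with Pool and imap_unordered replacing the queue-and-threads setup entirely:

from multiprocessing import Pool

def worker(line):
    #runs in a separate process, so CPU-bound work escapes the GIL
    email, password = line.strip('\r\n').split(':', 1)
    dojob(email, password)
    return email

if __name__ == '__main__':
    pool = Pool(processes=maxthreads)
    with open(accountlist) as jobs:
        #chunksize batches lines to cut inter-process overhead on 600K+ items
        for email in pool.imap_unordered(worker, jobs, chunksize=100):
            print("Processed:", email)
    pool.close()
    pool.join()

If dojob is mostly network or disk I/O rather than CPU work, though, threads usually perform fine despite the GIL, and the loop fix above is all you need.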
