Python Multiprocessing - Too Slow

I have built a multiprocessing password cracker (using a wordlist) for a specific function; it halved the time needed compared to using a single process.

The original problem was that it would show you the cracked password and terminate that worker, but the remaining workers would carry on until they ran out of words to hash, which is not ideal.

My next step was to use Manager.Event() to terminate the remaining workers. This works as I had hoped (after some trial and error), but the application now takes far longer than it would as a single process. I'm sure this must be due to the if statement inside pwd_find(), but I thought I would seek some advice.

#!/usr/bin/env python

import hashlib, os, time, math
from hashlib import md5
from multiprocessing import Pool, cpu_count, Manager

def screen_clear(): # Small function for clearing the screen on Unix or Windows
    if os.name == 'nt':
        return os.system('cls')
    else:
        return os.system('clear')

cores = cpu_count() # Var containing number of cores (Threads)

screen_clear()

print ""
print "Welcome to the Technicolor md5 cracker"
print ""

user = raw_input("Username: ")
print ""
nonce = raw_input("Nonce: ")
print ""
hash = raw_input("Hash: ")
print ""
file = raw_input("Wordlist: ")
screen_clear()
print "Cracking the password for \"" + user + "\" using " 
time1 = time.time() # Begins the 'Clock' for timing

realm = "Technicolor Gateway" # These 3 variables dont appear to change
qop = "auth"
uri = "/login.lp"

HA2 = md5("GET" + ":" + uri).hexdigest() # This hash doesn't contain any changing variables so doesn't need to be recalculated

file = open(file, 'r') # Opens the wordlist file
wordlist = file.readlines() # This enables us to use len()
length = len(wordlist)

screen_clear()
print "Cracking the password for \"" + user + "\" using " + str(length) + " words"

break_points = []  # List that will have start and stopping points
for i in range(cores):  # Creates start and stopping points based on length of word list
    break_points.append({"start":int(math.ceil((length+0.0)/cores * i)), "stop":int(math.ceil((length+0.0)/cores * (i + 1)))})

def pwd_find(start, stop, event):
    for number in range(start, stop):
        if not event.is_set():
            word = (wordlist[number])
            pwd = word.replace("\n","") # Removes newline character
            HA1 = md5(user + ":" + realm + ":" + pwd).hexdigest()
            hidepw = md5(HA1 + ":" + nonce +":" + "00000001" + ":" + "xyz" + ":" + qop + ":" + HA2).hexdigest()
            if hidepw == hash:
                screen_clear()
                time2 = time.time() # stops the 'Clock'
                timetotal = math.ceil(time2 - time1) # Calculates the time taken
                print "\"" + pwd + "\"" + " = " + hidepw + " (in " + str(timetotal) + " seconds)"
                print ""
                event.set()
                p.terminate
                p.join
        else:
            p.terminate
            p.join

if __name__ == '__main__':  # Added this because the multiprocessor module sometimes acts funny without it.

    p = Pool(cores)  # Number of processes to create.
    m = Manager()
    event = m.Event()
    for i in break_points:  # Cycles though the breakpoints list created above.
        i['event'] = event
        a = p.apply_async(pwd_find, kwds=i, args=tuple())  # This will start the separate processes.
    p.close() # Prevents any more processes being started
    p.join() # Waits for worker process to end

if event.is_set():
    end = raw_input("hit enter to exit")
    file.close() # Closes the wordlist file
    screen_clear()
    exit()
else:
    screen_clear()
    time2 = time.time() # Stops the 'Clock'
    totaltime = math.ceil(time2 - time1) # Calculates the time taken
    print "Sorry your password was not found (in " + str(totaltime) + " seconds) out of " + str(length) + " words"
    print ""
    end = raw_input("hit enter to exit")
    file.close() # Closes the wordlist file
    screen_clear()
    exit()

Edit (for @noxdafox):

def finisher(answer):
    if answer:
        p.terminate()
        p.join()
        end = raw_input("hit enter to exit")
        file.close() # Closes the wordlist file
        screen_clear()
        exit()

def pwd_find(start, stop):
    for number in range(start, stop):
        word = (wordlist[number])
        pwd = word.replace("\n","") # Removes newline character
        HA1 = md5(user + ":" + realm + ":" + pwd).hexdigest()
        hidepw = md5(HA1 + ":" + nonce +":" + "00000001" + ":" + "xyz" + ":" + qop + ":" + HA2).hexdigest()
        if hidepw == hash:
            screen_clear()
            time2 = time.time() # stops the 'Clock'
            timetotal = math.ceil(time2 - time1) # Calculates the time taken
            print "\"" + pwd + "\"" + " = " + hidepw + " (in " + str(timetotal) + " seconds)"
            print ""
            return True
        elif hidepw != hash:
            return False

if __name__ == '__main__':  # Added this because the multiprocessor module sometimes acts funny without it.

    p = Pool(cores)  # Number of processes to create.
    for i in break_points:  # Cycles though the breakpoints list created above.
        a = p.apply_async(pwd_find, kwds=i, args=tuple(), callback=finisher)  # This will start the separate processes.
    p.close() # Prevents any more processes being started
    p.join() # Waits for worker process to end

I think your hunch is correct. You are checking a synchronization primitive inside a fast loop, and because a Manager event is a proxy, every is_set() call is a round trip to the manager process. I would check whether the event is set only every so often. You can experiment to find the sweet spot: often enough that the workers don't do too much wasted work after the password is found, but not so often that the checks slow the program down.
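A minimal sketch of that idea follows. It is written for Python 3 (the script in the question is Python 2), and it hashes each word with plain MD5 instead of the router's digest scheme to keep the example short. The names `md5_hex` and `check_every`, the generated word list, and the value 1000 are assumptions made for illustration, not part of the original code.

```python
import hashlib
from multiprocessing import Manager, Pool


def md5_hex(text):
    """Return the hex MD5 digest of a text string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def pwd_find(words, target, event, check_every=1000):
    """Return the word whose MD5 equals target, or None.

    The event is polled only once every `check_every` words, so the cost
    of talking to the Manager process is spread over many hashes.
    """
    for index, word in enumerate(words):
        if index % check_every == 0 and event.is_set():
            return None  # another worker already found the password
        if md5_hex(word) == target:
            event.set()  # tell the other workers to stop
            return word
    return None  # chunk exhausted without a match


if __name__ == "__main__":
    # Hypothetical word list, used only for this demonstration.
    words = ["word%d" % n for n in range(20000)]
    target = md5_hex("word15000")
    cores = 2
    size = len(words) // cores
    chunks = [words[i * size:(i + 1) * size] for i in range(cores)]

    manager = Manager()
    event = manager.Event()
    pool = Pool(cores)
    results = [pool.apply_async(pwd_find, (chunk, target, event))
               for chunk in chunks]
    pool.close()
    pool.join()
    print([r.get() for r in results if r.get() is not None])
```

A larger `check_every` means fewer round trips to the manager but more hashes wasted after another worker has already succeeded, so the right value is something to measure rather than guess.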

You can use the Pool primitives to solve your problem. You don't need to share an Event object, whose access is synchronised and slow.

Here is an example of how to terminate a Pool once a worker returns the desired result.

You can simply signal the Pool by returning a specific value from the worker and then terminate the pool within a callback.
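The callback approach could be sketched roughly as below. This is an illustration under stated assumptions, not the answerer's original code: it targets Python 3, it uses plain MD5 rather than the digest scheme from the question, and `md5_hex`, `found`, and the generated word list are hypothetical names introduced for the example. Each worker returns the matching word, or None once its whole chunk has been tried, and the callback (which runs in the parent process) terminates the pool on the first real answer.

```python
import hashlib
from multiprocessing import Pool


def md5_hex(text):
    """Return the hex MD5 digest of a text string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def pwd_find(words, target):
    """Return the matching word, or None after every word has been tried."""
    for word in words:
        if md5_hex(word) == target:
            return word
    return None


if __name__ == "__main__":
    # Hypothetical word list, used only for this demonstration.
    words = ["word%d" % n for n in range(20000)]
    target = md5_hex("word15000")
    cores = 2
    size = len(words) // cores
    chunks = [words[i * size:(i + 1) * size] for i in range(cores)]

    found = []
    pool = Pool(cores)

    def finisher(answer):
        # Called in the parent process each time a worker returns.
        if answer is not None:
            found.append(answer)
            pool.terminate()  # stop the workers that are still hashing

    for chunk in chunks:
        pool.apply_async(pwd_find, (chunk, target), callback=finisher)
    pool.close()
    pool.join()
    print(found)
```

Because no shared object is touched inside the hashing loop, the workers run at full speed, and the only inter-process traffic is the single return value per chunk.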
