
Create a process from a function that will run in parallel in Python

I have a function that executes a SELECT SQL query (using PostgreSQL). Now I want to INSERT the execution time of that query into some table in my DB. However, I want to do it in parallel, so that even if my INSERT query is still running I can continue my program and call other functions.

I tried to use multiprocessing.Process, however my function waits for the process to finish, so I'm actually losing the parallelism I wanted.

My code in a nutshell:

def select_func():
    with connection.cursor() as cursor:
        query = "SELECT * FROM myTable WHERE \"UserName\" = 'Alice'"
        start = time.time()
        cursor.execute(query)
        end = time.time()
        process = Process(target = insert_func, args = (query, (end-start)))
        process.start()
        process.join()
        return cursor.fetchall()
        
def insert_func(query, time):
    with connection.cursor() as cursor:
        query = "INSERT INTO infoTable (\"query\", \"exec_time\")
                VALUES (\"" + query  + "\", \"" + time + "\")"
        cursor.execute(query)
        connection.commit()

Now the problem is that this operation is not really asynchronous, since select_func waits until insert_func has finished. I want the execution of these functions to be independent of each other, so that select_func can return even though insert_func is still running and I can continue and call other functions in my script.

Thanks!

There are quite a lot of issues with your code snippet, but let's at least try to give you a structure to implement.

def select_func():
    with connection.cursor() as cursor: #I don't think the same global connection variable should be used for reads and writes simultaneously
        query = "SELECT * FROM myTable WHERE \"UserName\" = 'Alice'" #quotation issues
        start = time.time()
        cursor.execute(query)
        end = time.time()
        process = Process(target = insert_func, args = (query, (end-start)))
        process.start() #you start the process here BUT
        process.join() #you force Python to wait for it here...
        return cursor.fetchall()
        
def insert_func(query, time):
    with connection.cursor() as cursor:
        query = "INSERT INTO infoTable (\"query\", \"exec_time\")
                VALUES (\"" + query  + "\", \"" + time + "\")"
        cursor.execute(query)
        connection.commit()
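
To make the blocking concrete, here is a minimal, self-contained sketch (slow_insert and the sleep are stand-ins I made up for illustration) showing that start() returns immediately while join() is what forces the parent to wait:

import time
from multiprocessing import Process

def slow_insert(label): #stand-in for your insert_func
    time.sleep(2) #pretend the INSERT takes 2 seconds
    print(label, "finished")

if __name__ == '__main__':
    p = Process(target=slow_insert, args=("background insert",))
    p.start()   #returns immediately; the child keeps running in parallel
    print("parent keeps going")  #printed right away, well before the child finishes
    p.join()    #only this call makes the parent block until the child is done

So the quick fix to your snippet is simply not to join() inside select_func: keep the Process handle around and join() it later (e.g. at the end of your script), or hand the work to a long-lived worker process as below.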

Consider an alternative:

import time
import sqlite3 as sql #placeholder DB module; swap in your own driver/connection
from multiprocessing import Process, Queue

def select_func():
    read_con = sql.connect("db") #sqlite syntax but use your connection
    with read_con.cursor() as cursor:
        query = "SELECT * FROM myTable WHERE \"UserName\" = 'Alice'" #where does Alice come from? 
        start = time.time()
        cursor.execute(query)
        end = time.time()
        return cursor.fetchall(),(query,(end-start)) #Our tuple has query at position 0 and time at position 1


def insert_function(insert_queue): #The insert you want to parallelize

    connection = sql.connect("db") #initialize your 'writer'. Note: it may be better to open a fresh connection for each insert; I'm not sure which is optimal.
    while True: #We keep pulling from the pipe
        data = insert_queue.get() # we pull from our pipe
        if data == 'STOP': #Example of a kill instruction to stop our process
            break #breaks the while loop and the function can 'exit'
         
        with connection.cursor() as cursor:
            query_data = data #I assume you would want to pass your query through the pipe
            query= query_data[0] #see how we stored the tuple
            time = query_data[1] #as above
            insert_query = "INSERT INTO infoTable (\"query\", \"exec_time\")
                VALUES (\"" + query  + "\", \"" + time + "\")" #Somehow query and time goes into the insert_query
            cursor.execute(insert_query)
        connection.commit()
            
        
if __name__ == '__main__': #Typical Python entry-point guard

    query_pipe = Queue() #we initialize a Queue here to feed into your inserting function
    process = Process(target = insert_function, args = (query_pipe,))
    process.start() 
    stuff = []
    for i in range(5):
        data, insert_query = select_func() #select_func returns the rows plus the (query, time) tuple we want to insert
        stuff.append(data)
        query_pipe.put(insert_query)
    #
    #Do other stuff and even put more stuff into the pipe.
    #
    query_pipe.put('STOP') #we wanna kill our process so we send the stop command
    process.join()
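
As a side note on the quotation issues flagged above: rather than building the INSERT string by concatenation, you can let the driver do the quoting with a parameterized query. A minimal sketch, assuming psycopg2 since you mentioned PostgreSQL (insert_exec_time is just an illustrative name; the table and column names are taken from your snippet):

def insert_exec_time(connection, query, exec_time):
    # %s placeholders let psycopg2 escape the values itself, so quotes inside
    # the logged query text can't break the INSERT (or inject SQL)
    with connection.cursor() as cursor:
        cursor.execute(
            'INSERT INTO infoTable ("query", "exec_time") VALUES (%s, %s)',
            (query, exec_time),
        )
    connection.commit()

You can call something like this from insert_function in place of the string concatenation above.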
