
Creating a threaded Python queue?

How would I go about creating a queue to run tasks in the background in Python?

I have tried asyncio.Queue(), but whenever I use Queue.put(task), it immediately starts the task.

It is for an application which receives an unknown number of entries (filenames) from a database at a specified time interval. What I want to accomplish with this background queue is that the Python application keeps running and keeps returning new filenames. Every time the application finds new filenames, it should handle them by creating a task, which would contain (method(variables)). These tasks should all be thrown into an ever-expanding queue which runs the tasks on its own. Here's the code.

import asyncio
import datetime
import time as t

import mysql.connector
from mysql.connector import errorcode


class DatabaseHandler:
    def __init__(self):
        try:
            self.cnx = mysql.connector.connect(user='root', password='', host='127.0.0.1', database='mydb')
            self.cnx.autocommit = True
            self.q = asyncio.Queue()
        except mysql.connector.Error as err:
            if err.errno == errorcode.ER_ACCESS_DENIED_ERROR:
                print("Something is wrong with your user name or password")
            elif err.errno == errorcode.ER_BAD_DB_ERROR:
                print("Database does not exist")
            else:
                print(err)
        self.get_new_entries(30.0)

    def get_new_entries(self, delay):
        start_time = t.time()
        while True:
            current_time = datetime.datetime.now() - datetime.timedelta(seconds=delay)
            current_time = current_time.strftime("%Y-%m-%d %H:%M:%S")
            data = current_time
            print(current_time)
            self.select_latest_entries(data)
            print("###################")
            t.sleep(delay - ((t.time() - start_time) % delay))

    def select_latest_entries(self, input_data):
        query = """SELECT FILE_NAME FROM `added_files` WHERE CREATION_TIME > %s"""
        cursor = self.cnx.cursor()
        cursor.execute(query, (input_data,))
        for file_name in cursor.fetchall():
            file_name_string = ''.join(file_name)
            self.q.put(self.handle_new_file_names(file_name_string))
        cursor.close()

    def handle_new_file_names(self, filename):
        create_new_npy_files(filename)  # defined elsewhere; time-consuming
        self.update_entry(filename)

    def update_entry(self, filename):
        print(filename)
        query = """UPDATE `added_files` SET NPY_CREATED_AT=NOW(), DELETED=1 WHERE FILE_NAME=%s"""
        update_cursor = self.cnx.cursor()
        update_cursor.execute(query, (filename,))
        self.cnx.commit()
        update_cursor.close()

As I said, this will instantly run the task.

create_new_npy_files is a pretty time-consuming method in a static class.

There are two problems with this expression:

self.q.put(self.handle_new_file_names(file_name_string))

First, it is actually calling the handle_new_file_names method and enqueueing its result. This is not specific to asyncio.Queue; it is how function calls work in Python (and most mainstream languages). The above is equivalent to:

_tmp = self.handle_new_file_names(file_name_string)
self.q.put(_tmp)

The second problem is that asyncio.Queue operations like get and put are coroutines, so you must await them.

If you want to enqueue a callable, you can use a lambda:

await self.q.put(lambda: self.handle_new_file_names(file_name_string))

But since the consumer of the queue is under your control, you can simply enqueue the file names, as suggested by @dirn:

await self.q.put(file_name_string)

The consumer of the queue would use await self.q.get() to read the file names and call self.handle_new_file_names() on each.
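
For illustration, such a consumer could be a coroutine method on the class above. This is a minimal sketch, assuming the producer enqueues plain file name strings as shown:

async def consume(self):
    while True:
        file_name = await self.q.get()   # suspends until an item is queued
        # handle_new_file_names does blocking work (file creation, database);
        # in a real asyncio program, offload it, e.g. with run_in_executor().
        self.handle_new_file_names(file_name)
        self.q.task_done()               # mark the item as processed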

If you plan to use asyncio, consider reading a tutorial that covers the basics, and switching to an asyncio-compliant database connector, so that the database queries play along with the asyncio event loop.
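
To make that second suggestion concrete, here is a minimal, hypothetical sketch of the polling side using aiomysql (one asyncio-compliant connector); the table and query come from the question above, while the function name and connection details are assumptions:

import asyncio
import datetime

import aiomysql  # one asyncio-compliant MySQL driver; others work similarly

async def poll_new_entries(q: asyncio.Queue, delay: float):
    # aiomysql mirrors mysql.connector's API, but every operation is a
    # coroutine, so queries no longer block the event loop.
    conn = await aiomysql.connect(host='127.0.0.1', user='root',
                                  password='', db='mydb', autocommit=True)
    while True:
        cutoff = datetime.datetime.now() - datetime.timedelta(seconds=delay)
        async with conn.cursor() as cur:
            await cur.execute(
                "SELECT FILE_NAME FROM `added_files` WHERE CREATION_TIME > %s",
                (cutoff.strftime("%Y-%m-%d %H:%M:%S"),))
            for (file_name,) in await cur.fetchall():
                await q.put(file_name)  # enqueue the name itself, not a call
        await asyncio.sleep(delay)      # non-blocking sleep; consumers keep running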

For people who see this in the future: the answer I marked as accepted explains how to solve the problem. Below is some code which I used to create what I wanted, that is, tasks that run in the background. Here you go.

from multiprocessing import Queue
import threading

class ThisClass:
    def __init__(self):
        self.q = Queue()
        self.worker = threading.Thread(target=self._consume_queue)
        self.worker.start()
        self.run()

The queue created is not a queue for tasks, but for the variables you want to handle.

def run(self):
    for i in range(100):
        self.q.put(i)

Then there is _consume_queue(), which consumes the items in the queue as they arrive:

def _consume_queue(self):
    while True:
        number = self.q.get()
        # the logic you want to use per number.

Note that self.q.get() blocks until a new entry arrives, so the worker simply waits when the queue is empty.

The (simplified) code above works for me; I hope it will also work for others.
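
Putting the pieces together, here is one way the whole thing might look as a runnable script. This is a sketch, not the exact original: it swaps in queue.Queue (the usual choice when only threads are involved) and adds a None sentinel plus a join() so the worker shuts down cleanly, neither of which is in the code above:

import threading
from queue import Queue   # queue.Queue is the idiomatic thread-safe queue

class ThisClass:
    def __init__(self):
        self.q = Queue()
        self.worker = threading.Thread(target=self._consume_queue)
        self.worker.start()
        self.run()
        self.q.put(None)       # sentinel: tells the worker to stop
        self.worker.join()     # wait for the worker to drain the queue

    def run(self):
        for i in range(100):
            self.q.put(i)

    def _consume_queue(self):
        while True:
            number = self.q.get()   # blocks until an item is available
            if number is None:      # sentinel received: shut down
                break
            print(number)           # the per-item logic goes here

if __name__ == '__main__':
    ThisClass()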
