
Creating a threaded python queue?

How would I go about creating a queue to run tasks in the background in Python?

I have tried asyncio.Queue(), but whenever I use Queue.put(task) it immediately starts the task.

It is for an application which receives an unknown number of entries (filenames) from a database at a specified time interval. What I wish to accomplish with this background queue is that the Python application keeps running and keeps returning new filenames. Every time the application finds new filenames, it should handle them by creating a task, which would contain (method(variables)). These tasks should all be thrown into an ever-expanding queue which runs the tasks on its own. Here's the code.

import asyncio
import datetime
import time as t

import mysql.connector
from mysql.connector import errorcode


class DatabaseHandler:
    def __init__(self):
        try:
            self.cnx = mysql.connector.connect(user='root', password='', host='127.0.0.1', database='mydb')
            self.cnx.autocommit = True
            self.q = asyncio.Queue()
        except mysql.connector.Error as err:
            if err.errno == errorcode.ER_ACCESS_DENIED_ERROR:
                print("Something is wrong with your user name or password")
            elif err.errno == errorcode.ER_BAD_DB_ERROR:
                print("Database does not exist")
            else:
                print(err)
        self.get_new_entries(30.0)

    def get_new_entries(self, delay):
        start_time = t.time()
        while True:
            current_time = datetime.datetime.now() - datetime.timedelta(seconds=delay)
            current_time = current_time.strftime("%Y-%m-%d %H:%M:%S")
            data = current_time
            print(current_time)
            self.select_latest_entries(data)
            print("###################")
            t.sleep(delay - ((t.time() - start_time) % delay))

    def select_latest_entries(self, input_data):
        query = """SELECT FILE_NAME FROM `added_files` WHERE CREATION_TIME > %s"""
        cursor = self.cnx.cursor()
        cursor.execute(query, (input_data,))
        for file_name in cursor.fetchall():
            file_name_string = ''.join(file_name)
            self.q.put(self.handle_new_file_names(file_name_string))
        cursor.close()

    def handle_new_file_names(self, filename):
        create_new_npy_files(filename)
        self.update_entry(filename)

    def update_entry(self, filename):
        print(filename)
        query = """UPDATE `added_files` SET NPY_CREATED_AT=NOW(), DELETED=1 WHERE FILE_NAME=%s"""
        update_cursor = self.cnx.cursor()
        self.cnx.commit()
        update_cursor.execute(query, (filename,))
        update_cursor.close()

As I said, this will instantly run the task.

create_new_npy_files is a pretty time-consuming method in a static class.

There are two problems with this expression:

self.q.put(self.handle_new_file_names(file_name_string))

First, it is actually calling the handle_new_file_names method and enqueueing its result. This is not specific to asyncio.Queue; it is how function calls work in Python (and most mainstream languages). The above is equivalent to:

_tmp = self.handle_new_file_names(file_name_string)
self.q.put(_tmp)
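The evaluation order is easy to observe with a plain queue.Queue (the handle function below is a hypothetical stand-in for handle_new_file_names, which, like most procedures, returns None):

```python
import queue

def handle(name):
    # stand-in for the slow task; like most procedures it returns None
    return None

q = queue.Queue()
q.put(handle("a.npy"))  # handle() runs immediately; its return value is enqueued
item = q.get()
print(item)             # None -- the *result* was queued, not the call itself
```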

The second problem is that asyncio.Queue operations like get and put are coroutines, so you must await them.

If you want to enqueue a callable, you can use a lambda:

await self.q.put(lambda: self.handle_new_file_names(file_name_string))
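With a callable enqueued, the consumer decides when the work actually runs by invoking what it dequeues. A minimal sketch with a plain queue.Queue (the names below are illustrative, not from the original code):

```python
import queue

q = queue.Queue()
name = "a.npy"
q.put(lambda: f"handled {name}")  # enqueue the deferred call, not its result
task = q.get()                    # nothing has run yet
result = task()                   # the consumer triggers the work here
print(result)                     # handled a.npy
```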

But since the consumer of the queue is under your control, you can simply enqueue the file names, as suggested by @dirn:

await self.q.put(file_name_string)

The consumer of the queue would use await self.q.get() to read the file names and call self.handle_new_file_names() on each one.
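Putting the producer and consumer together, a minimal runnable sketch of that pattern might look like this (consume and main are hypothetical names; the list append stands in for handle_new_file_names, and None is used as a stop sentinel):

```python
import asyncio

async def consume(q, handled):
    while True:
        name = await q.get()   # suspends until an item is available
        if name is None:       # sentinel tells the worker to stop
            break
        handled.append(name)   # stands in for self.handle_new_file_names(name)

async def main():
    q = asyncio.Queue()
    handled = []
    worker = asyncio.create_task(consume(q, handled))
    for name in ("a.npy", "b.npy"):   # hypothetical file names
        await q.put(name)             # enqueue the plain string, not a call
    await q.put(None)
    await worker
    return handled

print(asyncio.run(main()))  # ['a.npy', 'b.npy']
```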

If you plan to use asyncio, consider reading a tutorial that covers the basics, and switch to an asyncio-compliant database connector (such as aiomysql), so that the database queries cooperate with the asyncio event loop.

For people who see this in the future: the answer I marked as accepted explains how to solve the problem. I'll write down some of the code I used to create what I wanted, that is, tasks that run in the background. Here you go.

from multiprocessing import Queue
import threading

class ThisClass:
    def __init__(self):
        self.q = Queue()
        self.worker = threading.Thread(target=self._consume_queue)
        self.worker.start()
        self.run()

The queue created is not a queue of tasks, but of the variables you want to handle.

def run(self):
    for i in range(100):
        self.q.put(i)

Then there is _consume_queue(), which consumes the items in the queue whenever there are any:

def _consume_queue(self):
    while True:
        number = self.q.get()
        # the logic you want to use per number.

self.q.get() blocks and waits for new entries, even when there are none yet.

The simplified code above works for me; I hope it will also work for others.
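For reference, the snippets above can be combined into one runnable sketch. This version makes two assumptions not in the original: it uses the stdlib queue.Queue (the natural thread-safe choice when the consumer is a thread, rather than multiprocessing.Queue), and it adds q.join()/task_done() plus a results list so the main thread can tell when all items have been handled:

```python
import queue
import threading

class ThisClass:
    def __init__(self):
        self.q = queue.Queue()   # thread-safe queue from the stdlib
        self.results = []
        self.worker = threading.Thread(target=self._consume_queue, daemon=True)
        self.worker.start()
        self.run()

    def run(self):
        for i in range(100):
            self.q.put(i)        # enqueue plain values, not calls
        self.q.join()            # block until every item has been handled

    def _consume_queue(self):
        while True:
            number = self.q.get()         # blocks until an item arrives
            self.results.append(number)   # placeholder per-item logic
            self.q.task_done()            # lets q.join() make progress

tc = ThisClass()
print(len(tc.results))  # 100
```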
