
Creating separate database connection for every celery worker

I keep running into weird MySQL issues when workers execute tasks just after creation.

We use django 1.3, celery 3.1.17 and djorm-ext-pool 0.5.

We start the celery process with concurrency 3. My observation so far is that when the worker processes start, they all get the same MySQL connection. We log the DB connection id as below.

    from django.db import connection
    connection.cursor()
    logger.info("Task %s processing with db connection %s",
                str(task_id), str(connection.connection.thread_id()))

When all the workers get tasks, the first one executes successfully but the other two give weird MySQL errors. It either fails with "MySQL server has gone away", or with Django throwing a "DoesNotExist" error, even though the objects Django is querying clearly do exist.

After this error, each worker starts getting its own database connection, after which we don't see any issues.

What is the default behavior of celery? Is it designed to share the same database connection? If so, how is the inter-process communication handled? I would ideally prefer a different database connection for each worker.

I tried the code mentioned in the link below, which did not work: Celery Worker Database Connection Pooling

We have also applied the celery fix suggested here: https://github.com/celery/celery/issues/2453

For those who downvote the question, kindly let me know the reason for the downvote.

Celery is started with the command below:

    celery -A myproject worker --loglevel=debug --concurrency=3 -Q testqueue

myproject.py, as part of the master process, was making some queries to the MySQL database before forking the worker processes.

As part of the query flow in the main process, the django ORM creates a sqlalchemy connection pool if one does not already exist. The worker processes are then forked.

Celery, as part of its django fixups, closes existing connections:

    def close_database(self, **kwargs):
        if self._close_old_connections:
            return self._close_old_connections()  # Django 1.6
        if not self.db_reuse_max:
            return self._close_database()
        if self._db_recycles >= self.db_reuse_max * 2:
            self._db_recycles = 0
            self._close_database()
        self._db_recycles += 1

In effect, what could be happening is that the sqlalchemy pool object, holding one unused DB connection, gets copied to the 3 worker processes when they are forked. So the 3 different pools have 3 connection objects pointing to the same connection file descriptor.

When the workers executing tasks ask for a DB connection, they all get the same unused connection from their copy of the sqlalchemy pool, because it is marked as unused in each copy. The fact that all these connections point to the same file descriptor is what causes the "MySQL server has gone away" errors.

New connections created after that are all genuinely new and don't point to the same socket file descriptor.
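The failure mode described above can be reproduced outside of Django and MySQL: any stream opened before a fork is shared by the children through the same underlying open file description, so reads from different processes advance one shared offset and each process sees only a fragment of the stream. A minimal sketch using a regular file in place of the MySQL socket (standard library only, POSIX fork; all names here are illustrative):

```python
import os
import tempfile

# Write a small "protocol stream" to disk.
tmp = tempfile.NamedTemporaryFile(delete=False)
tmp.write(b"AABBCC")
tmp.close()

# Open it BEFORE forking, unbuffered -- like a connection made in the parent.
f = open(tmp.name, "rb", buffering=0)

r, w = os.pipe()  # used only to collect what each child actually read
pids = []
for _ in range(2):
    pid = os.fork()
    if pid == 0:
        # Child: read "its" 2-byte reply from the inherited descriptor.
        os.write(w, f.read(2))
        os._exit(0)
    pids.append(pid)

for pid in pids:
    os.waitpid(pid, 0)

out = os.read(r, 16)  # combined bytes seen by the two children
os.unlink(tmp.name)
# Each child got a DIFFERENT fragment (b"AA" and b"BB" in some order):
# both reads advanced the one shared file offset, just as two forked
# workers would consume one shared MySQL socket out from under each other.
```

With a real MySQL socket the same interleaving corrupts the wire protocol, which is why the server drops the connection and the clients report it as gone away.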

Solution:

In the main process, add

    from django.db import connection
    connection.cursor()

before any other import is done, i.e. even before the djorm-ext-pool module is added.

That way all the DB queries in the main process use the connection created by django outside the pool. When the celery django fixup closes the connection, it actually gets closed instead of going back to the alchemy pool, so the pool contains no connections at the moment it is copied over to the workers on fork. When the workers then ask for a DB connection, sqlalchemy returns one of the newly created connections.
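Concretely, the fix amounts to a two-line preamble at the very top of the entry module. A sketch only, assuming the layout from the question; the `djorm_pool` import name is an assumption, not verified against the project:

```python
# myproject.py (sketch -- module and import names taken from the question)

# Open a plain Django connection BEFORE djorm-ext-pool patches the backend;
# this connection lives outside the pool, so the pool is still empty when
# the master forks the workers.
from django.db import connection
connection.cursor()

# Only after that, import the pooling module and the rest of the project.
import djorm_pool  # assumed import name for djorm-ext-pool
```

The ordering is the whole point: any import that triggers a query before these two lines run would populate the pool in the master and reintroduce the shared descriptor.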
