
Pymongo, connection pooling and asynchronous tasks via Celery

I'm using pymongo to access MongoDB in an application that also uses Celery to perform many asynchronous tasks. I know pymongo's connection pooling does not support asynchronous workers (based on the docs).

To access collections I have a Collection class wrapping logic specific to my application. I'm trying to make sense of some code I inherited along with this wrapper:

  • Each collection currently creates its own Connection instance. Based on what I'm reading this is wrong: I should really have a single Connection instance (in settings.py or similar) and import it into my Collection instances. That much is clear. Is there a guideline on the recommended maximum number of connections? The current code surely creates a LOT of connections/sockets, since it doesn't really make use of the pooling facilities.
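The shared-instance pattern can be sketched as a lazily created, process-wide singleton. This is a minimal illustration, not your app's actual code: the URI and `maxPoolSize` value are assumptions, and in modern pymongo the old `Connection` class is `MongoClient`, which manages its own connection pool. The `factory` parameter is a hypothetical hook added here so the sketch can be exercised without a running server.

```python
# Sketch: one shared client per process instead of one per Collection.
import threading

_client = None
_lock = threading.Lock()

def get_client(factory=None):
    """Return the single shared client, creating it lazily on first use.

    `factory` is an injectable constructor (hypothetical, for testing);
    by default it builds a pymongo MongoClient, whose built-in pool
    handles concurrent use from threads.
    """
    global _client
    if _client is None:
        with _lock:  # double-checked locking: create the client only once
            if _client is None:
                if factory is None:
                    from pymongo import MongoClient  # imported lazily
                    factory = lambda: MongoClient(
                        "mongodb://localhost:27017",  # assumed URI
                        maxPoolSize=50,               # assumed pool cap
                    )
                _client = factory()
    return _client
```

Each Collection instance would then call `get_client()` (or import the module-level client) rather than constructing its own connection.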

  • However, since some of the code is called both from asynchronous Celery tasks and synchronously, I'm not sure how to handle this. My thought is to instantiate new Connection instances for the tasks and use the single shared one for the synchronous calls (calling end_request after each activity is done, of course). Is this the right direction?
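The usual rule with prefork workers is: never share a connection across a fork; give each worker process its own client, created after the fork (in Celery you would hook this into the `worker_process_init` signal from `celery.signals`). The sketch below illustrates the principle with plain `multiprocessing` standing in for Celery's prefork pool, and a tagged tuple standing in for a real pymongo client so it runs without a server; both substitutions are assumptions for illustration.

```python
# Sketch: one client per worker process, created lazily AFTER the fork.
import os
from multiprocessing import Process, Queue

_client = None  # per-process slot; each forked worker fills its own copy

def get_worker_client():
    """Create this process's client on first use.

    In a real app this would construct pymongo.MongoClient(...); a tuple
    tagged with the pid stands in here so the sketch needs no server.
    """
    global _client
    if _client is None:
        _client = ("client-for-pid", os.getpid())
    return _client

def worker(q):
    # Each worker builds its own client instead of inheriting a socket.
    q.put(get_worker_client())

if __name__ == "__main__":
    q = Queue()
    procs = [Process(target=worker, args=(q,)) for _ in range(2)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    a, b = q.get(), q.get()
    # Different worker processes ended up with different clients.
    assert a[1] != b[1]
```

Synchronous (in-process) callers can keep using the single shared client; only code that runs inside forked workers needs its own post-fork instance.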

Thanks!

Harel

From pymongo's docs: "PyMongo is thread-safe and even provides built-in connection pooling for threaded applications."

In your situation, "asynchronous" really comes down to how much "inconsistency" your application's requirements can tolerate.

Statements like "x += 1" will never be consistent in your app. If you can afford this, there is no problem. If you have "critical" operations you must implement some form of locking for synchronization.
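The "x += 1" hazard and the locking fix can be shown in a few lines. This is a generic threading illustration, not pymongo code: the lock serializes the read-modify-write so the total is always exact (with MongoDB itself you would instead push the increment to the server with an atomic update operator such as `"$inc"`).

```python
# Demo: a lock makes the read-modify-write of "counter += 1" consistent
# even with several threads hammering it concurrently.
import threading

counter = 0
lock = threading.Lock()

def safe_increment(n):
    global counter
    for _ in range(n):
        with lock:  # only one thread at a time may read+write counter
            counter += 1

threads = [threading.Thread(target=safe_increment, args=(10_000,))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert counter == 40_000  # exact: 4 threads x 10,000 increments
```

Without the `with lock:` line the interleaved reads and writes could lose updates, which is exactly the inconsistency the answer warns about.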

As for the maximum number of connections, I don't know exact figures, so test and proceed.

Also take a look at Redis and this example, if speed and memory efficiency are required. From some benchmarks I made, the Redis Python driver is at least 2x faster than pymongo for reads/writes.
