Using Multiprocessing concurrency mechanisms in Celery tasks
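I am trying to interface with a device that can only accept a single TCP connection (memory constraints), so simply opening a connection for each worker thread is not an option, as it would be in a normal client-server situation such as a database connection.
I have tried using a Multiprocessing Manager dict that is globally accessible between threads, in the format:
clients{(address, port): (connection_obj, multiprocessing.Manager.RLock)}
And a task like so: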
from celery import shared_task
from .celery import manager, clients

# Note: Client is the device library's connection class; its import is not shown in the original post.

@shared_task
def send_command(controller, commandname, args):
    """Send a command to the controller."""
    # Create client connection if one does not exist.
    conn = None
    addr, port = controller
    if controller not in clients:
        conn = Client(addr, port)
        conn.connect()
        lock = manager.RLock()
        clients[controller] = (conn, lock,)
        print("New controller connection to %s:%s" % (addr, port,))
    else:
        conn, lock = clients[controller]
    try:
        f = getattr(conn, commandname)  # See if connection.commandname() exists.
    except Exception:
        raise Exception("command: %s not known." % (commandname))
    with lock:
        res = f(*args)
    return res
However, the task fails with a serialization error such as:
_pickle.PicklingError: Can't pickle <class '_thread.lock'>: attribute lookup lock on _thread failed
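Even though I am not calling the task with an unserializable value, and the task is not trying to return an unserializable value, Celery seems intent on serializing this global object.
What am I missing? How would you make client device connections used in Celery tasks thread-safe and accessible between threads? Sample code?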
...
self._send_bytes(ForkingPickler.dumps(obj))
File "/usr/lib64/python3.4/multiprocessing/reduction.py", line 50, in dumps
cls(buf, protocol).dump(obj)
_pickle.PicklingError: Can't pickle <class '_thread.lock'>: attribute lookup lock on _thread failed
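After scouring the internet, I realized I had probably missed something important in the traceback. Looking at it more closely, I saw that it is not Celery that is trying to pickle the connection object, but Multiprocessing.reduction. Reduction is used to serialize an object on one side and reconstitute it on the other.
I have a few alternative approaches to solve this problem. However, none of them does what I originally wanted, which is simply to borrow the Client library's connection object and use it. That is not possible with Multiprocessing and prefork.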
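The pickling step described above can be reproduced without Celery or a Manager process. The sketch below calls `ForkingPickler.dumps` directly, which is the call shown in the traceback. `FakeClient` is a hypothetical stand-in for the device client (not the real library), assumed to hold a thread lock internally, as many socket wrappers do.

```python
import pickle
import threading
from multiprocessing.reduction import ForkingPickler


class FakeClient:
    """Hypothetical stand-in for the device client; it owns a thread lock."""

    def __init__(self, addr, port):
        self.addr = addr
        self.port = port
        self._lock = threading.Lock()  # not picklable


def can_cross_process_boundary(obj):
    """Return True if multiprocessing could serialize obj for a Manager proxy."""
    try:
        ForkingPickler.dumps(obj)
        return True
    except (pickle.PicklingError, TypeError, AttributeError):
        # Older Pythons raise PicklingError, newer ones raise TypeError.
        return False
```

A plain `(address, port)` tuple passes this check, while an object holding a lock or an open socket does not, which is why storing the connection itself in a Manager dict fails.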
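How about using Redis to implement a distributed lock manager? The Redis Python client has built-in locking functionality. Also see this doc on redis.io. Even if you are using RabbitMQ or another broker, Redis is very lightweight.
For example, as a decorator: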
from functools import wraps

# Note: redisconn is assumed to be a Redis client instance created elsewhere; the original post does not show it.

def device_lock(block=True):
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            return_value = None
            have_lock = False
            lock = redisconn.lock('locks.device', timeout=2, sleep=0.01)
            try:
                have_lock = lock.acquire(blocking=block)
                if have_lock:
                    return_value = func(*args, **kwargs)
            finally:
                if have_lock:
                    lock.release()
            # Returns None if the lock could not be acquired.
            return return_value
        return wrapper
    return decorator

@shared_task
@device_lock()  # device_lock is a decorator factory, so it must be called
def send_command(controller, commandname, args):
    """Send a command to the controller."""
    ...
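Since the question involves several controllers keyed by `(address, port)`, one possible variation is a lock per device rather than a single global lock name. This is only a sketch: `device_lock_name` and `call_with_device_lock` are hypothetical helpers, and `redisconn` is assumed to be a redis-py client passed in by the caller.

```python
def device_lock_name(controller):
    """Build a per-device lock key from an (address, port) tuple."""
    addr, port = controller
    return "locks.device.%s:%s" % (addr, port)


def call_with_device_lock(redisconn, controller, func, *args, **kwargs):
    """Run func while holding the Redis lock for one controller."""
    # timeout is the lock's expiry in seconds; sleep is the polling interval.
    lock = redisconn.lock(device_lock_name(controller), timeout=2, sleep=0.01)
    if not lock.acquire(blocking=True):
        raise RuntimeError("could not lock %s:%s" % controller)
    try:
        return func(*args, **kwargs)
    finally:
        lock.release()
```

Note that in redis-py the `timeout` argument is how long the lock lives before it expires, so a device command that runs longer than that would lose the lock before `release()` is called. The value would need to be sized to the slowest expected command.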
You can also use this approach from the Celery Task Cookbook:
from celery import task
from celery.utils.log import get_task_logger
from django.core.cache import cache
from hashlib import md5
from djangofeeds.models import Feed

logger = get_task_logger(__name__)

LOCK_EXPIRE = 60 * 5  # Lock expires in 5 minutes

@task(bind=True)
def import_feed(self, feed_url):
    # The cache key consists of the task name and the MD5 digest
    # of the feed URL. On Python 3, md5() needs bytes, so encode first.
    feed_url_hexdigest = md5(feed_url.encode('utf-8')).hexdigest()
    lock_id = '{0}-lock-{1}'.format(self.name, feed_url_hexdigest)

    # cache.add fails if the key already exists
    acquire_lock = lambda: cache.add(lock_id, 'true', LOCK_EXPIRE)
    # memcache delete is very slow, but we have to use it to take
    # advantage of using add() for atomic locking
    release_lock = lambda: cache.delete(lock_id)

    logger.debug('Importing feed: %s', feed_url)
    if acquire_lock():
        try:
            feed = Feed.objects.import_feed(feed_url)
        finally:
            release_lock()
        return feed.url

    logger.debug(
        'Feed %s is already being imported by another worker', feed_url)
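The acquire/release pair in the cookbook snippet can also be wrapped in a context manager. The following is a minimal sketch, not the cookbook's own code: `cache_lock` is a hypothetical helper, and `cache` is assumed to be any object with Django-style `add(key, value, timeout)` and `delete(key)` methods.

```python
from contextlib import contextmanager


@contextmanager
def cache_lock(cache, lock_id, owner, expire=300):
    """Yield True if this caller acquired the lock, False otherwise."""
    # add() only stores the key if it is absent, so exactly one caller wins.
    acquired = cache.add(lock_id, owner, expire)
    try:
        yield acquired
    finally:
        # Only the caller that created the key may delete it.
        if acquired:
            cache.delete(lock_id)
```

A task body would then read `with cache_lock(cache, lock_id, self.request.id) as acquired:` and skip the device call when `acquired` is false.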
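Have you tried using gevent or eventlet Celery workers instead of processes and threads? In that case you would be able to use a global variable or threading.local() to share the connection object.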
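With a gevent or eventlet pool, all tasks of one worker run in a single process, so a module-level registry is enough to share one connection per device. A rough sketch under that assumption follows; `get_client`, `run_command`, and the `connect` factory are hypothetical names, and `connect(addr, port)` is assumed to return an already connected client object.

```python
import threading

_registry_lock = threading.Lock()
_clients = {}  # (address, port) -> (connection, per-device lock)


def get_client(controller, connect):
    """Return the shared (connection, lock) pair, creating it only once."""
    with _registry_lock:
        if controller not in _clients:
            _clients[controller] = (connect(*controller), threading.RLock())
        return _clients[controller]


def run_command(controller, commandname, args, connect):
    """Look up a command on the shared connection and call it under its lock."""
    conn, lock = get_client(controller, connect)
    f = getattr(conn, commandname, None)
    if f is None:
        raise ValueError("command: %s not known." % commandname)
    with lock:
        return f(*args)
```

Nothing here is pickled, so the error from the question does not arise. The trade-off is that this only holds within one worker process, so the single-connection limit still requires running exactly one such worker per device.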