
Celery worker disconnects from broker

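I am using Python, RabbitMQ, and Celery to distribute tasks to workers. Each task takes about 15 minutes and is 99% CPU bound. My system has 24 cores, and whenever my worker runs this task I get a connection error to the broker.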

[2019-10-12 08:49:57,695: WARNING/MainProcess] consumer: Connection to broker lost. Trying to re-establish the connection...
[...]
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host

I have found several other posts about this problem, but none of them resolved it. Any idea how to fix this, especially under heavy CPU load?

Windows 10 (worker)

macOS 10.14 (RabbitMQ server)

Python 3.7

Celery 4.3.0 (rhubarb)

RabbitMQ 3.7.16 (Erlang 22.0.7)

My configuration makes the worker consume only one task at a time, and the worker process is restarted after every task, but still no luck:

CELERYD_MAX_TASKS_PER_CHILD = 1,
CELERYD_CONCURRENCY = 1,
CELERY_TASK_RESULT_EXPIRES = 3600,
CELERYD_PREFETCH_MULTIPLIER = 1,
CELERY_MAX_CACHED_RESULTS = 1,
CELERY_ACKS_LATE = True,
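
The uppercase names above are the old-style setting names; Celery 4 also documents lowercase equivalents. As a rough sketch, the mapping below shows how the same configuration would read under the newer names. The helper `to_new_style` is hypothetical and not part of Celery. It only renames keys, so it is not expected to change the disconnect behaviour by itself.

```python
# Old-style setting names mapped to the lowercase names documented for
# Celery 4. Only the six settings used in this question are covered.
OLD_TO_NEW = {
    "CELERYD_MAX_TASKS_PER_CHILD": "worker_max_tasks_per_child",
    "CELERYD_CONCURRENCY": "worker_concurrency",
    "CELERY_TASK_RESULT_EXPIRES": "result_expires",
    "CELERYD_PREFETCH_MULTIPLIER": "worker_prefetch_multiplier",
    "CELERY_MAX_CACHED_RESULTS": "result_cache_max",
    "CELERY_ACKS_LATE": "task_acks_late",
}


def to_new_style(old_settings):
    """Hypothetical helper: rename known old-style keys, keep values as-is."""
    return {OLD_TO_NEW.get(key, key): value for key, value in old_settings.items()}


# The configuration from the question, expressed as a plain dict.
old_settings = {
    "CELERYD_MAX_TASKS_PER_CHILD": 1,
    "CELERYD_CONCURRENCY": 1,
    "CELERY_TASK_RESULT_EXPIRES": 3600,
    "CELERYD_PREFETCH_MULTIPLIER": 1,
    "CELERY_MAX_CACHED_RESULTS": 1,
    "CELERY_ACKS_LATE": True,
}

new_settings = to_new_style(old_settings)
```

Assuming a Celery app object named `app`, the renamed dict could then be applied with `app.conf.update(new_settings)`. Old and new naming styles should not be mixed in the same configuration.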

Here is the full call stack:

[2019-10-12 08:49:57,695: WARNING/MainProcess] consumer: Connection to broker lost. Trying to re-establish the connection...
Traceback (most recent call last):
File "C:\Python37\lib\site-packages\celery\worker\consumer\consumer.py", line 318, in start
    blueprint.start(self)
File "C:\Python37\lib\site-packages\celery\bootsteps.py", line 119, in start
    step.start(parent)
File "C:\Python37\lib\site-packages\celery\worker\consumer\consumer.py", line 596, in start
    c.loop(*c.loop_args())
File "C:\Python37\lib\site-packages\celery\worker\loops.py", line 118, in synloop
    qos.update()
File "C:\Python37\lib\site-packages\kombu\common.py", line 442, in update
    return self.set(self.value)
File "C:\Python37\lib\site-packages\kombu\common.py", line 435, in set
    self.callback(prefetch_count=new_value)
File "C:\Python37\lib\site-packages\celery\worker\consumer\tasks.py", line 47, in set_prefetch_count
    apply_global=qos_global,
File "C:\Python37\lib\site-packages\kombu\messaging.py", line 558, in qos
    apply_global)
File "C:\Python37\lib\site-packages\amqp\channel.py", line 1853, in basic_qos
    wait=spec.Basic.QosOk,
File "C:\Python37\lib\site-packages\amqp\abstract_channel.py", line 68, in send_method
    return self.wait(wait, returns_tuple=returns_tuple)
File "C:\Python37\lib\site-packages\amqp\abstract_channel.py", line 88, in wait
    self.connection.drain_events(timeout=timeout)
File "C:\Python37\lib\site-packages\amqp\connection.py", line 504, in drain_events
    while not self.blocking_read(timeout):
File "C:\Python37\lib\site-packages\amqp\connection.py", line 509, in blocking_read
    frame = self.transport.read_frame()
File "C:\Python37\lib\site-packages\amqp\transport.py", line 252, in read_frame
    frame_header = read(7, True)
File "C:\Python37\lib\site-packages\amqp\transport.py", line 438, in _read
    s = recv(n - len(rbuf))
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host

I found a workaround for this problem. I think the issue lies with the Celery result backend. In my case, I am using Redis.

My configuration is as follows:

Broker - rabbitmq
Backend - redis
Python - 3.7
OS - Windows 10

On the Celery client side, I poll the status of the task on the worker every 60 seconds. With this in place, I did not run into the connection reset problem.

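from time import sleep

# doors_res is the AsyncResult returned when the task was submitted.
while not doors_res.ready():
    sleep(60)  # check the task status once a minute
result = doors_res.get()

where doors_res is the AsyncResult for the submitted task, obtained through the Celery app instance.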
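
The polling loop above can be wrapped in a small helper that also gives up after a deadline. This is a minimal sketch, not part of Celery: `wait_for_result`, its parameters, and the `FakeResult` stand-in are hypothetical names used for illustration. The only assumption about the result object is that it exposes `ready()` and `get()`, as Celery's `AsyncResult` does.

```python
import time


def wait_for_result(async_result, poll_interval=60.0, timeout=None, sleep=time.sleep):
    """Poll until the task is done, then return its result.

    async_result is assumed to expose ready() and get(), like Celery's
    AsyncResult. Raises TimeoutError if `timeout` seconds pass first.
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    while not async_result.ready():
        if deadline is not None and time.monotonic() >= deadline:
            raise TimeoutError("task did not finish within the timeout")
        sleep(poll_interval)
    return async_result.get()


class FakeResult:
    """Stand-in for an AsyncResult (illustration only).

    Reports not-ready for the first `polls_until_ready` calls to ready().
    """

    def __init__(self, polls_until_ready, value):
        self._remaining = polls_until_ready
        self._value = value

    def ready(self):
        self._remaining -= 1
        return self._remaining < 0

    def get(self):
        return self._value


# Record the requested sleeps instead of actually waiting.
naps = []
value = wait_for_result(FakeResult(3, "done"), poll_interval=60, sleep=naps.append)
```

With a real task, the call would look like `wait_for_result(doors_res, poll_interval=60)`. Injecting `sleep` is a design choice that keeps the helper testable without waiting a full minute per poll.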

On the Celery worker side:

celery worker -A <celery_file_name> -l info -P gevent

My task ran for about 2 hours and I did not get the connection reset error.

