
RabbitMQ on high load: socket.error [Errno 104] Connection reset by peer

I have been using Celery with RabbitMQ as the backend. Whenever I send a high load (around 600-1000 tasks) to RabbitMQ, I get the following error: socket.error [Errno 104] Connection reset by peer

The sample commands I have been using are:

for i in {1..500}; do python client.py queue_name time_out bash -c "sleep 20 && touch folder/$i" & done
for i in {1..500}; do python client.py different_queue_name time_out bash -c "sleep 20 && touch folder/$i" & done

Here client.py sends a task that executes the given bash command on the worker and polls for the result for time_out seconds.

I also tried spreading the load over an interval of time using the commands below, but I still get the same error:

for j in {1..10}; do for i in {1..50}; do python client.py queue_name time_out bash -c "sleep 60 && touch folder/$i" & done; sleep 10; done
for j in {1..10}; do for i in {1..50}; do python client.py different_queue_name time_out bash -c "sleep 60 && touch folder/$i" & done; sleep 10; done

What is causing this behaviour and what can I do to handle this situation?

The warning "=WARNING REPORT== file descriptor limit alarm set" means that you have reached the file descriptor limit.

You should tune your OS and RabbitMQ.

Here are a few guidelines you should follow:

Open File Handles Limit

Operating systems limit the maximum number of concurrently open file handles, which includes network sockets. Make sure that your limits are set high enough to allow for the expected number of concurrent connections and queues.

Make sure your environment allows for at least 50K open file descriptors for the effective RabbitMQ user, including in development environments.

As a rule of thumb, multiply the 95th percentile number of concurrent connections by 2 and add the total number of queues to calculate the recommended open file handle limit. For example, with a 95th percentile of 10,000 concurrent connections and 1,000 queues, the recommended limit is 10,000 × 2 + 1,000 = 21,000. Values as high as 500K are not inadequate, won't consume a lot of hardware resources, and are therefore recommended for production setups. See the Networking guide for more information.
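To check how the current limit compares to those recommendations, here is a minimal sketch using Python's standard resource module (Unix only; the values in the comments are illustrative, and persistent limits are set in the OS, e.g. /etc/security/limits.conf or a systemd unit, not from Python):

```python
import resource

# Inspect this process's open-file limits: (soft, hard).
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("soft =", soft, "hard =", hard)

# A process may raise its own soft limit up to the hard limit without
# extra privileges; raising the hard limit itself requires root and is
# normally done in /etc/security/limits.conf or the systemd unit.
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
```

If the printed soft limit is far below the rule-of-thumb value above, the broker will hit the file descriptor alarm long before it runs out of CPU or RAM.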

Erlang VM I/O Thread Pool

The Erlang runtime uses a pool of threads to perform I/O operations asynchronously. The size of the pool is configured via the +A VM command line flag, e.g. +A 128. We highly recommend overriding the flag using the RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS environment variable (for example, in rabbitmq-env.conf):

RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS="+A 128"

The default value is 30. Nodes with 8 or more cores available are recommended to use values higher than 96, that is, 12 or more I/O threads per available core. Note that higher values do not necessarily mean better throughput or lower CPU burn, due to waiting on I/O.

Tuning for a Large Number of Connections

Some workloads, often referred to as "the Internet of Things", assume a large number of client connections per node and a relatively low volume of traffic from each client. One such workload is sensor networks: there can be hundreds of thousands or even millions of sensors deployed, each emitting data every few minutes. Optimising for the maximum number of concurrent clients can be more important than optimising for total throughput.

Several factors can limit how many concurrent connections a single node can support:

- Number of open file handles (including sockets)
- Amount of RAM used by each connection
- Amount of CPU resources used by each connection
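On the client side, transient resets can also be smoothed over by retrying the publish with a backoff. A minimal hypothetical sketch (publish_with_retry and its arguments are assumptions, not part of the original client.py):

```python
import socket
import time

def publish_with_retry(publish, attempts=3, delay=1.0):
    """Call `publish` (a zero-argument callable that sends one task) and
    retry with linear backoff if the broker resets the connection.
    Hypothetical helper, not part of the original client.py."""
    for attempt in range(1, attempts + 1):
        try:
            return publish()
        except (ConnectionResetError, socket.error):  # [Errno 104] and friends
            if attempt == attempts:
                raise  # give up after the last attempt
            time.sleep(delay * attempt)  # wait longer on each retry
```

In Python 3, socket.error is an alias of OSError and ConnectionResetError is a subclass of it, so the except tuple covers both spellings of the error. Retrying is a mitigation, not a fix: the limits above still need to be raised for sustained load.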

Hope it helps.

