
Monit starting/restarting processes in serial

I'm using monit to start up a bunch of workers for running qless, a queueing system that we use for background jobs. My setup has 20 files like this in /etc/monit/conf.d:

check process qless-1 with pidfile /srv/app/shared/tmp/pids/qless-1.pid
  start program = "/bin/bash -c 'cd /srv/app/current && RAILS_ENV=prod2 BUNDLE_GEMFILE=/srv/app/current/Gemfile QUEUES=jobs /usr/local/rbenv/shims/bundle exec rake -f /srv/app/current/Rakefile qless:work_with_pidfile[/srv/app/shared/tmp/pids/qless-1.pid] >> /srv/app/shared/log/qless-1.log 2>&1'"
  stop  program = "/bin/bash -c '/bin/kill `/bin/cat /srv/app/shared/tmp/pids/qless-1.pid`'"

Each file references its own pid file. It takes about a minute to boot the environment and get the app running, and we frequently need to restart them. The problem is that monit seems to always start/restart things in serial. This means it takes about 20 minutes for all of the workers to come online and a similar amount of time for everything to be restarted. Isn't monit all about running things in parallel? I can't believe that this is the correct behavior, so what crazy thing might I be doing wrong? Thanks!

You should turn your start script into an asynchronous call by sending it to the background. Then add "with timeout" to the start program to tell monit not to poll your service while it is starting up. You should also consider using service groups, so that you can manage all of your processes with a single command:

check process qless-1 with pidfile /srv/app/shared/tmp/pids/qless-1.pid
  start program = "call_to_async_script"  with timeout 60 seconds
  stop  program = "/bin/bash -c '/bin/kill `/bin/cat /srv/app/shared/tmp/pids/qless-1.pid`'"
  GROUP qless

Then you can start, stop, or restart every service in the group at once by passing the group name with -g:

monit -g qless stop
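
The call_to_async_script above is a placeholder. A minimal sketch of what such a wrapper might look like follows; the name `start_detached` and its argument order are assumptions made for illustration and are not part of monit or qless. It changes directory in the foreground and backgrounds only the worker command, so that `$!` is the PID of that command and can be written to the pidfile straight away.

```shell
#!/bin/bash
# Hypothetical wrapper (name and argument order are illustrative only).
# Usage: start_detached WORKDIR PIDFILE LOGFILE COMMAND [ARGS...]
start_detached() {
  local workdir=$1 pidfile=$2 logfile=$3
  shift 3
  cd "$workdir" || return 1
  # Background only the command itself. nohup makes it ignore SIGHUP,
  # and stdin comes from /dev/null so it never waits on a terminal.
  nohup "$@" >> "$logfile" 2>&1 < /dev/null &
  # $! is the PID of the backgrounded command. Writing it immediately
  # means the pidfile exists long before the app has finished booting.
  echo "$!" > "$pidfile"
}

# When run as a script with arguments, start the requested command.
if [ "$#" -gt 0 ]; then
  start_detached "$@"
fi
```

From monit, assuming the script is saved somewhere such as /srv/app/shared/bin/start_detached.sh (a hypothetical path), the start program would pass the working directory, the pidfile, the log file, and then the worker command. Environment variables such as RAILS_ENV can be supplied by making the command start with /usr/bin/env, and the rake task should then be one that does not write its own pidfile, since the wrapper already does.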

I had the same problem and I believe I figured it out.

What happens is that monit waits for the pid file to appear before it goes on to start the next process. Because bundle exec and Rails are slow to load, the rake task takes a long time to get as far as writing out the pidfile.
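
To see how long a single worker keeps a serial start blocked, the wait for the pidfile can be timed by hand. The helper below is only a sketch: `wait_for_pidfile` is a hypothetical name, and the loop approximates the wait described above rather than reproducing monit's internals.

```shell
#!/bin/bash
# Hypothetical diagnostic, not part of monit: poll until PIDFILE names a
# live process or TIMEOUT seconds have passed. Prints the seconds waited.
# Usage: wait_for_pidfile PIDFILE [TIMEOUT]
wait_for_pidfile() {
  local pidfile=$1 timeout=${2:-60} waited=0 pid
  while [ "$waited" -lt "$timeout" ]; do
    if [ -s "$pidfile" ]; then
      pid=$(cat "$pidfile")
      # kill -0 sends no signal; it only checks that the process exists
      # and that we are allowed to signal it.
      if kill -0 "$pid" 2>/dev/null; then
        echo "$waited"
        return 0
      fi
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 1
}
```

Running it against one worker's pidfile right after starting that worker gives a rough per-worker boot time. Multiplied by the number of workers, that is roughly what a serial start costs.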

The fix is to put the rake task in the background, write out the pidfile from within the start script immediately and detach:

start program = "/bin/bash -c 'cd /srv/app/current && RAILS_ENV=prod2 BUNDLE_GEMFILE=/srv/app/current/Gemfile QUEUES=jobs /usr/local/rbenv/shims/bundle exec rake -f /srv/app/current/Rakefile qless:work >> /srv/app/shared/log/qless-1.log 2>&1 & echo $! > /srv/app/shared/tmp/pids/qless-1.pid; disown'"
