
How to detect a hung Linux service?

I have noticed on some of my Linux servers that a service will hang. The only way I know it is hung is that operations relying on the service fail, and when I restart the service it fails to stop but then starts fine.

If I run service <servicename> status, it says it's running. If I run ps -ef | grep <servicename>, it shows only one process running for that service, which is correct.

Is there anything else I can check to know whether it is hung? I am trying to be proactive about bringing these services back up and also determining why they hang.

For reference, the services are mostly openstack-nova-compute and openstack-cinder-volume. I can detect a hung cinder-volume service because RabbitMQ starts to build up, but the same thing doesn't happen for nova-compute.

This is very hard to test because, as I said, the only way I know is when I try to do something on that node in OpenStack and it fails or hangs, and then I restart the service.

You could use a tool (a script, or a full monitoring tool like Nagios) to do exactly what you described: mimic those "operations that rely on the service". That means trying to contact the service in question and, on failure, sending some kind of notification (or even restarting the service automatically).
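As a rough illustration of that idea, here is a minimal Python sketch of such a script. It runs a probe command with a timeout and treats a timeout or a non-zero exit status as "service unhealthy". The function names (probe, check_service, restart) and the placeholder probe command are hypothetical and not part of any OpenStack or Nagios API. You would need to supply a real operation that depends on the service, and the restart helper assumes the host uses the service command shown in the question.

```python
import subprocess


def probe(cmd, timeout=10.0):
    """Return True if cmd exits with status 0 within `timeout` seconds."""
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # The probe did not finish in time: treat the service as hung.
        return False
    except OSError:
        # The probe command could not be started at all.
        return False
    return result.returncode == 0


def check_service(name, probe_cmd, timeout=10.0, on_failure=None):
    """Probe a service once and call on_failure(name) if the probe fails."""
    healthy = probe(probe_cmd, timeout)
    if not healthy and on_failure is not None:
        on_failure(name)
    return healthy


def restart(name):
    # Assumption: services are managed with the `service` command,
    # as in the question. Pass this as on_failure to restart automatically.
    subprocess.run(["service", name, "restart"], check=False)


if __name__ == "__main__":
    # Placeholder probe: replace ["true"] with a real command that
    # exercises the service and fails or hangs when the service is hung.
    check_service("openstack-nova-compute", ["true"],
                  timeout=10.0, on_failure=print)
```

Run from cron or a monitoring agent, this gives an early signal without waiting for a user-facing operation to fail. Passing a notification function as on_failure is the safer default, since the question notes that a hung service may not stop cleanly on restart.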

