How can I make Ansible fail when the systemd service fails to start?

I have a systemd service that I deploy and want to be started by Ansible.

My systemd service unit file is this:

[Unit]
Description=Collector service
After=network.target mariadb.service
Requires=mariadb.service

[Service]
Type=simple
ExecStart=/opt/collector/app.py
WorkingDirectory=/opt/collector
Restart=on-abort
User=root

[Install]
WantedBy=multi-user.target

I am using Type=simple since this looks like the correct solution (it is also the preferred one in this question).

I tried using Type=oneshot as well (as suggested by the user who initially marked this question as a duplicate of this question), but the problem is that the /opt/collector/app.py script is a long-running process:

while True:
    t = threading.Thread(...)
    t.start()
    t.join()
    time.sleep(15)

and with Type=oneshot, Ansible will block forever.

And my Ansible starting code is:

- name: start Collector service
  systemd:
    name: collector
    state: started
    enabled: yes

On the target system, systemctl will display:

[root@srv01 /]# systemctl
  UNIT                           LOAD   ACTIVE     SUB       DESCRIPTION
  dev-sda1.device                loaded activating tentative /dev/sda1
  -.mount                        loaded active     mounted   /
  dev-mqueue.mount               loaded active     mounted   POSIX Message Queue File System
  etc-hostname.mount             loaded active     mounted   /etc/hostname
  etc-hosts.mount                loaded active     mounted   /etc/hosts
  etc-resolv.conf.mount          loaded active     mounted   /etc/resolv.conf
  run-user-0.mount               loaded active     mounted   /run/user/0
  session-73.scope               loaded active     running   Session 73 of user root
  crond.service                  loaded active     running   Command Scheduler
  dbus.service                   loaded active     running   D-Bus System Message Bus
  haproxy.service                loaded active     running   HAProxy Load Balancer
● collector.service              loaded failed     failed    Collector service
....

The service fails because of an exception in the Python process (it uses an undefined variable).

But my Ansible playbook run does not fail:

TASK [inventory : start Collector service] *********************************
changed: [srv01]

I tried with both systemd and service Ansible modules and the behavior is the same.

How can I make Ansible:

  • fail when the systemd unit fails to start?
  • not block, while systemd still reports the unit as active (running) for a while True process?

I stumbled over this when I had the same problem with silently failing services. I also found a bug report describing this issue, and after some research I found a workaround:

- name: start Collector service
  systemd:
    name: collector
    state: started
    enabled: yes

- name: make sure Collector service is really running
  command: systemctl is-active collector

Note that for Type=simple services this check will only fail if the service itself fails immediately after it was started, that is, before the check runs.
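
Because the check runs right after the start task, a short wait before it gives a service that crashes a few seconds in time to reach the failed state. A minimal sketch, assuming a 5-second delay is long enough for app.py to hit its startup errors (the delay is an arbitrary value chosen for illustration, not part of the original answer):

```yaml
# Sketch only: the 5-second delay is an assumed value; tune it to how long
# the service needs before a startup error would surface.
- name: give the Collector service time to fail if it is going to
  pause:
    seconds: 5

- name: make sure Collector service is still running
  command: systemctl is-active collector
  changed_when: false   # a pure status check should never report "changed"
```

systemctl is-active exits non-zero for any state other than active, so the command task fails on its own without an explicit failed_when.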

You can also use failed_when, for example:

- name: validate that the process started correctly
  shell: pgrep toto | wc -l
  register: after_count
  failed_when: after_count.stdout_lines[0] != "1"

Here failed_when fails the task unless exactly one matching process is running.
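
The failed_when pattern can also be applied to the unit state reported by systemd rather than to a process count, which avoids depending on the process name. A hedged sketch, where collector_state is a hypothetical variable name introduced here for illustration:

```yaml
# Sketch only: collector_state is a hypothetical register name.
- name: check the Collector unit state
  command: systemctl is-active collector
  register: collector_state
  changed_when: false   # read-only check, never reports "changed"
  # is-active prints the unit state (for example active, failed or inactive)
  failed_when: collector_state.stdout != "active"
```

Registering the output keeps the reported state in the task result, so a failed run shows whether the unit was failed, inactive or still activating.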
