How can I make Ansible fail when the systemd service fails to start?
I have a systemd service that I deploy and want Ansible to start.
My systemd service unit file is this:
[Unit]
Description=Collector service
After=network.target mariadb.service
Requires=mariadb.service
[Service]
Type=simple
ExecStart=/opt/collector/app.py
WorkingDirectory=/opt/collector
Restart=on-abort
User=root
[Install]
WantedBy=multi-user.target
I am using Type=simple since this looks like the correct solution (it is also the preferred one in this question).
I also tried Type=oneshot (as suggested by the user who initially marked this question as a duplicate of this question), but the problem is that the /opt/collector/app.py script is a long-running process:
while True:
    t = threading.Thread(...)
    t.start()
    t.join()
    time.sleep(15)
and with Type=oneshot, Ansible will block forever.
My Ansible code for starting the service is:
- name: start Collector service
  systemd:
    name: collector
    state: started
    enabled: yes
On the target system, systemctl will display:
[root@srv01 /]# systemctl
UNIT LOAD ACTIVE SUB DESCRIPTION
dev-sda1.device loaded activating tentative /dev/sda1
-.mount loaded active mounted /
dev-mqueue.mount loaded active mounted POSIX Message Queue File System
etc-hostname.mount loaded active mounted /etc/hostname
etc-hosts.mount loaded active mounted /etc/hosts
etc-resolv.conf.mount loaded active mounted /etc/resolv.conf
run-user-0.mount loaded active mounted /run/user/0
session-73.scope loaded active running Session 73 of user root
crond.service loaded active running Command Scheduler
dbus.service loaded active running D-Bus System Message Bus
haproxy.service loaded active running HAProxy Load Balancer
● collector.service loaded failed failed Collector service
....
The service fails because of an exception in the Python process (it uses an undefined variable).
But my Ansible playbook run does not fail:
TASK [inventory : start Collector service] *********************************
changed: [srv01]
I tried both the systemd and service Ansible modules, and the behavior is the same.
How can I make Ansible:
[...] active running status with a while True process?
I stumbled over this while I had the same problem with silently failing services. I also found a bug report describing this issue, and after some research I found a workaround:
- name: start Collector service
  systemd:
    name: collector
    state: started
    enabled: yes
- name: make sure Collector service is really running
  command: systemctl is-active collector
Note that for Type=simple services this will only fail if the service itself fails immediately after it is started.
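One way to narrow that window, offered here as an assumption rather than something taken from the bug report, is to wait a few seconds between starting the service and checking it. The sketch below uses the pause module for the wait. The 5-second delay is an arbitrary guess, and the right value depends on how quickly app.py would crash. The changed_when: false line only keeps the read-only check from being reported as a change.

```yaml
- name: start Collector service
  systemd:
    name: collector
    state: started
    enabled: yes

# Assumed delay: give the process time to crash if it is going to.
- name: wait a few seconds before checking the service
  pause:
    seconds: 5

# systemctl is-active exits non-zero unless the unit is active,
# so the command module fails the task in that case.
- name: make sure Collector service is still running
  command: systemctl is-active collector
  changed_when: false
```

A service that crashes later than the chosen delay will still slip through, so this shortens the blind spot without removing it.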
You can use failed_when, for example:
- name: validating processes started correctly
  shell: pgrep toto | wc -l
  register: after_count
  # fail unless exactly one matching process is running
  failed_when: after_count.stdout_lines[0] != "1"
Here failed_when fails the task if the number of processes returned is not 1.
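The same failed_when idea can be applied to the collector service from the question. This is a minimal sketch, assuming systemctl is-active prints the unit state (for example active or failed) on stdout. The variable name collector_state is made up for the illustration.

```yaml
- name: check Collector service state
  command: systemctl is-active collector
  register: collector_state
  changed_when: false
  # Defining failed_when replaces the default return-code check,
  # so the task fails for any state other than "active".
  failed_when: collector_state.stdout != "active"
```

Comparing the printed state instead of counting processes avoids depending on the process name, at the cost of relying on the state systemd reports at the moment of the check.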