keepalived transitions not happening as expected
I am trying to implement keepalived-based failover for my service. Below are my configurations for the master and backup nodes.
Master node:
vrrp_script chk_splunkd {
    script "pidof splunkd"
    interval 2
    fall 2
    rise 2
}
vrrp_instance VI_1 {
    interface eth0
    state MASTER
    advert_int 1
    virtual_router_id 51
    priority 200
    nopreempt
    smtp_alert
    authentication {
        auth_type PASS
        auth_pass passme
    }
    virtual_ipaddress {
        10.126.246.245
    }
    track_script {
        chk_splunkd
    }
    notify_master /etc/keepalived/scripts/master.sh
    notify_backup /etc/keepalived/scripts/stop_service.sh
    notify_fault /etc/keepalived/scripts/stop_service.sh
}
Backup node:
vrrp_script chk_splunkd {
    script "pidof splunkd"
    interval 2
    fall 2
    rise 2
}
vrrp_instance VI_1 {
    interface eth0
    state BACKUP
    advert_int 1
    virtual_router_id 51
    priority 100
    nopreempt
    smtp_alert
    authentication {
        auth_type PASS
        auth_pass passme
    }
    virtual_ipaddress {
        10.126.246.245
    }
    track_script {
        chk_splunkd
    }
    notify_master /etc/keepalived/scripts/master.sh
    notify_backup /etc/keepalived/scripts/stop_service.sh
    notify_fault /etc/keepalived/scripts/stop_service.sh
}
However, even when one node goes into the fault state and stops sending VRRP advertisements, the other node does not automatically transition to the master state. I monitored the VRRP advertisement packets using
tcpdump -vv -i eth0 vrrp
and found that even after the advertisements from one node stop, the other node does not start sending advertisements to indicate that it has become the master.
Please help me find out what I'm missing.
Thanks,
Keerthana
The issue was that during startup, when one node became the master, the other one went into the fault state because of the pidof splunkd check. That command returns 1 on the non-master node, since my splunk service should be up only on the master node, so the node that was supposed to be the backup was sitting in the fault state and never took over. Once I edited the notify script to write the current state to an external file and read that state to decide what action to take in my notify scripts, things started working fine.
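The post does not show the edited scripts, so the following is only a minimal sketch of one way the state-file idea could look. The state file path and the function names record_state, current_state and check_splunkd are hypothetical. Using the recorded state inside the health check is also an assumption about how the state might be consumed, not the author's confirmed method.

```shell
#!/bin/sh
# Hypothetical state file location; not taken from the original post.
STATE_FILE="${STATE_FILE:-/var/run/keepalived_vi1.state}"

# To be called from the notify scripts (for example master.sh and
# stop_service.sh): record the state keepalived has just entered.
record_state() {
    printf '%s\n' "$1" > "$STATE_FILE"
}

# Print the last recorded state; a missing file is treated as BACKUP.
current_state() {
    if [ -r "$STATE_FILE" ]; then
        cat "$STATE_FILE"
    else
        echo BACKUP
    fi
}

# Assumed health check: splunkd is only required while this node is
# MASTER, so a node without a running splunkd is not pushed into fault.
check_splunkd() {
    if [ "$(current_state)" = "MASTER" ]; then
        pidof splunkd >/dev/null
    else
        return 0
    fi
}
```

Under these assumptions, master.sh would call record_state MASTER before starting splunkd, stop_service.sh would record BACKUP or FAULT, and the script line in vrrp_script would point at a small wrapper that runs check_splunkd instead of calling pidof splunkd directly.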