
Multi-master OKD-3.11 setup fails if the master-1 node is down

I am trying to install a multi-master openshift-3.11 setup on OpenStack VMs, following the inventory file in the official documentation:

https://docs.openshift.com/container-platform/3.11/install/example_inventories.html#multi-masters-single-etcd-using-native-ha

OKD Version
 [centos@master1 ~]$ oc version
 oc v3.11.0+62803d0-1
 kubernetes v1.11.0+d4cacc0
 features: Basic-Auth GSSAPI Kerberos SPNEGO

 Server https://master1.167.254.204.74.nip.io:8443
 openshift v3.11.0+ff2bdbd-531
 kubernetes v1.11.0+d4cacc0
Steps To Reproduce

Bring up an okd-3.11 multi-master setup using the inventory file from https://docs.openshift.com/container-platform/3.11/install/example_inventories.html#multi-masters-single-etcd-using-native-ha

Current Result

The setup completes successfully, but I am stuck with two issues:

  1. The load balancer node is not listed when running "oc get nodes":
 [centos@master1 ~]$ oc get nodes
 NAME                            STATUS    ROLES          AGE       VERSION
 master1.167.254.204.74.nip.io   Ready     infra,master   6h        v1.11.0+d4cacc0
 master2.167.254.204.58.nip.io   Ready     infra,master   6h        v1.11.0+d4cacc0
 master3.167.254.204.59.nip.io   Ready     infra,master   6h        v1.11.0+d4cacc0
 node1.167.254.204.82.nip.io     Ready     compute        6h        v1.11.0+d4cacc0
  2. The master nodes and the load balancer are entirely dependent on master-1: if master-1 is down, none of the other master nodes or the load balancer can run any oc commands:
 [centos@master2 ~]$ oc get nodes
 Unable to connect to the server: dial tcp 167.254.204.74:8443: connect: no route to host

The OKD setup keeps working if any master node other than master-1, or the load balancer, is down.

Expected Result

The OKD setup should remain up and running even if any one of the master nodes goes down.

Inventory file:

 [OSEv3:children]
 masters
 nodes
 etcd
 lb

 [masters]
 master1.167.254.204.74.nip.io
 master2.167.254.204.58.nip.io
 master3.167.254.204.59.nip.io

 [etcd]
 master1.167.254.204.74.nip.io
 master2.167.254.204.58.nip.io
 master3.167.254.204.59.nip.io

 [lb]
 lb.167.254.204.111.nip.io

 [nodes]
 master1.167.254.204.74.nip.io openshift_ip=167.254.204.74 openshift_schedulable=true openshift_node_group_name='node-config-master'
 master2.167.254.204.58.nip.io openshift_ip=167.254.204.58 openshift_schedulable=true openshift_node_group_name='node-config-master'
 master3.167.254.204.59.nip.io openshift_ip=167.254.204.59 openshift_schedulable=true openshift_node_group_name='node-config-master'
 node1.167.254.204.82.nip.io openshift_ip=167.254.204.82 openshift_schedulable=true openshift_node_group_name='node-config-compute'

 [OSEv3:vars]
 debug_level=4
 ansible_ssh_user=centos
 ansible_become=true
 ansible_ssh_common_args='-o StrictHostKeyChecking=no'
 openshift_enable_service_catalog=true
 ansible_service_broker_install=true
 openshift_node_groups=[{'name': 'node-config-master', 'labels': ['node-role.kubernetes.io/master=true', 'node-role.kubernetes.io/infra=true']}, {'name': 'node-config-compute', 'labels': ['node-role.kubernetes.io/compute=true']}]
 containerized=false
 os_sdn_network_plugin_name='redhat/openshift-ovs-multitenant'
 openshift_disable_check=disk_availability,docker_storage,memory_availability,docker_image_availability
 deployment_type=origin
 openshift_deployment_type=origin
 openshift_release=v3.11.0
 openshift_pkg_version=-3.11.0
 openshift_image_tag=v3.11.0
 openshift_service_catalog_image_version=v3.11.0
 template_service_broker_image_version=v3.11
 osm_use_cockpit=true
 # put the router on dedicated infra1 node
 openshift_master_cluster_method=native
 openshift_master_default_subdomain=sub.master1.167.254.204.74.nip.io
 openshift_public_hostname=master1.167.254.204.74.nip.io
 openshift_master_cluster_hostname=master1.167.254.204.74.nip.io

Please explain why the entire setup depends on master-node-1, and whether there is any workaround to fix this.

You should set openshift_master_cluster_hostname and openshift_master_cluster_public_hostname to the load balancer's hostname, not a master's hostname. With your current configuration, master1 is the entry point for all API traffic, so when master1 stops, the entire API service goes down.

Beforehand, configure your LB to load-balance across your master nodes, and register the LB IP (i.e. the VIP) in DNS, e.g. as ocp-cluster.example.com. This hostname becomes the entry point for the OCP API; set it with both openshift_master_cluster_hostname and openshift_master_cluster_public_hostname.
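As a sketch of what "load-balancing to your master nodes" means in practice, assuming HAProxy runs on the lb host and using the master hostnames from the inventory in the question (the frontend/backend names are illustrative; openshift-ansible also generates a similar haproxy.cfg on the [lb] host automatically):

 frontend openshift-api
     bind *:8443
     mode tcp
     default_backend openshift-masters

 backend openshift-masters
     mode tcp
     balance source
     server master1 master1.167.254.204.74.nip.io:8443 check
     server master2 master2.167.254.204.58.nip.io:8443 check
     server master3 master3.167.254.204.59.nip.io:8443 check

TCP mode is used here because the API endpoint serves TLS on 8443 and the LB should pass it through rather than terminate it.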

openshift_master_cluster_method=native
openshift_master_cluster_hostname=ocp-cluster.example.com
openshift_master_cluster_public_hostname=ocp-cluster.example.com
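Applied to the inventory in the question, which already resolves hosts through nip.io, this could look like the following (assuming lb.167.254.204.111.nip.io is the host registered in the [lb] group, so no separate DNS record is needed):

 openshift_master_cluster_method=native
 openshift_master_cluster_hostname=lb.167.254.204.111.nip.io
 openshift_master_cluster_public_hostname=lb.167.254.204.111.nip.io

With this, the kubeconfig generated on every node points at the LB instead of master1, so oc commands keep working as long as any one master is up.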
