
Kubernetes: Nginx Ingress Controller does not boot properly

I've been setting up a small Kubernetes cluster. I have 3 CentOS VMs, one master and 2 minions. Kubernetes runs in Docker containers. I set it up with the help of two articles.

Now I'm trying to install the nginx ingress controller. I'm working with github.com/kubernetes/contrib/tree/master/ingress/controllers/nginx at revision 6c87fed (I also tried the tags 0.6.0 and 0.6.3; same behavior).

I run the following commands according to the README.md from the above link:

kubectl create -f examples/default-backend.yaml
kubectl expose rc default-http-backend --port=80 --target-port=8080 --name=default-http-backend
kubectl create -f examples/default/rc-default.yaml
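
Before going further, a quick sanity check that the backend pod and its service actually exist:

kubectl get pods
kubectl get svc default-http-backend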

Now the pod for the ingress controller comes up properly at first, but fails after about 30 seconds. The log says:

kubectl logs nginx-ingress-controller-ttylt
I0615 11:21:20.641306       1 main.go:96] Using build: https://github.com/bprashanth/contrib.git - git-afb16a4
F0615 11:21:50.643748       1 main.go:125] unexpected error getting runtime information: timed out waiting for the condition

It sounds like it's trying to connect to a nonexistent host or similar. Any ideas what I can check or how to fix it?

Regards

Edit: As this seems to be a common problem, I should add that I checked that ports 80 and 443 are available on the nodes.
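
For reference, this is the kind of check I mean; nothing should already be bound to those ports on a node:

sudo ss -tlnp 'sport = :80'
sudo ss -tlnp 'sport = :443'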

I did not find a solution for the nginx ingress controller; maybe it's just broken at the moment.

Though I did two things to achieve my initial goal (having an ingress controller):

1. Start kube-proxy with --proxy-mode=userspace, as the default proxy mode does not work on the CentOS version I use (CentOS Linux release 7.2.1511 (Core)).
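
On CentOS 7 with the packaged kube-proxy this is just a flag change. A sketch, assuming your distro keeps the kube-proxy flags in /etc/kubernetes/proxy (check where yours actually lives):

# /etc/kubernetes/proxy (assumed flags file for the RPM-packaged kube-proxy)
KUBE_PROXY_ARGS="--proxy-mode=userspace"

# restart so the flag takes effect
sudo systemctl restart kube-proxy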

2. Use traefik with:

docker run -d -p 1080:80 traefik \
    --kubernetes \
    --kubernetes.endpoint=http://my.kubernetes.master:8080

Note that my.kubernetes.master is the public IP of the Kubernetes master, i.e. not a cluster IP but a real IP on a real network interface.

The endpoint I use is due to traefik having problems with the CA certificate on the default endpoint. That's not a clean solution, though it's OK for my proof of concept.
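
To see traefik actually route something, a minimal Ingress can be pointed at the backend from above; the hostname is made up for illustration:

cat <<'EOF' | kubectl create -f -
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: test-ingress
spec:
  rules:
  - host: foo.example.com
    http:
      paths:
      - path: /
        backend:
          serviceName: default-http-backend
          servicePort: 80
EOF

# traefik publishes host port 1080 (see the docker run above); replace the
# hostname with whatever host runs the traefik container. Even a
# "default backend - 404" response proves the request went through traefik.
curl -H 'Host: foo.example.com' http://my.kubernetes.master:1080/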

The reason for this is actually obscured by the error message. As far as I've been able to determine using strace and the like, the underlying error is that the TLS handshake fails. The ingress controller will repeatedly try to connect to the master on port 443, which will fail because it doesn't present a correct TLS certificate.
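
If you want to watch this happen outside the controller, the handshake against the master's secure port can be inspected directly from a node (my.kubernetes.master and the CA path are the ones from this setup):

# "Verify return code: 0 (ok)" at the end means the server certificate checks
# out against the CA; anything else points at the certificate setup
openssl s_client -connect my.kubernetes.master:443 -CAfile /srv/kubernetes/ca.crt </dev/null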

If you look in kube-api-server.log, you'll likely find a bunch of these:

I0705 04:16:17.150073    9521 logs.go:41] http: TLS handshake error from 172.20.1.3:39354: remote error: bad certificate

I've not yet been able to figure out a solution. However, I got a little further: I tried starting the API server with --kubelet-client-certificate, --kubelet-client-private-key and --kubelet-certificate-authority, and then starting the kubelet with the TLS options pointing at the same files, at which point the nginx controller failed with a new error, this time about the cert name not matching. I believe that if you generate the right cert on each worker node, with the right IP address, it will work.
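
For reference, this is roughly what that attempt looked like; the file paths are illustrative:

# kube-apiserver side:
--kubelet-client-certificate=/srv/kubernetes/kubelet.crt
--kubelet-client-private-key=/srv/kubernetes/kubelet.key
--kubelet-certificate-authority=/srv/kubernetes/ca.crt

# kubelet side, pointing at the same files:
--tls-cert-file=/srv/kubernetes/kubelet.crt
--tls-private-key-file=/srv/kubernetes/kubelet.key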

Edit: I found the solution. First of all, the kubelet needs a kubeconfig file. It needs to point to the CA cert as well as its own cert/key pair, which we'll call kubelet.crt and kubelet.key. When you generate these files, you need to explicitly list not just the IP of the master, but also the cluster IP of the master. Why? Because that's the IP it talks to.

So when I generated the certs for Kubernetes, I used (via Google's patched version of EasyRSA):

easyrsa --batch "--req-cn=${public_ip}@`date +%s`" build-ca nopass
easyrsa --subject-alt-name="IP:${public_ip},IP:${private_ip},IP:172.16.0.1,DNS:kubernetes.default,DNS:kubernetes.default.svc,DNS:kubernetes.default.svc.cluster.local,DNS:kubernetes-master" build-server-full kubernetes-master nopass
easyrsa build-client-full kubelet nopass
easyrsa build-client-full kubecfg nopass
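
A quick way to confirm that the cluster IP (172.16.0.1 above) really made it into the server certificate:

openssl x509 -in pki/issued/kubernetes-master.crt -noout -text | grep -A1 'Subject Alternative Name'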

Now you'll end up with pki/ca.crt, pki/issued/kubernetes-master.crt, pki/private/kubernetes-master.key, pki/issued/kubelet.crt, pki/private/kubelet.key, pki/issued/kubecfg.crt and pki/private/kubecfg.key. The kube-apiserver must be started with:

--client-ca-file=/srv/kubernetes/ca.crt
--tls-cert-file=/srv/kubernetes/kubernetes-master.crt
--tls-private-key-file=/srv/kubernetes/kubernetes-master.key

And you need to create /var/lib/kubelet/kubeconfig pointing to kubelet.crt, kubelet.key and ca.crt, according to the docs.
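
A minimal sketch of such a kubeconfig, assuming the certs were copied to /srv/kubernetes on the node and the master's secure endpoint is https://my.kubernetes.master:443:

cat <<'EOF' > /var/lib/kubelet/kubeconfig
apiVersion: v1
kind: Config
clusters:
- name: local
  cluster:
    server: https://my.kubernetes.master:443
    certificate-authority: /srv/kubernetes/ca.crt
contexts:
- name: kubelet-to-local
  context:
    cluster: local
    user: kubelet
current-context: kubelet-to-local
users:
- name: kubelet
  user:
    client-certificate: /srv/kubernetes/kubelet.crt
    client-key: /srv/kubernetes/kubelet.key
EOF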

We bumped into this issue too and fixed it by putting 'nginx-ingress-controller' and 'default-http-backend' into the kube-system namespace. I think the issue is that the ingress controller doesn't have access to the API server from other namespaces. Try it.
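
Concretely, that means creating the same resources as above with the namespace set explicitly:

kubectl create -f examples/default-backend.yaml --namespace=kube-system
kubectl expose rc default-http-backend --port=80 --target-port=8080 \
    --name=default-http-backend --namespace=kube-system
kubectl create -f examples/default/rc-default.yaml --namespace=kube-system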
