Kubernetes CoreDNS resolves names intermittently

Question

I have a two-node Kubernetes EKS cluster running "v1.12.6-eks-d69f1".

Amazon VPC CNI Plugin for Kubernetes version: amazon-k8s-cni:v1.4.1
CoreDNS version: v1.1.3
KubeProxy: v1.12.6

There are two CoreDNS pods running on the cluster.

The problem is that my pods resolve internal DNS names only intermittently. (Resolution of external DNS names works just fine.)

root@examplecontainer:/# curl http://elasticsearch-dev.internaldomain.local:9200/
curl: (6) Could not resolve host: elasticsearch-dev.internaldomain.local

elasticsearch-dev.internaldomain.local is registered in an AWS Route53 internal hosted zone. The lookup above works intermittently: if I fire five requests, two of them resolve correctly and the rest fail.
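To put a number on the failure rate instead of repeating curl by hand, a small loop can count how many lookups succeed. This is only a minimal sketch: the function name `count_resolutions` and the `RESOLVE_CMD` override are hypothetical, and it assumes `getent` is available in the container.

```shell
# Hypothetical helper: resolve a name N times and print "successes/attempts".
# RESOLVE_CMD is an assumed override hook; by default `getent hosts` is used,
# which performs the lookup through the system resolver (/etc/resolv.conf).
count_resolutions() {
  local host="$1" attempts="${2:-5}" ok=0 i=0
  while [ "$i" -lt "$attempts" ]; do
    # The lookup output is discarded; only the exit status matters here.
    if ${RESOLVE_CMD:-getent hosts} "$host" >/dev/null 2>&1; then
      ok=$((ok + 1))
    fi
    i=$((i + 1))
  done
  echo "$ok/$attempts"
}
```

Running `count_resolutions elasticsearch-dev.internaldomain.local 5` inside the affected container should then show how many of the five lookups resolved.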

These are the contents of the /etc/resolv.conf file on the examplecontainer above:

root@examplecontainer:/# cat /etc/resolv.conf 
nameserver 172.20.0.10
search default.svc.cluster.local svc.cluster.local cluster.local eu-central-1.compute.internal
options ndots:5

Any ideas why this might be happening?

Answer 1

You should try the DNS name below from the container:

curl http://elasticsearch-dev.default.svc.cluster.local:9200/

Answer 2

Please take a look at "Enabling DNS resolution for Amazon EKS cluster endpoints".

The Amazon Route 53 private hosted zone that is created for the endpoint is only associated with the worker node VPC.

If your environment is similar, you can find the solution there.

Please share the results.

Answer 3

I fixed this issue by switching from a custom "DHCP option set" to the default "DHCP option set" provided by AWS. I had created the custom "DHCP option set" months ago and assigned it to the VPC where the EKS cluster is running.

How did I get to the bottom of this?

After running "kubectl get events -n kube-system", I noticed the following:

Warning  DNSConfigForming  17s (x15 over 14m)  kubelet, ip-10-4-9-155.us-west-1.compute.internal  Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 10.4.8.2 8.8.8.8 8.8.4.4
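The warning appears because kubelet applies at most three nameservers and drops the rest. To pull the applied list out of such events, a small filter like the one below may help. It is a sketch only: the function name `applied_nameservers` is hypothetical, and it assumes the message wording matches the event shown above.

```shell
# Hypothetical helper: read kubelet event lines on stdin and print the
# nameservers that kubelet reports as applied, one per line.
# Lines without the "applied nameserver line is:" phrase are ignored.
applied_nameservers() {
  sed -n 's/.*the applied nameserver line is: *//p' | tr -s ' ' '\n' | sed '/^$/d'
}
```

One possible use is `kubectl get events -n kube-system --field-selector reason=DNSConfigForming | applied_nameservers | sort -u`, which lists each distinct applied nameserver once.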

8.8.8.8 and 8.8.4.4 were injected by the troublesome "DHCP option set" that I had created. I think the reason my services were resolving internal DNS names intermittently is that the CoreDNS service was forwarding DNS requests to 10.4.8.2, 8.8.4.4 and 8.8.8.8 in a round-robin fashion. Since the last two DNS servers don't know about the DNS records in my Route53 internal hosted zone, resolution failed intermittently.
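The hypothesis above can be illustrated with a toy model. It assumes strict rotation across the upstreams, which is the answer's guess and not confirmed CoreDNS behaviour, and it assumes only one upstream can answer for the private zone. The function name `simulate_round_robin` is hypothetical, and no real DNS traffic is involved.

```shell
# Toy model (assumption: strict round-robin across upstreams).
# Usage: simulate_round_robin REQUESTS ZONE_AWARE_UPSTREAM UPSTREAM...
# Prints "successes/requests", counting a request as successful only when
# it lands on the one upstream that knows the private hosted zone.
simulate_round_robin() {
  local requests="$1" knows_zone="$2" ok=0 i=0 upstream
  shift 2
  [ "$#" -gt 0 ] || return 1   # need at least one upstream to rotate over
  while [ "$i" -lt "$requests" ]; do
    for upstream in "$@"; do
      [ "$i" -lt "$requests" ] || break
      if [ "$upstream" = "$knows_zone" ]; then
        ok=$((ok + 1))
      fi
      i=$((i + 1))
    done
  done
  echo "$ok/$requests"
}

simulate_round_robin 5 10.4.8.2 10.4.8.2 8.8.4.4 8.8.8.8   # prints 2/5
```

Under these assumptions roughly one request in three succeeds, which is in the same range as the two-out-of-five success rate reported in the question.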

Note: 10.4.8.2 is the default AWS nameserver.

As soon as I switched to the default "DHCP option set" provided by AWS, the EKS services could resolve my internal DNS names consistently.
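To check whether a VPC's DHCP option set carries extra resolvers like these, its domain-name-servers values can be filtered for anything other than AmazonProvidedDNS. The sketch below is a hypothetical helper (`non_amazon_dns_servers`) that works on text only. A custom entry is not wrong in itself; it matters here only when that server cannot see the private hosted zone.

```shell
# Hypothetical helper: read domain-name-servers values on stdin (separated by
# spaces, tabs, commas or newlines) and print every entry that is not
# AmazonProvidedDNS, one per line. Prints nothing if there are none.
non_amazon_dns_servers() {
  tr -s ' \t,' '\n' | sed '/^$/d' | grep -v -x 'AmazonProvidedDNS' || true
}
```

The values themselves can be read from the output of `aws ec2 describe-dhcp-options` for the option set attached to the VPC and then piped into the function, for example `echo "AmazonProvidedDNS, 8.8.8.8, 8.8.4.4" | non_amazon_dns_servers`.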

I hope this will help someone in the future.

3 answers

Answer 1
0 2019-05-13 10:04:01

Answer 2
0 2019-05-13 13:01:07

Answer 3
0 accepted 2019-05-23 13:02:09
