
kube-dns cannot resolve domain names

After installing just the basic Kubernetes packages and working with minikube, I have started just the basic kube-system pods. I'm trying to investigate why kube-dns is not able to resolve domain names.

Here are the versions I'm using.

Docker:

Client:
 Version:           18.06.1-ce
 API version:       1.38
 Go version:        go1.10.3
 Git commit:        e68fc7a
 Built:             Tue Aug 21 17:24:56 2018
 OS/Arch:           linux/amd64
 Experimental:      false

Server:
 Engine:
  Version:          18.06.1-ce
  API version:      1.38 (minimum version 1.12)
  Go version:       go1.10.3
  Git commit:       e68fc7a
  Built:            Tue Aug 21 17:23:21 2018
  OS/Arch:          linux/amd64
  Experimental:     false

minikube version: v0.28.2

Kubectl:

Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.2", GitCommit:"bb9ffb1654d4a729bb4cec18ff088eacc153c239", GitTreeState:"clean", BuildDate:"2018-08-07T23:17:28Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.0", GitCommit:"fc32d2f3698e36b93322a3465f63a14e9f0eaead", GitTreeState:"clean", BuildDate:"2018-03-26T16:44:10Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}

Kubeadm:

kubeadm version: &version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.0", GitCommit:"fc32d2f3698e36b93322a3465f63a14e9f0eaead", GitTreeState:"clean", BuildDate:"2018-03-26T16:44:10Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}

VirtualBox:

Version 5.2.18 r124319 (Qt5.6.2)

Here are the system pods I have deployed:

NAMESPACE     NAME                                    READY     STATUS    RESTARTS   AGE
default       busybox                                 1/1       Running   0          31m
kube-system   etcd-minikube                           1/1       Running   0          32m
kube-system   kube-addon-manager-minikube             1/1       Running   0          33m
kube-system   kube-apiserver-minikube                 1/1       Running   0          33m
kube-system   kube-controller-manager-minikube        1/1       Running   0          33m
kube-system   kube-dns-86f4d74b45-xjfmv               3/3       Running   2          33m
kube-system   kube-proxy-2kkzk                        1/1       Running   0          33m
kube-system   kube-scheduler-minikube                 1/1       Running   0          33m
kube-system   kubernetes-dashboard-5498ccf677-pz87g   1/1       Running   0          33m
kube-system   storage-provisioner                     1/1       Running   0          33m

I've also deployed busybox so that I can execute commands inside a container:

kubectl exec busybox -- cat /etc/resolv.conf
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local mapleworks.com
options ndots:5

and

kubectl exec busybox -- nslookup google.com
Server:    10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local

nslookup: can't resolve 'google.com'
command terminated with exit code 1
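To narrow down whether only external names fail (which would point at the upstream forwarding rather than kube-dns itself), a cluster-internal name can be queried as well — a quick check, assuming a running cluster:

```shell
# The API server Service record is always published by kube-dns,
# so this should resolve even when external lookups fail.
kubectl exec busybox -- nslookup kubernetes.default.svc.cluster.local
```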

The same commands run on the VM itself yield the following:

cat /etc/resolv.conf
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
#     DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
nameserver 127.0.1.1
search mapleworks.com  <<< our local domain

nslookup google.com
Server:     127.0.1.1
Address:    127.0.1.1#53

Non-authoritative answer:
Name:   google.com
Address: 172.217.13.174

Question: kube-dns is using the default nameserver 10.96.0.10, whereas I would have expected the VM's nameserver to have been imported into Kubernetes.

The same setup on a native Windows or Mac platform resolves domain names properly, but this VM has an issue with it.

Is this some sort of Firewall issue as I've seen mentioned in some other posts?

I have inspected the kube-dns container logs but the most relevant are from the sidecar container.

I0910 15:47:17.667100       1 main.go:51] Version v1.14.8
I0910 15:47:17.667195       1 server.go:45] Starting server (options {DnsMasqPort:53 DnsMasqAddr:127.0.0.1 DnsMasqPollIntervalMs:5000 Probes:[{Label:kubedns Server:127.0.0.1:10053 Name:kubernetes.default.svc.cluster.local. Interval:5s Type:33} {Label:dnsmasq Server:127.0.0.1:53 Name:kubernetes.default.svc.cluster.local. Interval:5s Type:33}] PrometheusAddr:0.0.0.0 PrometheusPort:10054 PrometheusPath:/metrics PrometheusNamespace:kubedns})
I0910 15:47:17.667240       1 dnsprobe.go:75] Starting dnsProbe {Label:kubedns Server:127.0.0.1:10053 Name:kubernetes.default.svc.cluster.local. Interval:5s Type:33}
I0910 15:47:17.668244       1 dnsprobe.go:75] Starting dnsProbe {Label:dnsmasq Server:127.0.0.1:53 Name:kubernetes.default.svc.cluster.local. Interval:5s Type:33}
W0910 15:50:04.780281       1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:34535->127.0.0.1:53: i/o timeout
W0910 15:50:11.781236       1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:50887->127.0.0.1:53: i/o timeout
W0910 15:50:24.844065       1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:52865->127.0.0.1:53: i/o timeout
W0910 15:50:31.845587       1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:42053->127.0.0.1:53: i/o timeout

The i/o timeouts appear to correspond to the manual DNS queries I performed against google.com.

Otherwise I only see the localhost address and port 53 here.

I just don't know what is going on...
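For reference, the kube-dns pod runs three containers, and each can be inspected separately (pod name taken from the listing above):

```shell
# Pull logs per container: kubedns serves cluster records,
# dnsmasq forwards external queries, sidecar probes both.
kubectl -n kube-system logs kube-dns-86f4d74b45-xjfmv -c kubedns
kubectl -n kube-system logs kube-dns-86f4d74b45-xjfmv -c dnsmasq
kubectl -n kube-system logs kube-dns-86f4d74b45-xjfmv -c sidecar
```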

Each kubelet in a k8s cluster has a --cluster-dns option. This option points to the Service IP of the kube-dns Deployment. Each kube-dns Pod, in turn, has a dnsmasq container, which uses the list of nameservers from the k8s node. You can check this in the dnsmasq container's logs:

I0720 03:49:51.081031       1 nanny.go:116] dnsmasq[13]: reading /etc/resolv.conf
I0720 03:49:51.081068       1 nanny.go:116] dnsmasq[13]: using nameserver 127.0.0.1#10053 for domain ip6.arpa 
I0720 03:49:51.081099       1 nanny.go:116] dnsmasq[13]: using nameserver 127.0.0.1#10053 for domain in-addr.arpa 
I0720 03:49:51.081130       1 nanny.go:116] dnsmasq[13]: using nameserver 127.0.0.1#10053 for domain cluster.local 
I0720 03:49:51.081160       1 nanny.go:116] dnsmasq[13]: using nameserver <nameserver_1>#53
I0720 03:49:51.081190       1 nanny.go:116] dnsmasq[13]: using nameserver <nameserver_2>#53
I0720 03:49:51.081222       1 nanny.go:116] dnsmasq[13]: using nameserver <nameserver_N>#53

When any Pod is created, by default it gets a nameserver <CLUSTER_DNS_IP> entry in /etc/resolv.conf. That's how any Pod can (or cannot) resolve a given domain name: through the kube-dns service.
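The cluster DNS IP that Pods receive can be cross-checked against the kube-dns Service itself — a quick sketch, assuming access to the cluster:

```shell
# The nameserver line in each Pod's /etc/resolv.conf should match
# the ClusterIP of the kube-dns Service in kube-system:
kubectl -n kube-system get svc kube-dns -o jsonpath='{.spec.clusterIP}'
```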

For example, my cluster-dns is 10.233.0.3:

$ kubectl -n test run -it --image=alpine:3.6 alpine -- sh                                                                      
If you don't see a command prompt, try pressing enter.
/ # cat /etc/resolv.conf 
nameserver 10.233.0.3
search test.svc.cluster.local svc.cluster.local cluster.local test.kz
/ # nslookup kubernetes-charts.storage.googleapis.com 10.233.0.3
Server:    10.233.0.3
Address 1: 10.233.0.3 kube-dns.kube-system.svc.cluster.local

Name:      kubernetes-charts.storage.googleapis.com
Address 1: 74.125.131.128 lu-in-f128.1e100.net
Address 2: 2a00:1450:4010:c05::80 li-in-x80.1e100.net

So, if a Node (the one the kube-dns Pod is scheduled to) can resolve a certain domain name, then any Pod can do the same.

Check the ConfigMap for your kube-dns server. Do you have upstreamNameservers: | configured? More info: https://kubernetes.io/docs/tasks/administer-cluster/dns-custom-nameservers/
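For illustration, a minimal kube-dns ConfigMap that forces specific upstream resolvers might look like the following (the 8.8.8.8/8.8.4.4 values are placeholders — substitute your own DNS servers):

```shell
kubectl apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-dns
  namespace: kube-system
data:
  upstreamNameservers: |
    ["8.8.8.8", "8.8.4.4"]
EOF
```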

I managed to resolve this with help from a colleague. It turns out that there is a difference between the Desktop and Server versions of Ubuntu installations. On a server, /etc/network/interfaces lists the primary interface, which blocks the NetworkManager process from running a local dnsmasq service.

When I added the following lines to this file on a Desktop installation:

#Primary Network Interfaces
auto enp0s3
iface enp0s3 inet dhcp

then the dnsmasq in kube-dns was passed the upstream nameserver addresses and was able to resolve DNS requests.

Here are the NetworkManager processes running after the change:

gilles@gilles-VirtualBox:~$ ps -ef | grep Network
root       870     1  0 16:52 ?        00:00:00 /usr/sbin/NetworkManager --no-daemon
gilles    6991  5316  0 16:55 pts/17   00:00:00 grep --color=auto Network

Here are the dnsmasq container logs after the change:

I0911 20:52:47.878050       1 nanny.go:116] dnsmasq[10]: using nameserver 127.0.0.1#10053 for domain ip6.arpa 
I0911 20:52:47.878063       1 nanny.go:116] dnsmasq[10]: using nameserver 127.0.0.1#10053 for domain in-addr.arpa 
I0911 20:52:47.878070       1 nanny.go:116] dnsmasq[10]: using nameserver 127.0.0.1#10053 for domain cluster.local 
I0911 20:52:47.878080       1 nanny.go:116] dnsmasq[10]: reading /etc/resolv.conf
I0911 20:52:47.878086       1 nanny.go:116] dnsmasq[10]: using nameserver 127.0.0.1#10053 for domain ip6.arpa 
I0911 20:52:47.878092       1 nanny.go:116] dnsmasq[10]: using nameserver 127.0.0.1#10053 for domain in-addr.arpa 
I0911 20:52:47.878097       1 nanny.go:116] dnsmasq[10]: using nameserver 127.0.0.1#10053 for domain cluster.local 
I0911 20:52:47.878103       1 nanny.go:116] dnsmasq[10]: using nameserver 172.28.1.3#53
I0911 20:52:47.878109       1 nanny.go:116] dnsmasq[10]: using nameserver 172.28.1.4#53

The last two lines were present only after the change.

And then

kubectl exec busybox -- nslookup google.com
Server:    10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local

Name:      google.com
Address 1: 2607:f8b0:4020:804::200e yul02s04-in-x0e.1e100.net
Address 2: 172.217.13.110 yul02s04-in-f14.1e100.net

I hope this can be of value to others.

This seems like an issue with kube-dns's connection to the local host's DNS. You can manually configure kube-dns (CoreDNS in recent versions) not to use the host's DNS and instead query an external DNS server directly.

Edit the CoreDNS configuration:

kubectl -n kube-system edit configmap coredns

Change the line:

 forward . /etc/resolv.conf {

to:

 forward . 8.8.8.8 {

Restart the CoreDNS pods:

kubectl --namespace=kube-system delete pod -l k8s-app=kube-dns 
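Once the pods are recreated, resolution can be re-checked from a throwaway pod — a quick sketch, assuming a running cluster:

```shell
# busybox:1.28 is commonly used for this because nslookup is
# broken in several later busybox images.
kubectl run -it --rm dnstest --image=busybox:1.28 --restart=Never \
  -- nslookup google.com
```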

For more details, see the post https://runkiss.blogspot.com/2021/01/kubernetes-coredns-external-resolving.html
