简体   繁体   中英

GKE DNS resolution errors

We use Kubernetes cronjobs on GKE (version 1.9) for running several periodic tasks. From the pods, we need to make several calls to external API outside our network. Often (but not all the time), these calls fail because of DNS resolution timeouts.

The current hypothesis I have is that the upstream DNS server for the service we are trying to contact is rate limiting the requests where we make lots of repeated DNS requests because the TTL for those records was either too low or just because we dropped those entries from dnsmasq cache due to low cache size.

I tried editing the kube-dns deployment to change the cache size and ttl arguments passed to dnsmasq container, but the changes get reverted because it's a managed deployment by GKE. Is there a way to persist these changes so that GKE does not overwrite them? Any other ideas to deal with dns issues on GKE or Kubernetes engine in general?

Not sure if all knobs are covered, but if you update the ConfigMap used by the deployment you should be able to reconfigure KubeDNS on GKE. It will use the ConfigMap when deploying new instances. Then nuke the existing pods to redeploy them with the new config.

我建议您使用像KubeDNS这样的ExternalDNS Pod,它从Kubernetes API检索资源列表(服务,入口等),以确定所需的DNS记录列表。

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM