# coredns service connection timeout (connection timed out; no servers could be reached)

## Resolving kube-dns from a node host times out

```
dig @10.96.0.10 kube-dns.kube-system.svc.cluster.local
# Output:
; <<>> DiG 9.11.26-RedHat-9.11.26-6.el8 <<>> @10.96.0.10 kube-dns.kube-system.svc.cluster.local
; (1 server found)
;; global options: +cmd
;; connection timed out; no servers could be reached
```

## Edit the coredns ConfigMap and check the coredns logs

```
kubectl -n kube-system edit cm coredns
```

Add the `log` plugin to the Corefile:

```yaml
apiVersion: v1
data:
  Corefile: |
    .:53 {
        log
        errors
        health {
            lameduck 5s
        }
        ...
```

After saving, it can take a minute or two for Kubernetes to propagate the change to the CoreDNS pods. CoreDNS then reloads the configuration automatically (assuming the Corefile enables the `reload` plugin). Check the logs of each coredns pod, for example:

```
kubectl logs coredns-5f9bcf7c57-hbq5z -n kube-system --tail 100
```

The output looks like the following, with nothing abnormal:

```
[INFO] 10.128.8.110:40481 - 26090 "A IN kafka-zk-zookeeper-headless.zookeeper.svc.cluster.local.svc.cluster.local. udp 91 false 512" NXDOMAIN qr,aa,rd 184 0.000058542s
[INFO] 10.128.8.110:33032 - 61678 "A IN kafka-zk-zookeeper-headless.zookeeper.svc.cluster.local. udp 73 false 512" NOERROR qr,aa,rd 144 0.000067122s
[INFO] 10.128.3.156:54546 - 33121 "A IN dev-mgo-nft-mongodb.nft.svc.nft.svc.cluster.local. udp 67 false 512" NXDOMAIN qr,aa,rd 160 0.000404809s
[INFO] 10.128.3.156:35509 - 17309 "A IN dev-mgo-nft-mongodb.nft.svc.svc.cluster.local. udp 63 false 512" NXDOMAIN qr,aa,rd 156 0.000294607s
[INFO] 10.128.3.156:41908 - 8575 "AAAA IN dev-mgo-nft-mongodb.nft.svc.cluster.local. udp 59 false 512" NOERROR qr,aa,rd 152 0.000117004s
[INFO] 10.128.8.110:59090 - 31404 "AAAA IN kafka-zk-zookeeper-headless.zookeeper.svc.cluster.local.svc.cluster.local. udp 91 false 512" NXDOMAIN qr,aa,rd 184 0.000143273s
[INFO] 10.128.8.110:45311 - 21795 "AAAA IN kafka-zk-zookeeper-headless.zookeeper.svc.cluster.local. udp 73 false 512" NOERROR qr,aa,rd 166 0.000079302s
[INFO] 10.128.8.110:42297 - 55594 "A IN kafka-zk-zookeeper-headless.zookeeper.svc.cluster.local.svc.cluster.local. udp 91 false 512" NXDOMAIN qr,aa,rd 184 0.000044621s
[INFO] 10.128.8.110:58692 - 11604 "A IN kafka-zk-zookeeper-headless.zookeeper.svc.cluster.local. udp 73 false 512" NOERROR qr,aa,rd 144 0.000041031s
```

## Check the svc and endpoints

```
# kubectl -n kube-system get endpoints kube-dns
NAME       ENDPOINTS                                                   AGE
kube-dns   10.128.0.58:53,10.128.6.152:53,10.128.0.58:53 + 3 more...   15h
[root@master1 ~]# kubectl -n kube-system describe endpoints kube-dns
Name:         kube-dns
Namespace:    kube-system
Labels:       k8s-app=kube-dns
              kubernetes.io/cluster-service=true
              kubernetes.io/name=CoreDNS
Annotations:  endpoints.kubernetes.io/last-change-trigger-time: 2022-11-24T10:57:23Z
Subsets:
  Addresses:          10.128.0.58,10.128.6.152
  NotReadyAddresses:
  Ports:
    Name     Port  Protocol
    ----     ----  --------
    dns-tcp  53    TCP
    dns      53    UDP
    metrics  9153  TCP

Events:
```

Query the pod IPs directly with dig:

```
dig @10.128.0.58 kube-dns.kube-system.svc.cluster.local
dig @10.128.6.152 kube-dns.kube-system.svc.cluster.local
```

Both queries return normal answers, which shows that the coredns pods themselves are working:

```
; <<>> DiG 9.11.26-RedHat-9.11.26-6.el8 <<>> @10.128.0.58 kube-dns.kube-system.svc.cluster.local
; (1 server found)
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 41299
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: 75dd2e6ddd4a4454 (echoed)
;; QUESTION SECTION:
;kube-dns.kube-system.svc.cluster.local. IN A

;; ANSWER SECTION:
kube-dns.kube-system.svc.cluster.local. 30 IN A 10.96.0.10

;; Query time: 0 msec
;; SERVER: 10.128.0.58#53(10.128.0.58)
;; WHEN: Fri Nov 25 10:18:28 CST 2022
;; MSG SIZE rcvd: 133
```

## Test connectivity with telnet

Using telnet from the master host and from a node to test port 53 on 10.96.0.10 succeeds in both cases. Note that telnet only exercises TCP, while dig queries over UDP by default, so this result does not contradict the earlier timeout.

```
telnet 10.96.0.10 53
# Output:
Trying 10.96.0.10...
Connected to 10.96.0.10.
Escape character is '^]'.
Connection closed by foreign host.
```

## Fixing the problem by relating the error to the recently upgraded flannel configuration

According to what I found when searching for this error, an overlap between the pod network and the service network causes routing problems. I suspected the issue was related to my recent upgrade of the flannel CNI to v0.20.1.

The flannel Network configuration was:

```yaml
net-conf.json: |
  {
    "Network": "10.128.0.0/16",
    "Backend": {
      "Type": "vxlan"
    }
  }
```

The networking section of kubeadm-config was:

```yaml
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
  podSubnet: 10.128.0.0/16
```

The serviceSubnet in kubeadm-config is not within the flannel Network range, so I changed the flannel Network configuration to:

```yaml
net-conf.json: |
  {
    "Network": "10.0.0.0/8",
    "Backend": {
      "Type": "vxlan"
    }
  }
```

Restart the flannel DaemonSet:

```
kubectl rollout restart ds kube-flannel-ds -n kube-flannel
```

Test the kube-dns name with dig again:

```
dig @10.96.0.10 kube-dns.kube-system.svc.cluster.local
```

It now resolves correctly:

```
; <<>> DiG 9.11.26-RedHat-9.11.26-6.el8 <<>> @10.96.0.10 kube-dns.kube-system.svc.cluster.local
; (1 server found)
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 20581
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: a7918abdce60333a (echoed)
;; QUESTION SECTION:
;kube-dns.kube-system.svc.cluster.local. IN A

;; ANSWER SECTION:
kube-dns.kube-system.svc.cluster.local. 30 IN A 10.96.0.10

;; Query time: 0 msec
;; SERVER: 10.96.0.10#53(10.96.0.10)
;; WHEN: Fri Nov 25 10:33:40 CST 2022
;; MSG SIZE rcvd: 133
```
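The subnet relationships discussed in the flannel section (the flannel `Network`, `serviceSubnet`, and `podSubnet`) can be double-checked with a few lines of Python. This is a minimal sketch using only the standard-library `ipaddress` module; the constant names and the `describe` helper are hypothetical, and the CIDR values are copied from the configuration shown in this page. It verifies address arithmetic only and does not by itself explain why the flannel change restored routing.

```python
import ipaddress

# Values copied from the kubeadm-config and flannel net-conf.json in this page.
SERVICE_SUBNET = ipaddress.ip_network("10.96.0.0/12")
POD_SUBNET = ipaddress.ip_network("10.128.0.0/16")
FLANNEL_OLD = ipaddress.ip_network("10.128.0.0/16")  # Network before the change
FLANNEL_NEW = ipaddress.ip_network("10.0.0.0/8")     # Network after the change
KUBE_DNS_IP = ipaddress.ip_address("10.96.0.10")     # kube-dns ClusterIP


def describe(flannel_network):
    """Report which cluster ranges fall inside a given flannel Network (hypothetical helper)."""
    return {
        "covers_pod_subnet": POD_SUBNET.subnet_of(flannel_network),
        "covers_service_subnet": SERVICE_SUBNET.subnet_of(flannel_network),
        "covers_kube_dns_ip": KUBE_DNS_IP in flannel_network,
    }


if __name__ == "__main__":
    # The pod and service ranges themselves are disjoint.
    print("service/pod overlap:", SERVICE_SUBNET.overlaps(POD_SUBNET))
    print("old flannel Network:", describe(FLANNEL_OLD))
    print("new flannel Network:", describe(FLANNEL_NEW))
```

With these values, the old flannel `Network` covers only the pod subnet, while `10.0.0.0/8` covers the pod subnet, the service subnet, and the kube-dns ClusterIP, which matches the observation that `serviceSubnet` was outside the original flannel range.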