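Troubleshoot DNS issues in GKE

This page shows how to resolve issues with DNS providers in Google Kubernetes Engine (GKE) clusters.

Cloud DNS for GKE events

This section describes common Cloud DNS issues in GKE.

Cloud DNS disabled

The following event occurs when the Cloud DNS API is disabled: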

Warning   FailedPrecondition        service/default-http-backend
Failed to send requests to Cloud DNS: Cloud DNS API Disabled. Please enable the Cloud DNS API in your project PROJECT_NAME: Cloud DNS API has not been used in project PROJECT_NUMBER before or it is disabled. Enable it by visiting https://console.developers.google.com/apis/api/dns.googleapis.com/overview?project=PROJECT_NUMBER then retry. If you enabled this API recently, wait a few minutes for the action to propagate to our systems and retry.

This error occurs because the Cloud DNS API is not enabled by default. You must enable the Cloud DNS API manually.

To resolve this issue, enable the Cloud DNS API.
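If you prefer to enable the API from a script rather than the console, one option is to call the gcloud CLI. The following is a minimal sketch, assuming the gcloud CLI is installed and authenticated. The helper names `build_enable_command` and `enable_cloud_dns_api` and the project ID `my-project` are hypothetical and used only for illustration.

```python
import subprocess
from typing import List


def build_enable_command(project_id: str) -> List[str]:
    """Return the gcloud command that enables the Cloud DNS API for a project."""
    return [
        "gcloud", "services", "enable", "dns.googleapis.com",
        "--project", project_id,
    ]


def enable_cloud_dns_api(project_id: str) -> None:
    """Run the command; raises CalledProcessError if gcloud exits with an error."""
    subprocess.run(build_enable_command(project_id), check=True)


if __name__ == "__main__":
    # "my-project" is a placeholder; replace it with your own project ID.
    print(" ".join(build_enable_command("my-project")))
```

As the event message notes, a newly enabled API can take a few minutes to propagate, so you might need to wait before the events stop appearing.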

Failed to send requests to Cloud DNS: API rate limit exceeded

The following event occurs when a project exceeds a Cloud DNS quota or limit:

kube-system   27s         Warning   InsufficientQuota
managedzone/gke-cluster-quota-ee1bd2ca-dns     Failed to send requests to Cloud DNS: API rate limit exceeded. Contact Google Cloud support team to request a quota increase for your project PROJECT_NAME: Quota exceeded for quota metric 'Write requests' and limit 'Write limit for a minute for a region' of service 'dns.googleapis.com' for consumer 'project_number:PROJECT_NUMBER.

To resolve this issue, review the Cloud DNS quotas and the Compute Engine quotas and limits. You can request a quota increase by using the Google Cloud console.
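To decide which quota to review, it can help to pull the quota metric and limit names out of the event message. The following is a rough sketch that assumes the message keeps the wording shown in the preceding event. The pattern and the function name `parse_quota_error` are illustrative assumptions, not part of any Google Cloud tool.

```python
import re
from typing import Dict, Optional

# Matches the quota details as worded in the sample event above.
QUOTA_PATTERN = re.compile(
    r"Quota exceeded for quota metric '(?P<metric>[^']+)' "
    r"and limit '(?P<limit>[^']+)' "
    r"of service '(?P<service>[^']+)'"
)


def parse_quota_error(message: str) -> Optional[Dict[str, str]]:
    """Return the metric, limit, and service named in a quota event, or None."""
    match = QUOTA_PATTERN.search(message)
    return match.groupdict() if match else None


if __name__ == "__main__":
    sample = (
        "Failed to send requests to Cloud DNS: API rate limit exceeded. "
        "Quota exceeded for quota metric 'Write requests' and limit "
        "'Write limit for a minute for a region' of service "
        "'dns.googleapis.com' for consumer 'project_number:PROJECT_NUMBER."
    )
    print(parse_quota_error(sample))
```

For the sample event, the extracted metric is `Write requests`, which points to the per-minute, per-region write limit of `dns.googleapis.com`.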

Failed to send requests to Cloud DNS due to a previous error

The following event occurs when an error causes cascading failures:

kube-system   27s         Warning   InsufficientQuota
managedzone/gke-cluster-quota-ee1bd2ca-dns     Failed to send requests to Cloud DNS: API rate limit exceeded. Contact Google Cloud support team to request a quota increase for your project PROJECT_NAME: Quota exceeded for quota metric 'Write requests' and limit 'Write limit for a minute for a region' of service 'dns.googleapis.com' for consumer 'project_number:PROJECT_NUMBER.
kube-system   27s         Warning   FailedPrecondition               service/default-http-backend                         Failed to send requests to Cloud DNS due to a previous error. Please check the cluster events.

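To resolve this issue, check the cluster events to find the source of the original error, and then follow the instructions for resolving that root issue.

In the preceding example, the InsufficientQuota error for the managed zone triggered the cascading failure. The second error, FailedPrecondition, indicates that a previous error occurred, which is the initial insufficient quota issue. To resolve this example, follow the guidance for the Cloud DNS quota error.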
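When many events are present, separating the original errors from the follow-on ones makes the root cause easier to spot. The following is a minimal sketch that assumes each event occupies a single line of text, for example the output of `kubectl get events --all-namespaces` saved to a file. The function name `split_cloud_dns_events` is hypothetical, and the matching relies only on the message wording shown in the preceding events.

```python
from typing import Iterable, List, Tuple

CLOUD_DNS_MARKER = "Failed to send requests to Cloud DNS"
FOLLOW_ON_MARKER = "due to a previous error"


def split_cloud_dns_events(lines: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split Cloud DNS events into (root_causes, follow_on_errors)."""
    root_causes: List[str] = []
    follow_on: List[str] = []
    for line in lines:
        line = line.strip()
        if CLOUD_DNS_MARKER not in line:
            continue  # Not a Cloud DNS event; ignore it.
        if FOLLOW_ON_MARKER in line:
            follow_on.append(line)  # Only reports that an earlier error exists.
        else:
            root_causes.append(line)  # Carries the actual reason.
    return root_causes, follow_on


if __name__ == "__main__":
    events = [
        "kube-system 27s Warning InsufficientQuota managedzone/gke-cluster-quota-ee1bd2ca-dns "
        "Failed to send requests to Cloud DNS: API rate limit exceeded.",
        "kube-system 27s Warning FailedPrecondition service/default-http-backend "
        "Failed to send requests to Cloud DNS due to a previous error. "
        "Please check the cluster events.",
    ]
    roots, follow_ons = split_cloud_dns_events(events)
    for event in roots:
        print("root cause:", event)
```

Events in the first list are the ones to troubleshoot; the second list should clear once the root cause is resolved.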

What's next