This page shows you how to resolve issues related to load balancing in Google Kubernetes Engine (GKE) clusters using Service, Ingress, or Gateway resources.
External Ingress produces HTTP 502 errors
Use the following guidance to troubleshoot HTTP 502 errors with external Ingress resources:
- Enable logs for each backend service associated with each GKE Service that is referenced by the Ingress.
- Use status details to identify causes for HTTP 502 responses. Status details that indicate the HTTP 502 response originated from the backend require troubleshooting within the serving Pods, not the load balancer.
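As a sketch of how to do this with `gcloud` (the backend service name is a placeholder; you can find it on the load balancer's details page), enable logging on the backend service and then query recent 502 entries for their status details:

```shell
# Enable request logging on the backend service behind the external Ingress.
# BACKEND_SERVICE_NAME is a placeholder.
gcloud compute backend-services update BACKEND_SERVICE_NAME \
  --global \
  --enable-logging \
  --logging-sample-rate=1.0

# Read recent HTTP 502 log entries and print their statusDetails field.
gcloud logging read \
  'resource.type="http_load_balancer" AND httpRequest.status=502' \
  --limit=10 \
  --format='value(jsonPayload.statusDetails)'
```

Status details such as `backend_connection_closed_before_data_sent_to_client` point at the serving Pods rather than the load balancer.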
Unmanaged instance groups
You might experience HTTP 502 errors with external Ingress resources if your external Ingress uses unmanaged instance group backends. This issue occurs when all of the following conditions are met:
- The cluster has a large total number of nodes among all node pools.
- The serving Pods for one or more Services that are referenced by the Ingress are located on only a few nodes.
- Services referenced by the Ingress use `externalTrafficPolicy: Local`.
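To check the third condition, you can list each Service's external traffic policy with kubectl (a sketch using standard Kubernetes Service fields):

```shell
# List every Service's external traffic policy; Services showing "Local"
# route external traffic only to nodes that run their serving Pods.
kubectl get services --all-namespaces \
  -o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,POLICY:.spec.externalTrafficPolicy'
```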
To determine if your external Ingress uses unmanaged instance group backends, do the following:
1. Go to the Ingress page in the Trusted Cloud console.
2. Click the name of your external Ingress.
3. Click the name of the Load balancer. The Load balancing details page displays.
4. Check the table in the Backend services section to determine if your external Ingress uses NEGs or instance groups.
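You can also check from the cluster side. GKE records NEG status on Services that use NEG backends in the `cloud.google.com/neg-status` annotation (a sketch; `SERVICE_NAME` is a placeholder):

```shell
# If this prints NEG details, the Service uses NEG backends;
# empty output suggests the Ingress uses instance group backends.
kubectl get service SERVICE_NAME \
  -o jsonpath='{.metadata.annotations.cloud\.google\.com/neg-status}'
```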
To resolve this issue, use one of the following solutions:
- Use a VPC-native cluster.
- Use `externalTrafficPolicy: Cluster` for each Service referenced by the external Ingress. With this setting, the original client IP address is not preserved in the packet source.
- Add the `node.kubernetes.io/exclude-from-external-load-balancers=true` label to the nodes or node pools that don't run any serving Pods for any Service referenced by an external Ingress or `LoadBalancer` Service in your cluster.
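For the last option, a minimal sketch that labels all nodes in one node pool, assuming a pool named `NODE_POOL_NAME` (a placeholder) that runs no serving Pods:

```shell
# Exclude every node in the given node pool from external load balancer
# backends. The cloud.google.com/gke-nodepool label identifies the pool.
kubectl label nodes \
  -l cloud.google.com/gke-nodepool=NODE_POOL_NAME \
  node.kubernetes.io/exclude-from-external-load-balancers=true
```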
Use load balancer logs to troubleshoot
You can use internal passthrough Network Load Balancer logs and external passthrough Network Load Balancer logs to troubleshoot issues with load balancers and correlate traffic from load balancers to GKE resources.
Logs are aggregated per-connection and exported in near real time. Logs are generated for each GKE node involved in the data path of a LoadBalancer Service, for both ingress and egress traffic. Log entries include additional fields for GKE resources, such as:
- Cluster name
- Cluster location
- Service name
- Service namespace
- Pod name
- Pod namespace
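Logging for a passthrough Network Load Balancer is enabled on its backend service. A sketch for an internal passthrough load balancer, where `BACKEND_SERVICE_NAME` and `REGION` are placeholders for the backend service that GKE created for your `LoadBalancer` Service:

```shell
# Enable connection logging on the regional backend service that backs
# the internal passthrough Network Load Balancer.
gcloud compute backend-services update BACKEND_SERVICE_NAME \
  --region=REGION \
  --enable-logging
```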
Use diagnostic tools to troubleshoot
The `check-gke-ingress` diagnostic tool inspects Ingress resources for common misconfigurations. You can use the `check-gke-ingress` tool in the following ways:
- Run the `gcpdiag` command-line tool on your cluster. Ingress results appear in the `gke/ERR/2023_004` check rule section.
- Use the `check-gke-ingress` tool alone or as a kubectl plugin by following the instructions in check-gke-ingress.
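For the first option, a sketch of running `gcpdiag` against the project that hosts your cluster (`PROJECT_ID` is a placeholder):

```shell
# Run gcpdiag lint checks; Ingress misconfiguration findings are
# reported under the gke/ERR/2023_004 rule in the output.
gcpdiag lint --project=PROJECT_ID
```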