GKE release notes (new features)

This page documents new features in Google Kubernetes Engine (GKE). Check back periodically for new feature announcements. The overall release notes also include the information on this page.

You can see the latest product updates for all of Trusted Cloud by S3NS on the Trusted Cloud page, browse and filter all release notes in the Trusted Cloud console, or programmatically access release notes in BigQuery.

To get the latest product updates delivered to you, add the URL of this page to your feed reader, or add the feed URL directly.

May 20, 2025

In GKE version 1.32.3-gke.1927002 and later, GKE uses a container-optimized compute platform for the general-purpose Autopilot compute class. This platform improves Pod scheduling latency, especially during autoscaling operations. The container-optimized compute platform provides benefits like faster scaling reaction times and more precise capacity right-sizing. For more information about the general-purpose compute class, see About built-in compute classes in Autopilot clusters.

May 13, 2025

GKE now provides insights and recommendations that help you to identify and troubleshoot clusters with Custom Resource Definitions that contain an invalid or malformed Certificate Authority bundle, which might disrupt cluster operations. Implementing the recommendation helps you to keep your clusters stable and performant.

May 12, 2025

In GKE version 1.33 and later, the Compute Engine persistent disk CSI Driver supports provisioning Hyperdisk Balanced High Availability volumes in the ReadWriteOnce, ReadWriteOncePod, and ReadWriteMany access modes. For more information, see Provisioning Hyperdisk Balanced High Availability volumes.
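
For illustration, here's a minimal sketch of how such a volume might be requested. The provisioner follows the Compute Engine Persistent Disk CSI driver convention; the exact type string and names are assumptions to verify against the linked documentation:

    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: hyperdisk-balanced-ha    # hypothetical name
    provisioner: pd.csi.storage.gke.io
    parameters:
      type: hyperdisk-balanced-high-availability    # assumed type string; see the linked docs
    volumeBindingMode: WaitForFirstConsumer
    ---
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: ha-volume-claim
    spec:
      storageClassName: hyperdisk-balanced-ha
      accessModes:
      - ReadWriteMany    # ReadWriteOnce and ReadWriteOncePod are also supported
      resources:
        requests:
          storage: 100Gi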

May 08, 2025

In GKE version 1.32 and later, GKE Sandbox (gVisor) can now be configured with SYS_ADMIN privileges in GKE Autopilot. This lets you use Docker-in-Docker with gVisor in GKE Autopilot.
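
As a hedged sketch, a Pod that runs in GKE Sandbox and requests SYS_ADMIN might look like the following; the gvisor RuntimeClass name matches existing GKE Sandbox usage, and the Pod name and image are illustrative:

    apiVersion: v1
    kind: Pod
    metadata:
      name: sandboxed-dind    # hypothetical name
    spec:
      runtimeClassName: gvisor    # run the Pod in GKE Sandbox
      containers:
      - name: docker
        image: docker:dind      # illustrative Docker-in-Docker image
        securityContext:
          capabilities:
            add: ["SYS_ADMIN"]  # newly permitted with GKE Sandbox on Autopilot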

April 03, 2025

GKE now provides insights and recommendations that help you identify workloads without resource requests or limits so that you can specify the resource needs for these workloads. Configuring CPU and memory requests and limits for containers is the best practice for improving reliability and performance, and is a necessary prerequisite for understanding and optimizing resource utilization by your workloads and their cost.
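
For example, a container with explicit requests and limits (names and values are illustrative):

    apiVersion: v1
    kind: Pod
    metadata:
      name: web    # hypothetical name
    spec:
      containers:
      - name: app
        image: nginx
        resources:
          requests:
            cpu: 250m        # the scheduler reserves this much CPU
            memory: 256Mi
          limits:
            cpu: 500m        # the container is throttled above this
            memory: 512Mi    # the container is OOM-killed above this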

April 02, 2025

Automatic application monitoring is now generally available in GKE versions 1.28 and later. When configured on GKE clusters, this feature automatically collects key metrics with Google Cloud Managed Service for Prometheus and provides out-of-the-box dashboards for monitoring the supported workloads. Automatic application monitoring supports six new AI model servers (NVIDIA Triton, vLLM, TGI, JetStream, TorchServe, and TensorFlow Serving). For more information, see Configure automatic application monitoring.

March 28, 2025

In version 1.32.1-gke.1729000 and later, you can customize specific kubelet and Linux kernel parameters like sysctls and huge pages by using the nodeSystemConfig field in your GKE compute classes. Additionally, you can now specify default values for fields that are omitted in individual rules in a compute class by using the priorityDefaults field. For details, see About custom compute classes.
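
A minimal sketch of how these fields might appear in a compute class manifest. The field names come from this note, but their exact nesting and the values shown are assumptions to verify in About custom compute classes:

    apiVersion: cloud.google.com/v1
    kind: ComputeClass
    metadata:
      name: tuned-class    # hypothetical name
    spec:
      priorityDefaults:    # defaults for fields omitted in individual rules
        spot: true         # illustrative: rules default to Spot capacity
      priorities:
      - machineFamily: n2
      nodeSystemConfig:    # kubelet and Linux kernel customization
        linuxConfig:
          sysctl:
            net.core.somaxconn: "4096"    # illustrative sysctl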

March 21, 2025

In GKE version 1.32.2-gke.1652000 and later, new external LoadBalancer Services use zonal Network Endpoint Group (NEG) backends by default. This applies only to new backend service-based external LoadBalancer Services. Existing LoadBalancer Services are not affected. To learn more, see Create a backend service-based external load balancer.
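
No manifest changes are needed to adopt the new backends; a plain external LoadBalancer Service like the following now gets zonal NEG backends by default on the listed versions (names are illustrative):

    apiVersion: v1
    kind: Service
    metadata:
      name: web-lb    # hypothetical name
    spec:
      type: LoadBalancer    # external passthrough load balancer
      selector:
        app: web
      ports:
      - port: 80
        targetPort: 8080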

Starting in GKE version 1.32.1-gke.1729000, Autopilot clusters automatically use the new Performance HPA profile. This profile enables faster autoscaling on CPU and memory metrics for up to 1,000 HorizontalPodAutoscaler objects by routing autoscaling metrics through the gke-metrics-agent DaemonSet. If needed, you can revert to the previous autoscaling behavior by disabling the Performance HPA profile.

March 07, 2025

In GKE version 1.31.5-gke.1090000 or later, or version 1.32.1-gke.1260000 or later, you can enable logging of Horizontal Pod Autoscaler decisions. These logs include atomic recommendations (based on individual metrics) and final recommendations (consolidated HPA decisions). The logs are stored in Cloud Logging and offer insight into the Horizontal Pod Autoscaler's decision-making process.

You can now monitor the startup latency of Kubernetes workloads and nodes using the new Startup Latency dashboard, available in the Observability tab on the Deployment details and Cluster details pages in the GKE console. The dashboard is useful for tracking, troubleshooting, and optimizing the startup latency of your GKE workloads.

March 04, 2025

The europe-north2 region in Stockholm, Sweden, is now available. For more information, see Global Locations.

February 28, 2025

New recommendations of the NODE_SA_MISSING_PERMISSIONS subtype have been added to the portfolio of GKE recommendations. Use them to identify clusters whose node service accounts are missing IAM permissions that are critical for normal cluster operations.

If your organization has a policy that disables automatic role grants to default service accounts, the default GKE node service account is created without the necessary permissions. Missing critical permissions can degrade essential cluster operations, such as logging and monitoring.

February 27, 2025

The GKE Autopilot partner program now lets partners create and manage allowlists that correspond to specific partner workloads. In GKE version 1.32.1-gke.1729000 and later, you can explicitly install allowlists in your clusters to run only the partner solutions that you need.

To learn more, see Run privileged workloads from GKE Autopilot partners.

February 25, 2025

Three new metrics are added for checking node and node pool status:

  • kubernetes.io/node/status_condition: The condition of a node, from the node's status condition field. The Ready condition reports Unknown status if the node controller hasn't heard from the node within the node-monitor-grace-period window. This metric is available for clusters with GKE version 1.32.1-gke.1260000 and later.

  • kubernetes.io/node_pool/multi_host/available: Whether a multi-host node pool is available. The value is True when all nodes in the node pool are available, and False if any node is unavailable. This metric is available for multi-host TPU node pools only.

  • kubernetes.io/node_pool/status: The current status of the node pool from the NodePool instance. Status updates happen after GKE API operations complete. This metric is available for multi-host TPU node pools only.

February 20, 2025

GKE Managed NVIDIA Data Center GPU Manager (DCGM) Metrics Package is now generally available for both GKE Standard and Autopilot clusters running version 1.32.0-gke.1764000 and later. You can enable the feature through the console, the gcloud CLI, or Terraform. Starting with cluster version 1.32.1-gke.1357000, GKE Managed NVIDIA DCGM is enabled by default for new clusters.

GKE Managed DCGM provides a curated set of metrics for monitoring the utilization, performance, and health of NVIDIA GPUs. These metrics are collected by Google Cloud Managed Service for Prometheus and you can view the metric charts in the Observability tab on the Kubernetes Clusters page or in Cloud Monitoring. For more information, see Collect and view DCGM metrics.

GKE automatically adds the following resource labels to node pools:

  • goog-gke-accelerator-type: The accelerator type used in the node pool.
  • goog-gke-tpu-node-pool-type: The TPU node pool type, which can be single-host or multi-host.
  • goog-gke-node-pool-provisioning-model: The provisioning model of the node pool. Nodes can be provisioned on-demand, through a reservation, or as Spot VMs.

To learn more, see Automatically applied labels.

February 04, 2025

GKE cluster notifications have new capabilities.

For more details about the different types of cluster notifications GKE sends and how you can receive them, see Cluster notifications.

January 23, 2025

User-managed firewall rules for GKE LoadBalancer Services are now generally available on GKE clusters running version 1.31.3-gke.1056000 or later. With user-managed firewall rules, you can configure advanced firewall policies to control ingress traffic to GKE Services that are exposed with passthrough Network Load Balancers. To learn more, see User-managed firewall rules for GKE LoadBalancer Services.

You can now customize a node system configuration with the following new kubelet and sysctl configuration options (a configuration sketch follows the list):

  • Kubelet

    • containerLogMaxSize
    • containerLogMaxFiles
    • imageGcLowThresholdPercent
    • imageGcHighThresholdPercent
    • imageMinimumGcAge
    • imageMaximumGcAge (1.30.7-gke.1076000 and later, 1.31.3-gke.1023000 and later)
    • allowedUnsafeSysctls (1.32.0-gke.1448000 and later)
  • Sysctl

    • kernel.shmmni
    • kernel.shmmax
    • kernel.shmall
    • net.netfilter.nf_conntrack_acct (1.32.0-gke.1448000 and later)
    • net.netfilter.nf_conntrack_max (1.32.0-gke.1448000 and later)
    • net.netfilter.nf_conntrack_buckets (1.32.0-gke.1448000 and later)
    • net.netfilter.nf_conntrack_tcp_timeout_close_wait (1.32.0-gke.1448000 and later)
    • net.netfilter.nf_conntrack_tcp_timeout_established (1.32.0-gke.1448000 and later)
    • net.netfilter.nf_conntrack_tcp_timeout_time_wait (1.32.0-gke.1448000 and later)

To learn more, see Kubelet configuration options and Sysctl configuration options.
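
A hedged sketch of a node system configuration file that uses some of these options; the kubeletConfig and linuxConfig top-level keys follow the existing node system configuration format, values are illustrative, and the version requirements from the list above apply:

    kubeletConfig:
      containerLogMaxSize: 100Mi
      containerLogMaxFiles: 5
      imageGcLowThresholdPercent: 70
      imageGcHighThresholdPercent: 85
    linuxConfig:
      sysctl:
        kernel.shmmax: "68719476736"
        net.netfilter.nf_conntrack_max: "262144"    # 1.32.0-gke.1448000 and later

Pass the file when creating or updating a node pool, for example with gcloud's --system-config-from-file flag.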

January 21, 2025

You can now use A3 Ultra VMs powered by NVIDIA H200 Tensor Core GPUs with our new Titanium ML network adapter, which delivers non-blocking 3.2 Tbps of GPU-to-GPU traffic with RDMA over Converged Ethernet (RoCE).

A3 Ultra VMs are generally available in the a3-ultragpu-8g machine type and can be used through both GKE modes of operation: Autopilot and Standard.

December 16, 2024

Cloud DNS additive VPC scope is now generally available on GKE clusters running version 1.28.3-gke.1430000 or later. You can now configure your GKE clusters to add GKE headless Service entries to a Cloud DNS private zone that's visible from your VPC networks, in addition to using Cloud DNS (cluster scope) as your GKE DNS provider.

To learn more, read Cloud DNS scopes for GKE.

Trillium, our sixth-generation TPU, is now generally available. Support is available for GKE Standard clusters in version 1.31.1-gke.1846000 or later, and Autopilot clusters in version 1.31.2-gke.1384000 or later. You can use TPU Trillium in the us-east5-b, europe-west4-a, us-east1-d, asia-northeast1-b, and us-south1-a zones.

To learn more, see Benefits of using TPU Trillium.

December 13, 2024

GKE now provides insights and recommendations that help you identify and amend clusters running a minor version that has reached the end of standard support, clusters with nodes that violate the version skew policy, and clusters without a maintenance window. Acting on these recommendations helps you achieve reliable operations, an up-to-date security posture, and supportability.

The C4A machine family is generally available in the following versions:

  • Standard clusters in versions 1.28.13-gke.1024000, 1.29.8-gke.1057000, 1.30.4-gke.1213000, or later. To use this family in GKE Standard, use the --machine-type flag when creating a cluster or node pool.

  • Autopilot clusters in versions 1.28.15-gke.1344000, 1.29.11-gke.1012000, 1.30.7-gke.1136000, 1.31.3-gke.1056000, or later. To use this family in GKE Autopilot, schedule your workloads with the kubernetes.io/machine-family: c4a node selector (see the sketch after this entry). In version 1.31 and later, the kubernetes.io/arch: arm64 node selector defaults to the C4A machine family.

Cluster autoscaler and node auto-provisioning are supported in 1.28.15-gke.1344000, 1.29.11-gke.1012000, 1.30.7-gke.1136000, 1.31.3-gke.1056000 or later.

Local SSD support is available in public preview from version 1.31.1-gke.2008000. Contact your account team to participate in the preview.
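
A sketch of an Autopilot workload pinned to C4A with the node selector from the bullets above (Pod name and image are illustrative):

    apiVersion: v1
    kind: Pod
    metadata:
      name: arm-workload    # hypothetical name
    spec:
      nodeSelector:
        kubernetes.io/machine-family: c4a
        kubernetes.io/arch: arm64    # in 1.31+, arm64 alone defaults to C4A
      containers:
      - name: app
        image: nginx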

December 02, 2024

In GKE version 1.31.1-gke.2105000 or later, you can now configure custom compute classes to consume Compute Engine reservations. Workloads that use those custom compute classes automatically trigger reservation consumption during node creation. This lets you manage reservation consumption more centrally. To learn more, see About custom compute classes.
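
A hedged sketch of a compute class priority rule that consumes a specific reservation; the reservation-related field names and layout are assumptions to confirm in About custom compute classes:

    apiVersion: cloud.google.com/v1
    kind: ComputeClass
    metadata:
      name: reserved-first    # hypothetical name
    spec:
      priorities:
      - machineType: n2-standard-8
        reservations:               # assumed field layout
          affinity: Specific
          specific:
          - name: my-reservation    # hypothetical reservation
            project: my-project     # hypothetical project
      - machineType: n2-standard-8  # fallback to on-demand capacity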

November 27, 2024

Cloud TPU Trillium (v6e) machine types are now in public preview for Autopilot clusters running version 1.31.2-gke.1384000 or later. These TPUs are available in the following zones: us-east5-b, europe-west4-a, us-east1-d, asia-northeast1-b, and us-south1-a. To learn more, see Plan TPUs in GKE.

November 26, 2024

Cluster autoscaler and node auto-provisioning support the C4 machine family in GKE version 1.28.15-gke.1159000, 1.29.10-gke.1227000 or later.

November 20, 2024

You can now specify a custom resource policy as a compact placement policy with node auto-provisioning in clusters running GKE version 1.31.1-gke.2010000 or later. To learn more, see Use compact placement for node auto-provisioning.

November 19, 2024

GKE version 1.31 introduces increased scalability: you can now create clusters with up to 65,000 nodes. Clusters exceeding 5,000 nodes require a quota increase; contact Google Cloud support to request one.

November 18, 2024

Performance horizontal Pod autoscaling (HPA) profile is now available in Preview for new and existing GKE clusters running version 1.31.2-gke.1138000 or later. This feature speeds up HPA reaction time and enables quick recalculation of up to 1,000 HPA objects. To learn more, see Configuring Performance HPA profile.

November 11, 2024

DNS-based access to the GKE cluster control plane is now generally available. This capability gives each cluster a unique Domain Name System (DNS) name, or fully qualified domain name (FQDN). Access to clusters is controlled through IAM policies, eliminating the need for bastion hosts or proxy nodes. Authorized users can connect to the control plane from different cloud networks, on-premises deployments, or remote locations, without relying on proxies.

To learn more, see About network isolation in GKE.

November 07, 2024

GKE clusters running version 1.28 or later now support automatic application monitoring in public preview. Enabling this feature automatically deploys PodMonitoring configurations to capture key metrics for supported workloads like Apache Airflow, Istio, and RabbitMQ. These metrics are integrated with Cloud Monitoring dashboards for observability. To learn more, see Configure automatic application monitoring for workloads.

November 06, 2024

The GKE Volume Populator is generally available on GKE clusters running version 1.31.1-gke.1729000 or later. This feature automates transferring data from a Cloud Storage source bucket to a destination PersistentVolumeClaim backed by a Parallelstore instance. To learn more, see Transfer data from Cloud Storage during dynamic provisioning using GKE Volume Populator.

November 05, 2024

Generally available: In GKE version 1.26 and later, Hyperdisk Balanced volumes can be created in Confidential mode for custom boot disks and persistent volumes and attached to Confidential GKE Nodes.

Cloud TPU v6e machine types are now in public preview for GKE clusters running version 1.30.4-gke.1167000 or later. These TPU VMs (ct6e-standard) are available in the following zones: us-east5-b, europe-west4-a, us-east1-d, asia-northeast1-b, and us-south1-a. To learn more, see Plan TPUs in GKE.

October 31, 2024

For GKE clusters running version 1.31.1-gke.1146000 or later, Cloud Tensor Processing Unit (TPU) v3 machine types are generally available. These TPU VMs (ct3-hightpu-4t and ct3p-hightpu-4t) are currently available in us-east1-d, europe-west4-a, us-central1-a, us-central1-b, and us-central1-f. To learn more, see TPUs in GKE.

GKE control plane authority is now generally available with version 1.31.1-gke.1846000 or later. GKE control plane authority provides enhanced visibility, security controls, and customization of the GKE control plane. For more information, see About GKE control plane authority.

October 30, 2024

Weighted load balancing for GKE External LoadBalancer Services is now available in Preview. Weighted load balancing is a more efficient way to distribute traffic to nodes based on the number of serving Pods they have backing the Service. To learn more, see About LoadBalancer Services.
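
As an illustrative, unverified sketch, opting a Service into weighted load balancing is expected to look something like the following; the annotation key is an assumption, and per existing passthrough load balancer behavior the Service likely needs externalTrafficPolicy: Local for per-node Pod counts to matter:

    apiVersion: v1
    kind: Service
    metadata:
      name: weighted-lb    # hypothetical name
      annotations:
        networking.gke.io/weighted-load-balancing: pods-per-node    # assumed key
    spec:
      type: LoadBalancer
      externalTrafficPolicy: Local
      selector:
        app: web
      ports:
      - port: 80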

October 29, 2024

Three new metrics are added for measuring node and workload startup latency:

  • kubernetes.io/node/latencies/startup: The total startup latency of a node, from the Compute Engine instance's CreationTimestamp to the node first reporting Kubernetes Node Ready.

  • kubernetes.io/pod/latencies/pod_first_ready: The Pod end-to-end startup latency (from Pod Created to Ready), including image pulls. This metric is available for clusters with GKE version 1.31.1-gke.1678000 or later.

  • kubernetes.io/autoscaler/latencies/per_hpa_recommendation_scale_latency_seconds: Horizontal Pod Autoscaling (HPA) scaling recommendation latency (the time between metrics being created and the corresponding scaling recommendation being applied to the API server) for the HPA target. This metric is available for clusters running the following versions or later:

    • 1.30.4-gke.1348001
    • 1.31.0-gke.1324000

October 28, 2024

The A3 Edge (a3-edgegpu-8g) machine type with H100 80GB GPUs attached is now available on GKE Standard clusters. To learn more, see About GPUs.

October 17, 2024

You can now use NVIDIA H100 80GB GPUs on GKE in the following smaller machine types:

  • a3-highgpu-1g (1 GPU)
  • a3-highgpu-2g (2 GPUs)
  • a3-highgpu-4g (4 GPUs)

These machine types are available through Dynamic Workload Scheduler Flex Start mode, Spot VMs in GKE Standard mode clusters, or Spot Pods in GKE Autopilot mode clusters. You can only provision these machine types if there's available capacity in your region.

GKE continues to support the 8 GPU H100 80GB machine types: a3-highgpu-8g and a3-megagpu-8g.

The new release of the GKE Gateway controller (2024-R2) is now generally available, bringing new capabilities and Gateway API conformance updates.

To learn more about GKE Gateway controller capabilities, see the supported capabilities per GatewayClass.

In GKE clusters with the control plane running version 1.29.1-gke.1425000 or later, TPU slice nodes support SIGTERM signals that alert the node of an imminent shutdown. For TPU nodes, the imminent-shutdown notification can be configured for up to five minutes. To configure GKE to terminate your workloads gracefully within this notification window, see Manage GKE node disruption for GPUs and TPUs.
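
A minimal sketch of a workload arranged to use that notification window; this is standard Kubernetes termination handling, with nothing TPU-specific assumed:

    apiVersion: v1
    kind: Pod
    metadata:
      name: tpu-training    # hypothetical name
    spec:
      terminationGracePeriodSeconds: 300    # match the five-minute notice window
      containers:
      - name: trainer
        image: my-training-image    # hypothetical image; the main process
                                    # should trap SIGTERM and checkpoint state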

October 15, 2024

You can now create workloads with multiple network interfaces in GKE Autopilot clusters running version 1.29.5-gke.1091000 and later or version 1.30.1-gke.1280000 and later. For more information, see Set up multi-network support for Pods.

October 04, 2024

The following beta APIs were added in Kubernetes 1.31 and are available in GKE version 1.31.1-gke.1361000 and later:

  • networking.k8s.io/v1beta1/ipaddresses
  • networking.k8s.io/v1beta1/servicecidrs

Enabling both APIs at the same time enables the Multiple Service CIDRs Kubernetes feature in a GKE cluster.

During the beta phase, you can only create Service CIDRs in the 34.118.224.0/20 reserved IP address range to avoid possible issues with overlapping IP address ranges.
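
For illustration, a secondary ServiceCIDR object using the upstream Kubernetes beta API, with a range inside the reserved block noted above:

    apiVersion: networking.k8s.io/v1beta1
    kind: ServiceCIDR
    metadata:
      name: extra-service-range    # hypothetical name
    spec:
      cidrs:
      - 34.118.226.0/24    # must fall inside 34.118.224.0/20 during the beta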

Ray Operator on GKE is now generally available in GKE version 1.29 and later. Ray Operator is a GKE add-on that lets you manage and scale Ray applications. To learn more, see the Ray Operator documentation.

October 01, 2024

GKE now supports the Parallelstore CSI driver in allowlisted general availability (GA), which means that you can reach out to your Google support team to use the service under GA terms.

Parallelstore accelerates AI/ML training and excels at saturating individual compute clients, ensuring that expensive compute resources are used efficiently. The product demonstrated a 3.9x improvement in training time and 3.7x better throughput compared to native ML framework data loaders, and saturates a single client's NIC bandwidth at more than 90%.

For details, see About the GKE Parallelstore CSI driver.

In GKE version 1.30.3-gke.1639000 and later and 1.31.0-gke.1058000 and later, GKE can handle GPU and TPU node disruptions by notifying you in advance of a shutdown and by gracefully terminating your workloads. This feature is generally available. For details, see Manage GKE node disruption for GPUs and TPUs.

September 11, 2024

For GPU node pools created in GKE Standard clusters running version 1.30.1-gke.115600 or later, GKE automatically installs the default NVIDIA GPU driver version that corresponds to the GKE version if you don't specify the gpu-driver-version flag.

August 27, 2024

Starting from version 1.30.3-gke.1451000, new and upgraded GKE clusters include an updated GKE Metrics Server in which the addon-resizer runs in the cluster's control plane instead of on worker nodes.

August 21, 2024

GKE support for Hyperdisk ML as an attached persistent disk option is now generally available. Support is available for both Autopilot and Standard clusters running GKE versions 1.30.2-gke.1394000 and later.
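
A hedged sketch of a StorageClass for Hyperdisk ML through the Persistent Disk CSI driver; the name and type string are assumptions to verify in the Hyperdisk ML documentation:

    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: hyperdisk-ml    # hypothetical name
    provisioner: pd.csi.storage.gke.io
    parameters:
      type: hyperdisk-ml    # assumed disk type string
    volumeBindingMode: WaitForFirstConsumer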

August 20, 2024

The C4 machine family is generally available in the following versions:

  • Standard clusters in version 1.29.2-gke.1521000 and later. To use this family in GKE Standard, you can use the --machine-type flag when creating a cluster or node pool.
  • Autopilot clusters in 1.30.3-gke.1225000 and later. To use this family in GKE Autopilot, you can use the Performance compute class when scheduling your workloads.
  • Cluster autoscaler and node auto-provisioning are supported in 1.30.3-gke.1225000 and later.

August 13, 2024

Custom compute classes are a new set of capabilities in GKE that provide an API for fine-grained control over fallback compute priorities, autoscaling configuration, obtainability and node consolidation. Custom compute classes offer enhanced flexibility and control over your GKE compute infrastructure so that you can ensure optimal resource allocation for your workloads. You can use custom compute classes in GKE version 1.30.3-gke.1451000 and later. To learn more, see About custom compute classes.
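
A minimal sketch of the fallback-priority idea; the API group matches other compute class entries on this page, and all field details are assumptions to verify in About custom compute classes:

    apiVersion: cloud.google.com/v1
    kind: ComputeClass
    metadata:
      name: cost-aware    # hypothetical name
    spec:
      priorities:          # tried in order during scale-up
      - machineFamily: n2
        spot: true         # prefer Spot capacity first
      - machineFamily: n2  # fall back to on-demand if Spot is unobtainable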

August 02, 2024

The NVIDIA GPU Operator can now be used as an alternative to the fully managed GKE GPU stack on both Container-Optimized OS and Ubuntu node images. Choose this option to manage your GPU stack if you want a consistent multi-cloud experience, already use the NVIDIA GPU Operator, or have software that relies on it.

August 01, 2024

You can now enable NCCL Fast Socket on your multi-GPU Autopilot workloads. NCCL Fast Socket is a transport layer plugin designed to improve NVIDIA Collective Communication Library (NCCL) performance on Google Cloud. To enable NCCL Fast Socket on GKE Autopilot, you must use a GKE Autopilot cluster with control plane version 1.30.2-gke.1023000 or later. For more information, see Improve workload efficiency using NCCL Fast Socket.

July 31, 2024

You can now keep a GKE Standard cluster on a minor version for longer with the Extended release channel. Clusters running 1.27 or later can be enrolled in the Extended channel, and automatically receive security patches during the extended support period after the end of standard support. To learn more, see Get long-term support with the Extended channel.

July 16, 2024

Compute flexible committed use discounts (CUDs), previously known as Compute Engine Flexible CUDs, have been expanded to include several GKE Autopilot and Cloud Run SKUs (see the GKE CUD documentation for details). The legacy GKE Autopilot CUD will be removed from sale on October 15, 2024. GKE Autopilot CUDs purchased before this date will continue to apply through their term.

July 08, 2024

Ray Operator on GKE is now generally available in the Rapid channel. Ray Operator is a GKE add-on that allows you to manage and scale Ray applications. To learn more, see the Ray Operator documentation.

July 03, 2024

You can now preload data or container images in new nodes on GKE, enabling faster workload deployment and autoscaling. This feature is generally available and production-ready, with support for Autopilot and Terraform. To learn more, see Use secondary boot disks to preload data or container images.

GKE Managed DCGM Metrics Package is now available in Preview for both GKE Standard and Autopilot clusters running version 1.30.1-gke.1204000 and later.

You can now configure Autopilot and Standard clusters to export a predefined list of DCGM metrics emitted by the GKE-managed DCGM exporter, including metrics for GPU performance, utilization, and I/O, for GPU node pools that use GKE-managed NVIDIA drivers. These metrics are collected by Google Cloud Managed Service for Prometheus. You can view the curated DCGM metrics in the Observability tab on the Kubernetes Clusters page or in Cloud Monitoring.

For more information, see Collect and view DCGM metrics.