External Application Load Balancer overview

This document introduces the concepts that you need to understand how to configure an external Application Load Balancer.

An external Application Load Balancer is a proxy-based Layer 7 load balancer that enables you to run and scale your services behind a single external IP address. The external Application Load Balancer distributes HTTP and HTTPS traffic to backends hosted on a variety of Trusted Cloud platforms (such as Compute Engine, Google Kubernetes Engine (GKE), Cloud Storage, and so on), as well as external backends connected over the internet or via hybrid connectivity. For details, see Application Load Balancer overview: Use cases.

Modes of operation

This load balancer is available to you in the regional mode, and is hereafter referred to as a regional external Application Load Balancer. The load balancer is implemented as a managed service based on the open-source Envoy proxy. It includes advanced traffic management capabilities such as traffic mirroring, weight-based traffic splitting, request- or response-based header transformations, and more. The regional mode ensures that all clients and backends are in a specified region. Use this load balancer if you want to serve content from only one geolocation (for example, to meet compliance regulations).

Architecture

The following resources are required for an external Application Load Balancer deployment:

  • For regional external Application Load Balancers only, a proxy-only subnet is used to send connections from the load balancer to the backends.

  • An external forwarding rule specifies an external IP address, port, and target HTTP(S) proxy. Clients use the IP address and port to connect to the load balancer.

  • A target HTTP(S) proxy receives a request from the client. The HTTP(S) proxy evaluates the request by using the URL map to make traffic routing decisions. The proxy can also authenticate communications by using SSL certificates.

    • For HTTPS load balancing, the target HTTPS proxy uses SSL certificates to prove its identity to clients. A target HTTPS proxy supports up to the documented number of SSL certificates.
  • The HTTP(S) proxy uses a URL map to make a routing determination based on HTTP attributes (such as the request path, cookies, or headers). Based on the routing decision, the proxy forwards client requests to specific backend services or backend buckets. The URL map can specify additional actions, such as sending redirects to clients.

  • A backend service distributes requests to healthy backends.

  • A health check periodically monitors the readiness of your backends. This reduces the risk that requests might be sent to backends that can't service the request.

  • Firewall rules for your backends to accept health check probes. Regional external Application Load Balancers require an additional firewall rule to allow traffic from the proxy-only subnet to reach the backends.

Regional

This diagram shows the components of a regional external Application Load Balancer deployment.

Regional external Application Load Balancer components.

Proxy-only subnet

The proxy-only subnet provides a set of IP addresses that Google uses to run Envoy proxies on your behalf. You must create one proxy-only subnet in each region of a VPC network where you use regional external Application Load Balancers. The --purpose flag for this proxy-only subnet is set to REGIONAL_MANAGED_PROXY. All regional Envoy-based load balancers in the same region and VPC network share a pool of Envoy proxies from the same proxy-only subnet. Further:

  • Proxy-only subnets are only used for Envoy proxies, not your backends.
  • Backend VMs or endpoints of all regional external Application Load Balancers in a region and VPC network receive connections from the proxy-only subnet.
  • The IP address of the regional external Application Load Balancer is not located in the proxy-only subnet. The load balancer's IP address is defined by its external managed forwarding rule, which is described below.

If you previously created a proxy-only subnet with --purpose=INTERNAL_HTTPS_LOAD_BALANCER, you need to migrate the subnet's purpose to REGIONAL_MANAGED_PROXY before you can create other Envoy-based load balancers in the same region of the VPC network.

Forwarding rules and IP addresses

Forwarding rules route traffic by IP address, port, and protocol to a load balancing configuration consisting of a target proxy, URL map, and one or more backend services.

IP address specification. Each forwarding rule provides a single IP address that can be used in DNS records for your application. No DNS-based load balancing is required. You can either specify the IP address to be used or let Cloud Load Balancing assign one for you.

Port specification. Each forwarding rule for an Application Load Balancer can reference a single port from 1-65535. To support multiple ports, you must configure multiple forwarding rules. You can configure multiple forwarding rules to use the same external IP address (VIP) and to reference the same target HTTP(S) proxy as long as the overall combination of IP address, port, and protocol is unique for each forwarding rule. This way, you can use a single load balancer with a shared URL map as a proxy for multiple applications.
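
The uniqueness constraint described above can be sketched as a small check. This is an illustrative Python sketch only: the `ForwardingRule` type and the `find_conflicts` helper are hypothetical names, not part of any Cloud Load Balancing API, and the IP addresses are documentation examples.

```python
from collections import Counter
from typing import NamedTuple


class ForwardingRule(NamedTuple):
    """Hypothetical, simplified view of a forwarding rule."""
    name: str
    ip_address: str
    port: int
    protocol: str


def find_conflicts(rules: list[ForwardingRule]) -> list[tuple[str, int, str]]:
    """Return each (IP address, port, protocol) combination used by more than one rule."""
    for rule in rules:
        # Each rule references a single port in the range 1-65535.
        if not 1 <= rule.port <= 65535:
            raise ValueError(f"{rule.name}: port {rule.port} is outside 1-65535")
    counts = Counter((r.ip_address, r.port, r.protocol) for r in rules)
    return [key for key, count in counts.items() if count > 1]


# Two rules share one external IP address but use different ports.
rules = [
    ForwardingRule("web-http", "203.0.113.10", 80, "TCP"),
    ForwardingRule("web-https", "203.0.113.10", 443, "TCP"),
]
print(find_conflicts(rules))
```

In this example the two rules share an IP address and could reference the same target proxy, because their ports differ.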

The type of forwarding rule, IP address, and load balancing scheme used by external Application Load Balancers depends on the mode of the load balancer and which Network Service Tier the load balancer is in.

Load balancer mode: Regional external Application Load Balancer
Network Service Tier: Premium Tier
Forwarding rule, IP address, and load balancing scheme:
  • Regional external forwarding rule
  • Regional external IP address
  • Load balancing scheme: EXTERNAL_MANAGED
Routing from the internet to the load balancer frontend: Requests reach Trusted Cloud at the PoP closest to the client. Requests are then routed over Trusted Cloud's premium backbone until they reach Envoy proxies in the same region as the load balancer.

For the complete list of protocols supported by external Application Load Balancer forwarding rules in each mode, see Load balancer features.

Forwarding rules and VPC networks

This section describes how forwarding rules used by external Application Load Balancers are associated with VPC networks.

Load balancer mode: Regional external Application Load Balancer
VPC network association: The forwarding rule's VPC network is the network where the proxy-only subnet has been created. You specify the network when you create the forwarding rule.

Depending on whether you use an IPv4 address or an IPv6 address range, there is always an explicit or implicit VPC network associated with the forwarding rule.

  • Regional external IPv4 addresses always exist outside of VPC networks. However, when you create the forwarding rule, you're required to specify the VPC network where the proxy-only subnet has been created. Therefore, the forwarding rule has an explicit network association.
  • Regional external IPv6 address ranges always exist inside a VPC network. When you create the forwarding rule, you're required to specify the subnet from which the IP address range is taken. This subnet must be in the same region and VPC network where a proxy-only subnet has been created. Thus, there is an implied network association.

Target proxies

Target proxies terminate HTTP(S) connections from clients. One or more forwarding rules direct traffic to the target proxy, and the target proxy consults the URL map to determine how to route traffic to backends.

Do not rely on the proxy to preserve the case of request or response header names. For example, a Server: Apache/1.0 response header might appear at the client as server: Apache/1.0.

The following table specifies the type of target proxy required by external Application Load Balancers.

Load balancer mode: Regional external Application Load Balancer
Target proxy types: Regional HTTP, Regional HTTPS
Proxy-added headers:
  • X-Forwarded-Proto: [http | https] (requests only)
  • Via: 1.1 google (requests and responses)
  • X-Forwarded-For: [<supplied-value>,]<client-ip>,<load-balancer-ip> (see X-Forwarded-For header) (requests only)
Custom headers supported: Configured in the URL map

In addition to headers added by the target proxy, the load balancer adjusts other HTTP headers in the following ways:

  • Some headers are coalesced. When there are multiple instances of the same header key (for example, Via), the load balancer combines their values into a single comma-separated list for a single header key. Only the headers whose values can be represented as a comma-separated list are coalesced. Other headers, such as Set-Cookie, are never coalesced.
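
The coalescing behavior described above can be illustrated with a short Python sketch. It is not the load balancer's implementation: the `coalesce_headers` helper, the lowercase normalization of header names, and the `", "` separator are assumptions made for illustration.

```python
# Headers that this sketch never merges; the text names Set-Cookie as one example.
NEVER_COALESCED = {"set-cookie"}


def coalesce_headers(headers):
    """Combine repeated header keys into a single comma-separated value.

    `headers` is a list of (name, value) pairs in arrival order.
    Names are lowercased (an assumption), since header case isn't preserved.
    """
    merged = {}   # lowercased name -> list of values, in arrival order
    output = []   # (name, value) for uncoalesced headers, (name, None) as a placeholder
    for name, value in headers:
        key = name.lower()
        if key in NEVER_COALESCED:
            output.append((key, value))
        elif key in merged:
            merged[key].append(value)
        else:
            merged[key] = [value]
            output.append((key, None))
    # Fill each placeholder with the joined values for that header key.
    return [(k, v if v is not None else ", ".join(merged[k])) for k, v in output]


print(coalesce_headers([
    ("Via", "1.1 upstream"),
    ("Set-Cookie", "a=1"),
    ("Via", "1.1 google"),
    ("Set-Cookie", "b=2"),
]))
```

In this example the two Via values end up under one key, while each Set-Cookie header stays separate.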

Host header

When the load balancer makes the HTTP request, the load balancer preserves the Host header of the original request.

X-Forwarded-For header

The load balancer appends two IP addresses to the X-Forwarded-For header, separated by a single comma, in the following order:

  1. The IP address of the client that connects to the load balancer
  2. The IP address of the load balancer's forwarding rule

If the incoming request does not include an X-Forwarded-For header, the resulting header is as follows:

X-Forwarded-For: <client-ip>,<load-balancer-ip>

If the incoming request already includes an X-Forwarded-For header, the load balancer appends its values to the existing header:

X-Forwarded-For: <existing-value>,<client-ip>,<load-balancer-ip>
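
The two cases above can be sketched in Python as follows. The `build_x_forwarded_for` function is a hypothetical helper for illustration, and the IP addresses are documentation examples.

```python
from typing import Optional


def build_x_forwarded_for(client_ip: str, load_balancer_ip: str,
                          existing_value: Optional[str] = None) -> str:
    """Return the X-Forwarded-For value after the load balancer appends its two addresses."""
    appended = f"{client_ip},{load_balancer_ip}"
    if existing_value:
        # An incoming header value is kept, and the two addresses are appended to it.
        return f"{existing_value},{appended}"
    return appended


print(build_x_forwarded_for("198.51.100.7", "203.0.113.10"))
print(build_x_forwarded_for("198.51.100.7", "203.0.113.10", existing_value="192.0.2.1"))
```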

Remove existing header values using a custom request header

It is possible to remove existing header values by using custom request headers on the backend service. The following example uses the --custom-request-header flag to recreate the X-Forwarded-For header by using the variables client_ip_address and server_ip_address. This configuration replaces the incoming X-Forwarded-For header with only the client and the load balancer IP address.

--custom-request-header=x-forwarded-for:{client_ip_address},{server_ip_address}
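
To see what the flag value above produces, the following Python sketch expands the template locally. It only simulates the substitution for illustration. The `expand_custom_header` helper is hypothetical, the IP addresses are documentation examples, and the load balancer's own variable expansion may differ in detail.

```python
def expand_custom_header(template: str, variables: dict) -> tuple:
    """Split 'name:value-template' and substitute {variable} placeholders in the value."""
    name, _, value_template = template.partition(":")
    return name, value_template.format(**variables)


name, value = expand_custom_header(
    "x-forwarded-for:{client_ip_address},{server_ip_address}",
    {"client_ip_address": "198.51.100.7", "server_ip_address": "203.0.113.10"},
)
print(f"{name}: {value}")
```

Because the header is rebuilt from the two variables only, any value supplied by the client is dropped.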

How backend reverse proxy software might modify the X-Forwarded-For header

If your load balancer's backends run HTTP reverse proxy software, the software might append one or both of the following IP addresses to the end of the X-Forwarded-For header:

  1. The IP address of the GFE that connected to the backend. GFE IP addresses are in the 130.211.0.0/22 and 35.191.0.0/16 ranges.

  2. The IP address of the backend system itself.

As a result, an upstream system might see an X-Forwarded-For header structured as follows:

<existing-value>,<client-ip>,<load-balancer-ip>,<GFE-ip>,<backend-ip>
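
Because entries can be appended after the client IP address, code that reads this header needs to know how many trailing entries to skip. The following Python sketch shows one way to do that. The `client_ip_from_xff` helper and its `trailing_hops` parameter are hypothetical, and the right count depends on your own proxy chain.

```python
def client_ip_from_xff(header_value: str, trailing_hops: int = 1) -> str:
    """Return the client IP: the entry just before the trailing hop entries.

    trailing_hops counts the entries appended after the client IP:
    1 for the load balancer alone, more if backend proxy software added entries.
    """
    entries = [e.strip() for e in header_value.split(",") if e.strip()]
    if trailing_hops < 1 or len(entries) <= trailing_hops:
        raise ValueError("header has too few entries for the given trailing_hops")
    return entries[-(trailing_hops + 1)]


# Load balancer entry only.
print(client_ip_from_xff("192.0.2.1,198.51.100.7,203.0.113.10"))
# Load balancer entry plus two entries appended by backend proxy software.
print(client_ip_from_xff(
    "192.0.2.1,198.51.100.7,203.0.113.10,35.191.0.5,10.0.0.4", trailing_hops=3))
```

Counting from the end of the header avoids trusting the leading entries, which the client can supply.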

Cloud Trace support

Trace is not supported with Application Load Balancers. The global and classic Application Load Balancers add the X-Cloud-Trace-Context header if it is not present. The regional external Application Load Balancer does not add this header. If the X-Cloud-Trace-Context header is already present, it passes through the load balancers unmodified. However, no traces or spans are exported by the load balancer.

URL maps

URL maps define matching patterns for URL-based routing of requests to the appropriate backend services. The URL map allows you to divide your traffic by examining the URL components to send requests to different sets of backends. A default service is defined to handle any requests that do not match a specified host rule or path matching rule.
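
As a rough mental model, the routing decision can be sketched in Python. This is a deliberately simplified illustration, not the URL map resource format or its matching algorithm. The dictionary layout, the `route` function, the service names, and the longest-string-prefix rule are all assumptions.

```python
def route(url_map: dict, host: str, path: str) -> str:
    """Pick a backend service name for a request, falling back to the default service."""
    for rule in url_map.get("host_rules", []):
        if host in rule["hosts"]:
            # Among this host's path rules, prefer the longest matching prefix.
            matches = [
                (prefix, service)
                for prefix, service in rule["path_rules"].items()
                if path.startswith(prefix)
            ]
            if matches:
                return max(matches, key=lambda m: len(m[0]))[1]
    # No host rule or path rule matched.
    return url_map["default_service"]


url_map = {
    "default_service": "web-service",
    "host_rules": [
        {
            "hosts": ["example.com"],
            "path_rules": {"/video": "video-service", "/video/hd": "video-hd-service"},
        }
    ],
}
print(route(url_map, "example.com", "/video/hd/clip"))
print(route(url_map, "example.com", "/other"))
```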

URL maps support several advanced traffic management features such as header-based traffic steering, weight-based traffic splitting, and request mirroring.

The following table specifies the type of URL map required by external Application Load Balancers in each mode.

Load balancer mode: Regional external Application Load Balancer
URL map type: Regional

SSL certificates

External Application Load Balancers using target HTTPS proxies require private keys and SSL certificates as part of the load balancer configuration.

Regional external Application Load Balancers support self-managed Compute Engine SSL certificates.

SSL policies

SSL policies specify the set of SSL features that Trusted Cloud load balancers use when negotiating SSL with clients.

By default, HTTPS Load Balancing uses a set of SSL features that provides good security and wide compatibility. Some applications require more control over which SSL versions and ciphers are used for their HTTPS or SSL connections. You can define an SSL policy to specify the set of SSL features that your load balancer uses when negotiating SSL with clients. In addition, you can apply that SSL policy to your target HTTPS proxy.

The following table specifies the SSL policy support for load balancers in each mode.

Load balancer mode SSL policies supported
Regional external Application Load Balancer

Backend services

A backend service provides configuration information to the load balancer so that it can direct requests to its backends—for example, Compute Engine instance groups or network endpoint groups (NEGs). For more information about backend services, see Backend services overview.

Backend service scope

The following table indicates which backend service resource and scope is used by external Application Load Balancers:

Load balancer mode: Regional external Application Load Balancer
Backend service resource: regionBackendServices (regional)

Protocol to the backends

Backend services for Application Load Balancers must use one of the following protocols to send requests to backends:

  • HTTP, which uses HTTP/1.1 and no TLS
  • HTTPS, which uses HTTP/1.1 and TLS
  • HTTP/2, which uses HTTP/2 and TLS (HTTP/2 without encryption isn't supported.)
  • H2C, which uses HTTP/2 over TCP. TLS isn't required. H2C isn't supported for classic Application Load Balancers.

The load balancer only uses the backend service protocol that you specify to communicate with its backends. The load balancer doesn't fall back to a different protocol if it is unable to communicate with backends using the specified backend service protocol.

The backend service protocol doesn't need to match the protocol used by clients to communicate with the load balancer. For example, clients can send requests to the load balancer using HTTP/2, but the load balancer can communicate with backends using HTTP/1.1 (HTTP or HTTPS).
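
The protocol list above can be summarized as a lookup table. The following Python sketch is illustrative only: the `BACKEND_PROTOCOLS` table and `backend_connection` helper are hypothetical names. It mirrors the no-fallback behavior by raising an error instead of choosing another protocol.

```python
# Backend service protocol -> HTTP version and whether TLS is used, per the list above.
BACKEND_PROTOCOLS = {
    "HTTP": {"http_version": "HTTP/1.1", "tls": False},
    "HTTPS": {"http_version": "HTTP/1.1", "tls": True},
    "HTTP/2": {"http_version": "HTTP/2", "tls": True},
    "H2C": {"http_version": "HTTP/2", "tls": False},
}


def backend_connection(protocol: str) -> dict:
    """Return the connection settings for a backend service protocol, with no fallback."""
    try:
        return BACKEND_PROTOCOLS[protocol]
    except KeyError:
        raise ValueError(f"unsupported backend service protocol: {protocol}") from None


# A client might connect over HTTP/2 while the backend service uses HTTPS (HTTP/1.1 with TLS).
print(backend_connection("HTTPS"))
```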

Backends

A regional external Application Load Balancer supports the following types of backends:

  • Instance groups
  • Zonal NEGs
  • Internet NEGs

Backends and VPC networks

For regional external Application Load Balancer backends, the following applies:

  • For instance groups, zonal NEGs, and hybrid connectivity NEGs, all backends must be located in the same project and region as the backend service. However, a load balancer can reference a backend that uses a different VPC network in the same project as the backend service. Connectivity between the load balancer's VPC network and the backend VPC network can be configured using either VPC Network Peering, Cloud VPN tunnels, or Cloud Interconnect VLAN attachments.

    Backend network definition

    • For zonal NEGs and hybrid NEGs, you explicitly specify the VPC network when you create the NEG.
    • For managed instance groups, the VPC network is defined in the instance template.
    • For unmanaged instance groups, the instance group's VPC network is set to match the VPC network of the nic0 interface for the first VM added to the instance group.

    Backend network requirements

    Your backend's network must satisfy one of the following network requirements:

    • The backend's VPC network must exactly match the forwarding rule's VPC network.

    • The backend's VPC network must be connected to the forwarding rule's VPC network using VPC Network Peering. You must configure subnet route exchanges to allow communication between the proxy-only subnet in the forwarding rule's VPC network and the subnets used by the backend instances or endpoints.

  • For all other backend types, all backends must be located in the same VPC network and region.

    Regional external Application Load Balancers also support Shared VPC environments where you can share VPC networks and their associated resources across projects. If you want the regional external Application Load Balancer's backend service and backends to be in a different project from the forwarding rule, you need to configure the load balancer in a Shared VPC environment with cross-project service referencing.

Backends and network interfaces

If you use instance group backends, packets are always delivered to nic0. If you want to send packets to non-nic0 interfaces (either vNICs or Dynamic Network Interfaces), use NEG backends instead.

If you use zonal NEG backends, packets are sent to whatever network interface is represented by the endpoint in the NEG. The NEG endpoints must be in the same VPC network as the NEG's explicitly defined VPC network.

Health checks

Each backend service specifies a health check that periodically monitors the backends' readiness to receive a connection from the load balancer. This reduces the risk that requests might be sent to backends that can't service the request. Health checks do not check if the application itself is working.

For the health check probes, you must create an ingress allow firewall rule that allows health check probes to reach your backend instances. Typically, health check probes originate from Google's centralized health checking mechanism.

Regional external Application Load Balancers that use hybrid NEG backends are an exception to this rule because their health checks originate from the proxy-only subnet instead. For details, see the Hybrid NEGs overview.

Health check protocol

Although it is not required and not always possible, it is a best practice to use a health check whose protocol matches the protocol of the backend service. For example, an HTTP/2 health check most accurately tests HTTP/2 connectivity to backends. In contrast, regional external Application Load Balancers that use hybrid NEG backends do not support gRPC health checks. For the list of supported health check protocols, see Load balancing features.

The following table specifies the scope of health checks supported by external Application Load Balancers in each mode.

Load balancer mode: Regional external Application Load Balancer
Health check type: Regional

Firewall rules

The load balancer requires the following firewall rules:

  • For the regional external Application Load Balancer, an ingress allow rule to permit traffic from the proxy-only subnet to reach your backends.
  • An ingress allow rule to permit traffic from the health check probes ranges. For more information about health check probes and why it's necessary to allow traffic from them, see Probe IP ranges and firewall rules.

Firewall rules are implemented at the VM instance level, not on GFE proxies. You cannot use Trusted Cloud firewall rules to prevent traffic from reaching the load balancer.

The ports for these firewall rules must be configured as follows:

  • Allow traffic to the destination port for each backend service's health check.

  • For instance group backends: Determine the ports to be configured by the mapping between the backend service's named port and the port numbers associated with that named port on each instance group. The port numbers can vary among instance groups assigned to the same backend service.

  • For GCE_VM_IP_PORT NEG backends: Allow traffic to the port numbers of the endpoints.
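
The port rules above can be combined into one set of ports to allow. The following Python sketch shows one way to compute it. The `ports_to_allow` helper, the instance group names, and the port numbers are hypothetical examples.

```python
def ports_to_allow(health_check_port, named_port, instance_groups, neg_endpoint_ports):
    """Collect the destination ports that the backend firewall rules need to allow.

    instance_groups maps an instance group name to its named-port mapping,
    for example {"ig-a": {"http": [80]}}.
    """
    ports = {health_check_port}
    for named_ports in instance_groups.values():
        # Port numbers for the same named port can differ between instance groups.
        ports.update(named_ports.get(named_port, []))
    # For GCE_VM_IP_PORT NEG backends, allow the endpoints' own port numbers.
    ports.update(neg_endpoint_ports)
    return sorted(ports)


print(ports_to_allow(
    health_check_port=8080,
    named_port="http",
    instance_groups={
        "ig-a": {"http": [80]},
        "ig-b": {"http": [8000, 8001], "admin": [9000]},
    },
    neg_endpoint_ports=[8443],
))
```

Ports that belong to a different named port (`admin` in this example) are left out.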

GKE support

GKE uses external Application Load Balancers in the following ways:

  • External Gateways created using the GKE Gateway controller can use any mode of an external Application Load Balancer. You control the load balancer's mode by choosing a GatewayClass. The GKE Gateway controller always uses GCE_VM_IP_PORT zonal NEG backends.

Shared VPC architecture

External Application Load Balancers support networks that use Shared VPC. Shared VPC lets organizations connect resources from multiple projects to a common VPC network so that they can communicate with each other securely and efficiently by using internal IP addresses from that network. If you're not already familiar with Shared VPC, read the Shared VPC overview.

There are many ways to configure an external Application Load Balancer within a Shared VPC network. Regardless of the type of deployment, all the components of the load balancer must be in the same organization.

Load balancer: Regional external Application Load Balancer

Frontend components:

Create the required network and proxy-only subnet in the Shared VPC host project.

The regional external IP address, the forwarding rule, the target HTTP(S) proxy, and the associated URL map must be defined in the same project. This project can be the host project or a service project.

Backend components:

You can do one of the following:
  • Create backend services and backends (instance groups, serverless NEGs, or any other supported backend types) in the same service project as the frontend components.
  • Create backend services and backends (instance groups, serverless NEGs, or any other supported backend types) in as many service projects as required. A single URL map can reference backend services across different projects. This type of deployment is known as cross-project service referencing.

Each backend service must be defined in the same project as the backends it references. Health checks associated with backend services must be defined in the same project as the backend service as well.

While you can create all the load balancing components and backends in the Shared VPC host project, this type of deployment does not separate network administration and service development responsibilities.

All load balancer components and backends in a service project

The following architecture diagram shows a standard Shared VPC deployment where all load balancer components and backends are in a service project. This deployment type is supported by all Application Load Balancers.

The load balancer components and backends must use the same VPC network.

Regional external Application Load Balancer on Shared VPC network.

Cross-project service referencing

Cross-project service referencing is a deployment model where the load balancer's frontend and URL map are in one project and the load balancer's backend service and backends are in a different project.

Cross-project service referencing lets organizations configure one central load balancer and route traffic to hundreds of services distributed across multiple projects. You can centrally manage all traffic routing rules and policies in one URL map. You can also associate the load balancer with a single set of hostnames and SSL certificates. As a result, you need fewer load balancers to deploy your application, which reduces management overhead, operational costs, and quota requirements.

By having different projects for each of your functional teams, you can also achieve separation of roles within your organization. Service owners can focus on building services in service projects, while network teams can provision and maintain load balancers in another project, and both can be connected by using cross-project service referencing.

Service owners can maintain autonomy over the exposure of their services and control which users can access their services by using the load balancer. This is achieved by a special IAM role called the Compute Load Balancer Services User role (roles/compute.loadBalancerServiceUser).

Cross-project service referencing support differs based on the type of load balancer:

  • For global external Application Load Balancers: your load balancer's frontend and URL map can reference backend services or backend buckets from any project within the same organization. No VPC network restrictions apply. While you can use a Shared VPC environment to configure a cross-project deployment as shown in this example, this isn't a requirement.

  • For regional external Application Load Balancers: you must create the load balancer in a Shared VPC environment. The load balancer's frontend and URL map must be in a host or service project, and the load balancer's backend services and backends can be distributed across host or service projects in the same Shared VPC environment.

To learn how to configure Shared VPC for a global external Application Load Balancer—with and without cross-project service referencing—see Set up a global external Application Load Balancer with Shared VPC.

To learn how to configure Shared VPC for a regional external Application Load Balancer—with and without cross-project service referencing—see Set up a regional external Application Load Balancer with Shared VPC.

Usage notes for cross-project service referencing

  • Trusted Cloud doesn't differentiate between resources (for example, backend services) using the same name across multiple projects. Therefore, when you are using cross-project service referencing, we recommend that you use unique backend service names across projects within your organization.
  • If you see an error such as "Cross-project references for this resource are not allowed", make sure that you have the permission to use the resource. An administrator of the project that owns the resource must grant you the Compute Load Balancer Services User role (roles/compute.loadBalancerServiceUser). This role can be granted at the project level or at the resource level. For an example, see Grant permissions to the Compute Load Balancer Admin to use the backend service or backend bucket.

Example 1: Load balancer frontend and backend in different service projects

Here is an example of a Shared VPC deployment where the load balancer's frontend and URL map are created in service project A and the URL map references a backend service in service project B.

In this case, Network Admins or Load Balancer Admins in service project A require access to backend services in service project B. Service project B admins grant the Compute Load Balancer Services User role (roles/compute.loadBalancerServiceUser) to Load Balancer Admins in service project A who want to reference the backend service in service project B.

Load balancer frontend and backend in different service projects.

Example 2: Load balancer frontend in the host project and backends in service projects

Here is an example of a Shared VPC deployment where the load balancer's frontend and URL map are created in the host project and the backend services (and backends) are created in service projects.

In this case, Network Admins or Load Balancer Admins in the host project require access to backend services in the service project. Service project admins grant the Compute Load Balancer Services User role (roles/compute.loadBalancerServiceUser) to Load Balancer Admins in the host project who want to reference the backend service in the service project.

Load balancer frontend and URL map in host project.

Example 3: Load balancer frontend and backends in different projects

Here is an example of a deployment where the global external Application Load Balancer's frontend and URL map are created in a different project from the load balancer's backend service and backends. This type of deployment doesn't use Shared VPC and is supported only for global external Application Load Balancers.

Load balancer frontend and backends in different projects.

To learn more about this setup, see Set up cross-project service referencing with a backend service and a backend bucket.

How connections work

Regional external Application Load Balancer connections

The regional external Application Load Balancer is a managed service implemented on the Envoy proxy. The regional external Application Load Balancer uses a shared subnet called a proxy-only subnet to provision a set of IP addresses that Google uses to run Envoy proxies on your behalf. The --purpose flag for this proxy-only subnet is set to REGIONAL_MANAGED_PROXY. All regional Envoy-based load balancers in a particular network and region share this subnet.

Clients use the load balancer's IP address and port to connect to the load balancer. Client requests are directed to the proxy-only subnet in the same region as the client. The load balancer terminates client requests and then opens new connections from the proxy-only subnet to your backends. Therefore, packets sent from the load balancer have source IP addresses from the proxy-only subnet.

Depending on the backend service configuration, the protocol used by Envoy proxies to connect to your backends can be HTTP, HTTPS, or HTTP/2. If HTTP or HTTPS, the HTTP version is HTTP 1.1. HTTP keepalive is enabled by default, as specified in the HTTP 1.1 specification. The Envoy proxy sets both the client HTTP keepalive timeout and the backend keepalive timeout to a default value of 600 seconds each. You can update the client HTTP keepalive timeout but the backend keepalive timeout value is fixed. You can configure the request/response timeout by setting the backend service timeout. For more information, see timeouts and retries.

Client communications with the load balancer

  • Clients can communicate with the load balancer by using the HTTP 1.1 or HTTP/2 protocol.
  • When HTTPS is used, modern clients default to HTTP/2. This is controlled on the client, not on the HTTPS load balancer.
  • You cannot disable HTTP/2 by making a configuration change on the load balancer. However, you can configure some clients to use HTTP 1.1 instead of HTTP/2. For example, with curl, use the --http1.1 parameter.
  • External Application Load Balancers support the HTTP/1.1 100 Continue response.

For the complete list of protocols supported by external Application Load Balancer forwarding rules in each mode, see Load balancer features.

Source IP addresses for client packets

The source IP address for packets, as seen by the backends, is not the Trusted Cloud external IP address of the load balancer. In other words, there are two TCP connections.

For the regional external Application Load Balancers:
  • Connection 1, from original client to the load balancer (proxy-only subnet):

    • Source IP address: the original client (or external IP address if the client is behind a NAT gateway or a forward proxy).
    • Destination IP address: your load balancer's IP address.
  • Connection 2, from the load balancer (proxy-only subnet) to the backend VM or endpoint:

    • Source IP address: an IP address in the proxy-only subnet that is shared among all the Envoy-based load balancers deployed in the same region and network as the load balancer.

    • Destination IP address: the internal IP address of the backend VM or container in the VPC network.
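
The two connections can be modeled with a short Python sketch that also checks whether a backend-facing source address falls inside the proxy-only subnet. The `Connection` type, the helper names, the subnet range `10.129.0.0/23`, and all addresses are hypothetical examples.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    source_ip: str
    destination_ip: str


def proxied_connections(client_ip, load_balancer_ip, proxy_ip, backend_ip):
    """Return (connection 1, connection 2) as described above."""
    return (
        Connection(client_ip, load_balancer_ip),  # client -> load balancer
        Connection(proxy_ip, backend_ip),         # proxy-only subnet -> backend
    )


def is_from_proxy_only_subnet(source_ip: str, proxy_only_subnet: str) -> bool:
    """Check whether a packet's source IP belongs to the proxy-only subnet range."""
    return ipaddress.ip_address(source_ip) in ipaddress.ip_network(proxy_only_subnet)


first, second = proxied_connections("198.51.100.7", "203.0.113.10", "10.129.0.5", "10.1.2.3")
print(is_from_proxy_only_subnet(second.source_ip, "10.129.0.0/23"))
```

A backend that needs the original client address has to read it from the X-Forwarded-For header, because the second connection's source is always a proxy address.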

Special routing paths

Trusted Cloud uses special routes not defined in your VPC network to route packets for the following types of traffic:

Trusted Cloud uses subnet routes for proxy-only subnets to route packets for the following types of traffic:

  • When using distributed Envoy health checks.

For regional external Application Load Balancers, Trusted Cloud uses open-source Envoy proxies to terminate client requests to the load balancer. The load balancer terminates the TCP session and opens a new TCP session from the region's proxy-only subnet to your backend. Routes defined within your VPC network facilitate communication from Envoy proxies to your backends and from your backends to the Envoy proxies.

TLS termination

The following table summarizes how TLS termination is handled by external Application Load Balancers.

Load balancer mode | TLS termination
Regional external Application Load Balancer | TLS is terminated on Envoy proxies located in a proxy-only subnet in a region chosen by the user. Use this load balancer mode if you need geographic control over the region where TLS is terminated.

Timeouts and retries

External Application Load Balancers support the following types of timeouts for HTTP or HTTPS traffic:

Timeout type and description Default values Supports custom timeout values
Backend service timeout¹

A request and response timeout. Represents the maximum amount of time allowed between the load balancer sending the first byte of a request to the backend and the backend returning the last byte of the HTTP response to the load balancer. If the backend hasn't returned the entire HTTP response to the load balancer within this time limit, the remaining response data is dropped.

  • For all other backend types on a backend service: 30 seconds
Client HTTP keepalive timeout

The maximum amount of time that the TCP connection between a client and the load balancer's proxy can be idle. (The same TCP connection might be used for multiple HTTP requests.)

  • For regional external Application Load Balancers, the load balancer's proxy is Envoy software.
610 seconds
Backend HTTP keepalive timeout

The maximum amount of time that the TCP connection between the load balancer's proxy and a backend can be idle. (The same TCP connection might be used for multiple HTTP requests.)

  • For regional external Application Load Balancers, the load balancer's proxy is Envoy software.
  • For backend services: 10 minutes (600 seconds)

¹ Not configurable for serverless NEG backends. Not configurable for backend buckets.

Backend service timeout

The configurable backend service timeout represents the maximum amount of time that the load balancer waits for your backend to process an HTTP request and return the corresponding HTTP response. Except for serverless NEGs, the default value for the backend service timeout is 30 seconds.

For example, if you want to download a 500-MB file, and the value of the backend service timeout is 90 seconds, the load balancer expects the backend to deliver the entire 500-MB file within 90 seconds. It is possible to configure the backend service timeout to be insufficient for the backend to send its complete HTTP response. In this situation, if the load balancer has at least received HTTP response headers from the backend, the load balancer returns the complete response headers and as much of the response body as it could obtain within the backend service timeout.
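The arithmetic behind the 500-MB example can be sketched as follows. The function names are hypothetical, and the model assumes a constant transfer rate, which real backends rarely achieve.

```python
def min_throughput_mb_per_s(size_mb: float, timeout_s: float) -> float:
    """Smallest sustained rate that delivers size_mb within timeout_s."""
    return size_mb / timeout_s


def fits_in_timeout(size_mb: float, throughput_mb_per_s: float, timeout_s: float) -> bool:
    """True if the whole response can be sent before the timeout expires."""
    return size_mb / throughput_mb_per_s <= timeout_s
```

For a 500-MB response and a 90-second timeout, the backend has to sustain roughly 5.6 MB per second. At 5 MB per second the transfer would need 100 seconds, so the response would be truncated.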

We recommend that you set the backend service timeout to the longest amount of time that you expect your backend to need in order to process an HTTP request and return its entire response. If the software running on your backend needs more time than the current setting allows, we recommend that you increase the backend service timeout. For example, we recommend that you increase the timeout if you see HTTP 408 status code responses with jsonPayload.statusDetail client_timed_out errors.

The backend service timeout accepts values between 1 and 2,147,483,647 seconds; however, larger values aren't practical configuration options. Trusted Cloud also doesn't guarantee that an underlying TCP connection can remain open for the entirety of the value of the backend service timeout. Client systems must implement retry logic instead of relying on a TCP connection to be open for long periods of time.
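One possible shape for client-side retry logic is shown below. It is a minimal sketch, not a recommended implementation: the helper name, the attempt count, and the exponential backoff delays are all assumptions, and it should only wrap operations that are safe to repeat.

```python
import time


def call_with_retries(operation, max_attempts=3, base_delay_s=0.5, sleep=time.sleep):
    """Call operation(), retrying on connection errors with exponential backoff."""
    last_error = None
    for attempt in range(max_attempts):
        try:
            return operation()
        except ConnectionError as exc:  # includes ConnectionResetError
            last_error = exc
            if attempt < max_attempts - 1:
                # Wait 0.5 s, 1 s, 2 s, ... between attempts.
                sleep(base_delay_s * (2 ** attempt))
    raise last_error
```

The sleep parameter exists only so that the delay can be replaced in tests.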

To configure the backend service timeout, use one of the following methods:

Console

Modify the Timeout field of the load balancer's backend service.

gcloud

Use the gcloud compute backend-services update command to modify the --timeout parameter of the backend service resource.

API

For a regional external Application Load Balancer, modify the timeoutSec parameter for the regionBackendServices resource.

Websocket connection timeouts aren't always the same as backend service timeouts. Websocket connection timeouts depend on the type of load balancer:

Load balancer mode Default values Timeout description for websockets
Regional external Application Load Balancer backend service timeout: 30 seconds

Active websocket connections don't use the backend service timeout of the load balancer.

Idle websocket connections are closed after the backend service times out.

Trusted Cloud periodically restarts or changes the number of serving Envoy software tasks. The longer the backend service timeout value is, the more likely it is that Envoy tasks restart or terminate TCP connections.

Regional external Application Load Balancers use the routeActions.timeout parameter of the URL map as the request and response timeout when it is configured; in that case, the backend service timeout is ignored. When routeActions.timeout isn't configured, the value of the backend service timeout is used.
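The precedence between the two settings can be expressed as a small function. This is a sketch of the rule as described here, with hypothetical parameter names; it is not code from the load balancer.

```python
from typing import Optional


def effective_timeout_sec(
    backend_service_timeout_sec: int = 30,
    route_action_timeout_sec: Optional[int] = None,
) -> int:
    """Return the request and response timeout that applies to a route."""
    if route_action_timeout_sec is not None:
        # routeActions.timeout is supplied: the backend service timeout is ignored.
        return route_action_timeout_sec
    return backend_service_timeout_sec
```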

Client HTTP keepalive timeout

The client HTTP keepalive timeout represents the maximum amount of time that a TCP connection can be idle between the (downstream) client and one of the following types of proxies:

  • For a regional external Application Load Balancer: an Envoy proxy

The client HTTP keepalive timeout represents the TCP idle timeout for the underlying TCP connections. The client HTTP keepalive timeout doesn't apply to websockets.

The default value for the client HTTP keepalive timeout is 610 seconds. For global and regional external Application Load Balancers, you can configure the client HTTP keepalive timeout with a value between 5 and 1200 seconds.

To configure the client HTTP keepalive timeout, use one of the following methods:

Console

Modify the HTTP keepalive timeout field of the load balancer's frontend configuration.

gcloud

For global external Application Load Balancers, use the gcloud compute target-http-proxies update command or the gcloud compute target-https-proxies update command to modify the --http-keep-alive-timeout-sec parameter of the target HTTP proxy or the target HTTPS proxy resource.

For a regional external Application Load Balancer, you cannot update the keepalive timeout parameter of a regional target HTTP(S) proxy directly. To update the keepalive timeout parameter of a regional target proxy, you need to do the following:

  1. Create a new target proxy with the intended timeout settings.
  2. Mirror all other settings from the current target proxy on the new one. For target HTTPS proxies, this includes linking any SSL certificates or certificate maps to the new target proxy.
  3. Update the forwarding rules to point to the new target proxy.
  4. Delete the previous target proxy.

API

For global external Application Load Balancers, modify the httpKeepAliveTimeoutSec parameter for the targetHttpProxies resource or the targetHttpsProxies resource.

For a regional external Application Load Balancer, you cannot update the keepalive timeout parameter of a regional target HTTP(S) proxy directly. To update the keepalive timeout parameter of a regional target proxy, you need to do the following:

  1. Create a new target proxy with the intended timeout settings.
  2. Mirror all other settings from the current target proxy on the new one. For target HTTPS proxies, this includes linking any SSL certificates or certificate maps to the new target proxy.
  3. Update the forwarding rules to point to the new target proxy.
  4. Delete the previous target proxy.

The load balancer's client HTTP keepalive timeout must be greater than the HTTP keepalive (TCP idle) timeout used by downstream clients or proxies. If a downstream client has a greater HTTP keepalive (TCP idle) timeout than the load balancer's client HTTP keepalive timeout, it's possible for a race condition to occur. From the perspective of a downstream client, an established TCP connection is permitted to be idle for longer than permitted by the load balancer. This means that the downstream client can send packets after the load balancer considers the TCP connection to be closed. When that happens, the load balancer responds with a TCP reset (RST) packet.
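A downstream client or connection pool can avoid this race by retiring idle connections before the load balancer does. The following sketch assumes the default 610-second timeout and a 10-second safety margin; the margin and the function names are illustrative assumptions.

```python
LB_CLIENT_KEEPALIVE_S = 610  # default client HTTP keepalive timeout


def max_safe_client_idle_s(lb_keepalive_s: int = LB_CLIENT_KEEPALIVE_S,
                           margin_s: int = 10) -> int:
    """Longest idle period after which a client may still reuse a connection."""
    return lb_keepalive_s - margin_s


def should_reuse_connection(idle_s: float,
                            lb_keepalive_s: int = LB_CLIENT_KEEPALIVE_S,
                            margin_s: int = 10) -> bool:
    """Reuse a pooled connection only if it has been idle less than the safe limit."""
    return idle_s < max_safe_client_idle_s(lb_keepalive_s, margin_s)
```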

When the client HTTP keepalive timeout expires, either the GFE or the Envoy proxy sends a TCP FIN to the client to gracefully close the connection.

Backend HTTP keepalive timeout

External Application Load Balancers are proxies that use at least two TCP connections:

  • For a regional external Application Load Balancer, a first TCP connection exists between the (downstream) client and an Envoy proxy. The Envoy proxy then opens a second TCP connection to your backends.

The load balancer's secondary TCP connections might not get closed after each request; they can stay open to handle multiple HTTP requests and responses. The backend HTTP keepalive timeout defines the TCP idle timeout between the load balancer and your backends. The backend HTTP keepalive timeout doesn't apply to websockets.

The backend keepalive timeout is fixed at 10 minutes (600 seconds) and cannot be changed. This helps ensure that the load balancer maintains idle connections for at least 10 minutes. After this period, the load balancer can send termination packets to the backend at any time.

The load balancer's backend keepalive timeout must be less than the keepalive timeout used by software running on your backends. This avoids a race condition where the operating system of your backends might close TCP connections with a TCP reset (RST). Because the backend keepalive timeout for the load balancer isn't configurable, you must configure your backend software so that its HTTP keepalive (TCP idle) timeout value is greater than 600 seconds.

When the backend HTTP keepalive timeout expires, either the GFE or the Envoy proxy sends a TCP FIN to the backend VM to gracefully close the connection.

The following table lists the changes necessary to modify keepalive timeout values for common web server software.

Web server software | Parameter | Default setting | Recommended setting
Apache | KeepAliveTimeout | KeepAliveTimeout 5 | KeepAliveTimeout 620
nginx | keepalive_timeout | keepalive_timeout 75s; | keepalive_timeout 620s;
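To check an existing configuration against the 600-second requirement, a small script can read the directive and compare the value. The sketch below is deliberately simplified: it only understands a single value in seconds, as in the table above, and the function names are hypothetical.

```python
import re

LB_BACKEND_KEEPALIVE_S = 600  # fixed backend keepalive timeout of the load balancer

# Matches "KeepAliveTimeout 620" (Apache) or "keepalive_timeout 620s;" (nginx).
_DIRECTIVE = re.compile(r"^\s*(KeepAliveTimeout|keepalive_timeout)\s+(\d+)s?\s*;?\s*$")


def parse_keepalive_seconds(line: str):
    """Return the timeout in seconds, or None if the line isn't recognized."""
    match = _DIRECTIVE.match(line)
    return int(match.group(2)) if match else None


def backend_keepalive_is_safe(line: str) -> bool:
    """True if the backend keeps idle connections longer than the load balancer."""
    seconds = parse_keepalive_seconds(line)
    return seconds is not None and seconds > LB_BACKEND_KEEPALIVE_S
```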

Retries

Support for retry logic depends on the mode of the external Application Load Balancer.

Load balancer mode Retry logic
Regional external Application Load Balancer

Configurable by using a retry policy in the URL map. The default number of retries (numRetries) is 1. The maximum number of retries that can be configured by using the retry policy is 25. The maximum configurable perTryTimeout is 24 hours.

Without a retry policy, unsuccessful requests that have no HTTP body (for example, GET requests) that result in HTTP 502, 503, or 504 responses are retried once.

HTTP POST requests aren't retried.

Retried requests only generate one log entry for the final response.
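The default behavior without a retry policy can be modeled roughly as follows. This is an interpretation of the rules stated above, with hypothetical names, and it doesn't cover configured retry policies.

```python
RETRYABLE_STATUS = {502, 503, 504}


def retried_by_default(method: str, has_body: bool, status: int,
                       retries_so_far: int = 0) -> bool:
    """Model of the default retry rule when no retry policy is configured."""
    if method.upper() == "POST" or has_body:
        return False  # POST requests and requests with a body aren't retried
    # Bodyless requests that fail with 502, 503, or 504 are retried once.
    return status in RETRYABLE_STATUS and retries_so_far < 1
```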

The WebSocket protocol is supported with GKE Ingress.

Illegal request and response handling

The load balancer blocks both client requests and backend responses from reaching the backend or the client, respectively, for a number of reasons. Some reasons are strictly for HTTP/1.1 compliance and others are to avoid unexpected data being passed to or from the backends. None of the checks can be disabled.

For HTTP/1.1 compliance, the load balancer blocks a request in any of the following cases:

  • It cannot parse the first line of the request.
  • A header is missing the colon (:) delimiter.
  • Headers or the first line contain invalid characters.
  • The content length is not a valid number, or there are multiple content length headers.
  • There are multiple transfer encoding keys, or there are unrecognized transfer encoding values.
  • There's a non-chunked body and no content length specified.
  • Body chunks are unparseable. This is the only case where some data reaches the backend. The load balancer closes the connections to the client and backend when it receives an unparseable chunk.
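A few of these checks can be approximated in a client-side linter that inspects raw header lines before a request is sent. The sketch below covers only the colon, content length, and transfer encoding rules; it is an illustrative approximation with hypothetical names, not the load balancer's actual parser.

```python
def http1_header_violations(header_lines):
    """Return a list of compliance problems found in raw header lines."""
    violations = []
    content_lengths = []
    transfer_encodings = []
    for line in header_lines:
        name, sep, value = line.partition(":")
        if not sep:
            violations.append(f"missing colon: {line!r}")
            continue
        name = name.strip().lower()
        value = value.strip()
        if name == "content-length":
            content_lengths.append(value)
        elif name == "transfer-encoding":
            transfer_encodings.append(value)
    if len(content_lengths) > 1:
        violations.append("multiple content-length headers")
    for value in content_lengths:
        if not (value.isascii() and value.isdigit()):
            violations.append(f"invalid content-length: {value!r}")
    if len(transfer_encodings) > 1:
        violations.append("multiple transfer-encoding keys")
    return violations
```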

Request handling

The load balancer blocks the request if any of the following are true:

  • The total size of request headers and the request URL exceeds the limit for the maximum request header size for external Application Load Balancers.
  • The request method doesn't allow a body, but the request has one.
  • The request contains an Upgrade header, and the Upgrade header isn't used to enable WebSocket connections.
  • The HTTP version is unknown.

Response handling

The load balancer blocks the backend's response if any of the following are true:

  • The total size of response headers exceeds the limit for maximum response header size for external Application Load Balancers.
  • The HTTP version is unknown.

When handling both the request and response, the load balancer might remove or overwrite hop-by-hop headers in HTTP/1.1 before forwarding them to the intended destination.

Traffic distribution

When you add a backend instance group or NEG to a backend service, you specify a balancing mode, which defines a method for measuring backend load and a target capacity. External Application Load Balancers support two balancing modes:

  • RATE, for instance groups or NEGs, is the target maximum number of requests (queries) per second (RPS, QPS). The target maximum RPS/QPS can be exceeded if all backends are at or above capacity.

  • UTILIZATION is the backend utilization of VMs in an instance group.

How traffic is distributed among backends depends on the mode of the load balancer.

Regional external Application Load Balancer

For regional external Application Load Balancers, traffic distribution is based on the load balancing mode and the load balancing locality policy.

The balancing mode determines the weight and fraction of traffic to send to each group (instance group or NEG). The load balancing locality policy (LocalityLbPolicy) determines how backends within the group are load balanced.

When a backend service receives traffic, it first directs traffic to a backend (instance group or NEG) according to the backend's balancing mode. After a backend is selected, traffic is then distributed among instances or endpoints in that backend group according to the load balancing locality policy.
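The two-stage selection can be pictured with a toy model. The sketch below assumes the RATE balancing mode with weights proportional to each group's target maximum RPS, and a round-robin locality policy within the group. Both choices, and all class and function names, are simplifying assumptions; the actual algorithm is more involved.

```python
import itertools
import random


class BackendGroup:
    """Toy stand-in for an instance group or NEG."""

    def __init__(self, name, max_rate, endpoints):
        self.name = name
        self.max_rate = max_rate  # target maximum RPS, used here as the weight
        self._cycle = itertools.cycle(endpoints)

    def next_endpoint(self):
        # Stage 2: round robin among the endpoints in the selected group.
        return next(self._cycle)


def pick_endpoint(groups, rng=random):
    """Stage 1: pick a group by weight; stage 2: pick an endpoint within it."""
    weights = [group.max_rate for group in groups]
    group = rng.choices(groups, weights=weights, k=1)[0]
    return group.name, group.next_endpoint()
```

In this model, a group with twice the target RPS of another receives about twice as many requests on average.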

Session affinity

Session affinity provides a best-effort attempt to send requests from a particular client to the same backend for as long as the backend is healthy and has the capacity, according to the configured balancing mode.

When you use session affinity, we recommend the RATE balancing mode rather than UTILIZATION. Session affinity works best if you set the balancing mode to requests per second (RPS).

The following table summarizes the session affinity options supported by external Application Load Balancers:

Load balancer mode Session affinity options
  None Client IP Generated cookie Header field HTTP cookie Stateful cookie
Regional external Application Load Balancer

High availability and failover

High availability and failover in external Application Load Balancers can be configured at the load balancer level. This is handled by creating backup regional external Application Load Balancers in any region where you require backup.

The following table describes the failover behavior.

Load balancer mode Failover methods
Regional external Application Load Balancer

Use one of the following methods to ensure a highly available deployment:

HTTP/2 support

HTTP/2 is a major revision of the HTTP/1 protocol. There are two modes of HTTP/2 support:

  • HTTP/2 over TLS
  • Cleartext HTTP/2 over TCP

HTTP/2 over TLS

HTTP/2 over TLS is supported for connections between clients and the external Application Load Balancer, and for connections between the load balancer and its backends.

The load balancer automatically negotiates HTTP/2 with clients as part of the TLS handshake by using the ALPN TLS extension. Even if a load balancer is configured to use HTTPS, modern clients default to HTTP/2. This is controlled on the client, not on the load balancer.

If a client doesn't support HTTP/2 and the load balancer is configured to use HTTP/2 between the load balancer and the backend instances, the load balancer might still negotiate an HTTPS connection or accept unsecured HTTP requests. Those HTTPS or HTTP requests are then transformed by the load balancer to proxy the requests over HTTP/2 to the backend instances.

To use HTTP/2 over TLS, you must enable TLS on your backends and set the backend service protocol to HTTP2. For more information, see Encryption from the load balancer to the backends.

HTTP/2 max concurrent streams

The HTTP/2 SETTINGS_MAX_CONCURRENT_STREAMS setting describes the maximum number of streams that an endpoint accepts, initiated by the peer. The value advertised by an HTTP/2 client to a Trusted Cloud load balancer is effectively meaningless because the load balancer doesn't initiate streams to the client.

In cases where the load balancer uses HTTP/2 to communicate with a server that is running on a VM, the load balancer respects the SETTINGS_MAX_CONCURRENT_STREAMS value advertised by the server. If a value of zero is advertised, the load balancer can't forward requests to the server, and this might result in errors.

HTTP/2 limitations

  • HTTP/2 between the load balancer and the instance can require significantly more TCP connections to the instance than HTTP or HTTPS. Connection pooling, an optimization that reduces the number of these connections with HTTP or HTTPS, isn't available with HTTP/2. As a result, you might see high backend latencies because backend connections are made more frequently.
  • HTTP/2 between the load balancer and the backend doesn't support running the WebSocket Protocol over a single stream of an HTTP/2 connection (RFC 8441).
  • HTTP/2 between the load balancer and the backend doesn't support server push.
  • The gRPC error rate and request volume aren't visible in the Trusted Cloud API or the Trusted Cloud console. If the gRPC endpoint returns an error, the load balancer logs and the monitoring data report the 200 OK HTTP status code.

Cleartext HTTP/2 over TCP (H2C)

Cleartext HTTP/2 over TCP, also known as H2C, lets you use HTTP/2 without TLS. H2C is supported for both of the following connections:

  • Connections between clients and the load balancer. No special configuration is required.
  • Connections between the load balancer and its backends.

    To configure H2C for connections between the load balancer and its backends, you set the backend service protocol to H2C.

H2C support is also available for load balancers created using the GKE Gateway controller and Cloud Service Mesh.

H2C isn't supported for classic Application Load Balancers.

WebSocket support

Trusted Cloud HTTP(S)-based load balancers support the websocket protocol when you use HTTP or HTTPS as the protocol to the backend. The load balancer doesn't require any configuration to proxy websocket connections.

The websocket protocol provides a full-duplex communication channel between clients and the load balancer. For more information, see RFC 6455.

The websocket protocol works as follows:

  1. The load balancer recognizes a websocket Upgrade request from an HTTP or HTTPS client. The request contains the Connection: Upgrade and Upgrade: websocket headers, followed by other relevant websocket-related request headers.
  2. The backend sends a websocket Upgrade response. The backend instance sends a 101 Switching Protocols response with Connection: Upgrade and Upgrade: websocket headers and other websocket-related response headers.
  3. The load balancer proxies bidirectional traffic for the duration of the current connection.
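The handshake headers in steps 1 and 2 can be sketched from the backend's point of view. The code below detects an Upgrade request and derives the Sec-WebSocket-Accept value as defined in RFC 6455. The Sec-WebSocket-Key header is one of the "other" headers that the steps don't name, and the function names are hypothetical.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for computing Sec-WebSocket-Accept.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def is_websocket_upgrade(headers: dict) -> bool:
    """True if the request carries Connection: Upgrade and Upgrade: websocket."""
    normalized = {key.lower(): value for key, value in headers.items()}
    tokens = [t.strip().lower() for t in normalized.get("connection", "").split(",")]
    return "upgrade" in tokens and normalized.get("upgrade", "").lower() == "websocket"


def accept_key(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value for the 101 response."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")
```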

If the backend instance returns a status code 426 or 502, the load balancer closes the connection.

Websocket connection timeouts depend on the type of load balancer (global, regional, or classic). For details, see Backend service timeout.

Session affinity for websockets works the same as for any other request. For more information, see Session affinity.

gRPC support

gRPC is an open-source framework for remote procedure calls. It is based on the HTTP/2 standard. Use cases for gRPC include the following:

  • Low-latency, highly scalable, distributed systems
  • Developing mobile clients that communicate with a cloud server
  • Designing new protocols that must be accurate, efficient, and language-independent
  • Layered design to enable extension, authentication, and logging

To use gRPC with your Trusted Cloud applications, you must proxy requests end-to-end over HTTP/2. To do this, you create an Application Load Balancer with one of the following configurations:

  • For end-to-end unencrypted traffic (without TLS): you create an HTTP load balancer (configured with a target HTTP proxy). Additionally, you configure the load balancer to use HTTP/2 for unencrypted connections between the load balancer and its backends by setting the backend service protocol to H2C.

  • For end-to-end encrypted traffic (with TLS): you create an HTTPS load balancer (configured with a target HTTPS proxy and SSL certificate). The load balancer negotiates HTTP/2 with clients as part of the SSL handshake by using the ALPN TLS extension.

    Additionally, you must make sure that the backends can handle TLS traffic and configure the load balancer to use HTTP/2 for encrypted connections between the load balancer and its backends by setting the backend service protocol to HTTP2.

    The load balancer might still negotiate HTTPS with some clients or accept unsecured HTTP requests on a load balancer that is configured to use HTTP/2 between the load balancer and the backend instances. Those HTTP or HTTPS requests are transformed by the load balancer to proxy the requests over HTTP/2 to the backend instances.
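The two configurations can be summarized as a lookup. The dictionary keys below are hypothetical labels chosen for this sketch and aren't API field names; only the values HTTP, HTTPS, H2C, and HTTP2 come from the text above.

```python
def grpc_load_balancer_config(encrypted: bool) -> dict:
    """Summarize the load balancer settings needed for end-to-end gRPC."""
    if encrypted:
        return {
            "target_proxy": "HTTPS",
            "backend_service_protocol": "HTTP2",
            "requires_ssl_certificate": True,
        }
    return {
        "target_proxy": "HTTP",
        "backend_service_protocol": "H2C",
        "requires_ssl_certificate": False,
    }
```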

If you want to configure an Application Load Balancer by using HTTP/2 with Google Kubernetes Engine Ingress or by using gRPC and HTTP/2 with Ingress, see HTTP/2 for load balancing with Ingress.

If you want to configure an external Application Load Balancer by using HTTP/2 with Cloud Run, see Use HTTP/2 behind a load balancer.

For information about troubleshooting problems with HTTP/2, see Troubleshooting issues with HTTP/2 to the backends.

For information about HTTP/2 limitations, see HTTP/2 limitations.

TLS support

By default, an HTTPS target proxy accepts only TLS 1.0, 1.1, 1.2, and 1.3 when terminating client SSL requests.

When the global external Application Load Balancer or the regional external Application Load Balancer uses HTTPS as the backend service protocol, it can negotiate TLS 1.2 or 1.3 to the backend.

When the classic Application Load Balancer uses HTTPS as the backend service protocol, it can negotiate TLS 1.0, 1.1, 1.2, or 1.3 to the backend.

Mutual TLS support

Mutual TLS, or mTLS, is an industry standard protocol for mutual authentication between a client and a server. mTLS helps ensure that both the client and server authenticate each other by verifying that each holds a valid certificate issued by a trusted certificate authority (CA). Unlike standard TLS, where only the server is authenticated, mTLS requires both the client and server to present certificates, confirming the identities of both parties before communication is established.

All of the Application Load Balancers support mTLS. With mTLS, the load balancer requests that the client send a certificate to authenticate itself during the TLS handshake with the load balancer. You can configure a Certificate Manager trust store that the load balancer then uses to validate the client certificate's chain of trust.

For more information about mTLS, see Mutual TLS authentication.

TLS 1.3 early data support

TLS 1.3 early data is supported on the target HTTPS proxy of the following external Application Load Balancers for both HTTPS over TCP (HTTP/1.1, HTTP/2) and HTTP/3 over QUIC:

  • Global external Application Load Balancers
  • Classic Application Load Balancers

TLS 1.3 was defined in RFC 8446 and introduces the concept of early data, also known as zero-round-trip time (0-RTT) data, which can improve application performance for resumed connections by 30 to 50%.

With TLS 1.2, two round trips are required before data can be securely transmitted. TLS 1.3 reduces this to one round trip (1-RTT) for new connections, allowing clients to send application data immediately after the first server response. Additionally, TLS 1.3 introduces the concept of early data for resumed sessions, enabling clients to send application data with the initial ClientHello, thereby reducing the effective round-trip time to zero (0-RTT). TLS 1.3 early data allows the backend server to begin processing client data before the handshake process with the client is complete, thereby reducing latency; however, care must be taken to mitigate replay risks.
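The round-trip counts translate into waiting time as follows. This is a back-of-the-envelope sketch that ignores the TCP handshake and processing time, and the dictionary keys are hypothetical labels.

```python
# TLS handshake round trips before application data can be sent.
HANDSHAKE_ROUND_TRIPS = {
    "tls1.2": 2,
    "tls1.3": 1,
    "tls1.3-early-data": 0,
}


def tls_wait_before_first_data_ms(rtt_ms: float, handshake: str) -> float:
    """Time spent on TLS round trips before the first application byte is sent."""
    return HANDSHAKE_ROUND_TRIPS[handshake] * rtt_ms
```

On a path with a 50 ms round-trip time, that is 100 ms for TLS 1.2, 50 ms for a new TLS 1.3 connection, and 0 ms for a resumed TLS 1.3 connection that uses early data.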

Because early data is sent before the handshake is complete, an attacker can attempt to capture and replay requests. To mitigate this, the backend server must carefully control early data usage, limiting its use to idempotent requests. HTTP methods that are intended to be idempotent but which might trigger nonidempotent changes—such as a GET request modifying a database—must not accept early data. In such cases, the backend server must reject requests with the HTTP Early-Data: 1 header by returning an HTTP 425 Too Early status code.

Requests with early data have the HTTP Early-Data header set to a value of 1, which indicates to the backend server that the request has been conveyed in TLS early data. It also indicates that the client understands the HTTP 425 Too Early status code.
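On the backend, the decision to reject an early data request can be sketched as a small guard that runs before the handler. The function name and the handler_is_replay_safe flag are assumptions; each application has to decide for itself which handlers are safe to replay.

```python
def early_data_status(headers: dict, handler_is_replay_safe: bool):
    """Return 425 if the request must be rejected as too early, else None."""
    normalized = {key.lower(): value.strip() for key, value in headers.items()}
    arrived_as_early_data = normalized.get("early-data") == "1"
    if arrived_as_early_data and not handler_is_replay_safe:
        return 425  # Too Early: the client can retry after the handshake completes
    return None
```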

TLS early data (0-RTT) modes

You can configure TLS early data using one of the following modes on the target HTTPS proxy resource of the load balancer.

  • STRICT. This enables TLS 1.3 early data for requests with safe HTTP methods (GET, HEAD, OPTIONS, TRACE), and HTTP requests that don't have query parameters. Requests that send early data containing nonidempotent HTTP methods (such as POST or PUT) or with query parameters are rejected with an HTTP 425 status code.

  • PERMISSIVE. This enables TLS 1.3 early data for requests with safe HTTP methods (GET, HEAD, OPTIONS, TRACE). This mode doesn't deny requests that include query parameters. The application owner must ensure that early data is safe to use for each request path, particularly for endpoints where request replay might cause unintended side effects, such as logging or database updates triggered by GET requests.

  • DISABLED. TLS 1.3 early data isn't advertised, and any (invalid) attempts to send early data are rejected. If your applications aren't equipped to handle early data requests safely, you must disable early data. TLS 1.3 early data is disabled by default.

  • UNRESTRICTED (not recommended for most workloads). This enables TLS 1.3 early data for requests with any HTTP method including nonidempotent methods, such as POST. This mode doesn't enforce any other limitations. This mode can be valuable for gRPC use cases. However, we don't recommend this method unless you have evaluated your security posture and mitigated the risk of replay attacks using other mechanisms.
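The four modes can be compared side by side with a small model of the accept-or-reject decision. This is a reading of the descriptions above with hypothetical names, not the load balancer's implementation.

```python
SAFE_METHODS = {"GET", "HEAD", "OPTIONS", "TRACE"}


def early_data_allowed(mode: str, method: str, has_query_params: bool) -> bool:
    """Model whether a request sent as TLS early data is accepted in a given mode."""
    method = method.upper()
    if mode == "DISABLED":
        return False
    if mode == "UNRESTRICTED":
        return True
    if mode == "PERMISSIVE":
        return method in SAFE_METHODS
    if mode == "STRICT":
        return method in SAFE_METHODS and not has_query_params
    raise ValueError(f"unknown TLS early data mode: {mode!r}")
```

For example, a GET request with query parameters is accepted in PERMISSIVE mode but rejected in STRICT mode.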

Configure TLS early data

To explicitly enable or disable TLS early data, do the following:

Console

  1. In the Trusted Cloud console, go to the Load balancing page.

    Go to Load balancing

  2. Select the load balancer for which you need to enable early data.

  3. Click Edit.

  4. Click Frontend configuration.

  5. Select the frontend IP address and port that you want to edit. To enable TLS early data, the protocol must be HTTPS.

  6. In the Early data (0-RTT) list, select a TLS early data mode.

  7. Click Done.

  8. To update the load balancer, click Update.

gcloud

  1. Configure the TLS early data mode on the target HTTPS proxy of an Application Load Balancer.

    gcloud compute target-https-proxies update TARGET_HTTPS_PROXY \
      --tls-early-data=TLS_EARLY_DATA_MODE
    

    Replace the following:

    • TARGET_HTTPS_PROXY: the target HTTPS proxy of your load balancer
    • TLS_EARLY_DATA_MODE: STRICT, PERMISSIVE, DISABLED, or UNRESTRICTED

API

PATCH https://compute.googleapis.com/compute/v1/projects/{project}/global/targetHttpsProxies/TARGET_HTTPS_PROXY
{
    "tlsEarlyData":"TLS_EARLY_DATA_MODE",
    "fingerprint": "FINGERPRINT"
}

Replace the following:

  • TARGET_HTTPS_PROXY: the target HTTPS proxy of your load balancer
  • TLS_EARLY_DATA_MODE: STRICT, PERMISSIVE, DISABLED, or UNRESTRICTED
  • FINGERPRINT: a Base64 encoded string. An up-to-date fingerprint must be provided in order to patch the target HTTPS proxy; otherwise, the request fails with an HTTP 412 Precondition Failed status code.
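If you script this call, the URL and request body can be assembled as shown below. The sketch only builds the request and doesn't send it; the helper name is hypothetical, and authentication and retrieval of the current fingerprint are left out.

```python
import json

ALLOWED_MODES = {"STRICT", "PERMISSIVE", "DISABLED", "UNRESTRICTED"}


def build_tls_early_data_patch(project: str, proxy_name: str, mode: str, fingerprint: str):
    """Return the (url, body) pair for the PATCH request shown above."""
    if mode not in ALLOWED_MODES:
        raise ValueError(f"unsupported TLS early data mode: {mode!r}")
    url = (
        "https://compute.googleapis.com/compute/v1/projects/"
        f"{project}/global/targetHttpsProxies/{proxy_name}"
    )
    body = json.dumps({"tlsEarlyData": mode, "fingerprint": fingerprint})
    return url, body
```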

After you have configured TLS early data, you can issue requests from an HTTP client that supports TLS early data. You can observe lower latency for resumed requests.

If a non-RFC-compliant client sends a request with a nonidempotent method or with query parameters, the request is denied. You see an HTTP 425 Too Early status code in the load balancer logs and the following HTTP response:

  HTTP/1.1 425 Too Early
  Content-Type: text/html; charset=UTF-8
  Referrer-Policy: no-referrer
  Content-Length: 1558
  Date: Thu, 03 Aug 2024 02:45:14 GMT
  Connection: close

  <!DOCTYPE html>
  <html lang=en>
  <title>Error 425 (Too Early)</title>
  <p>The request was sent to the server too early, please retry. That's
  all we know.</p>
  </html>
  

Limitations

  • HTTPS load balancers don't send a close_notify closure alert when terminating SSL connections. That is, the load balancer closes the TCP connection instead of performing an SSL shutdown.
  • HTTPS load balancers support only lowercase characters in domains in a common name (CN) attribute or a subject alternative name (SAN) attribute of the certificate. Certificates with uppercase characters in domains are returned only when set as the primary certificate in the target proxy.
  • HTTPS load balancers don't use the Server Name Indication (SNI) extension when connecting to the backend, except for load balancers with Internet NEG backends. For more information, see Encryption from the load balancer to the backends.
  • Trusted Cloud doesn't guarantee that an underlying TCP connection can remain open for the entirety of the value of the backend service timeout. Client systems must implement retry logic instead of relying on a TCP connection to be open for long periods of time.

  • You can't create a regional external Application Load Balancer in Premium Tier using the Trusted Cloud console. Instead, use either the gcloud CLI or the API.

What's next