This document introduces the concepts that you need to understand how to configure an external Application Load Balancer.
An external Application Load Balancer is a proxy-based Layer 7 load balancer that enables you to run and scale your services behind a single external IP address. The external Application Load Balancer distributes HTTP and HTTPS traffic to backends hosted on a variety of Trusted Cloud platforms (such as Compute Engine, Google Kubernetes Engine (GKE), Cloud Storage, and so on), as well as external backends connected over the internet or via hybrid connectivity. For details, see Application Load Balancer overview: Use cases.
Modes of operation
This load balancer is available in regional mode and is referred to in this document as a regional external Application Load Balancer. The load balancer is implemented as a managed service based on the open-source Envoy proxy. It includes advanced traffic management capabilities such as traffic mirroring, weight-based traffic splitting, request- or response-based header transformations, and more. The regional mode ensures that all clients and backends are in a specified region. Use this load balancer if you want to serve content from only one geolocation (for example, to meet compliance regulations).
Architecture
The following resources are required for an external Application Load Balancer deployment:
- For regional external Application Load Balancers only, a proxy-only subnet is used to send connections from the load balancer to the backends.
- An external forwarding rule specifies an external IP address, port, and target HTTP(S) proxy. Clients use the IP address and port to connect to the load balancer.
- A target HTTP(S) proxy receives a request from the client. The HTTP(S) proxy evaluates the request by using the URL map to make traffic routing decisions. The proxy can also authenticate communications by using SSL certificates.
  - For HTTPS load balancing, the target HTTPS proxy uses SSL certificates to prove its identity to clients. A target HTTPS proxy supports up to the documented number of SSL certificates.
- The HTTP(S) proxy uses a URL map to make a routing determination based on HTTP attributes (such as the request path, cookies, or headers). Based on the routing decision, the proxy forwards client requests to specific backend services or backend buckets. The URL map can specify additional actions, such as sending redirects to clients.
- A backend service distributes requests to healthy backends.
- A health check periodically monitors the readiness of your backends. This reduces the risk that requests might be sent to backends that can't service the request.
- Firewall rules that allow your backends to accept health check probes. Regional external Application Load Balancers require an additional firewall rule to allow traffic from the proxy-only subnet to reach the backends.
Regional
This diagram shows the components of a regional external Application Load Balancer deployment.
Proxy-only subnet
The proxy-only subnet provides a set of IP addresses that Google uses to run
Envoy proxies on your behalf. You must create one proxy-only subnet in each
region of a VPC network where you use regional external Application Load Balancers.
The `--purpose` flag for this proxy-only subnet is set to `REGIONAL_MANAGED_PROXY`. All regional Envoy-based load balancers in the same region and VPC network share a pool of Envoy proxies from the same proxy-only subnet. Further:
- Proxy-only subnets are only used for Envoy proxies, not your backends.
- Backend VMs or endpoints of all regional external Application Load Balancers in a region and VPC network receive connections from the proxy-only subnet.
- The IP address of the regional external Application Load Balancer is not located in the proxy-only subnet. The load balancer's IP address is defined by its external managed forwarding rule, which is described below.
If you previously created a proxy-only subnet with `--purpose=INTERNAL_HTTPS_LOAD_BALANCER`, you need to migrate the subnet's purpose to `REGIONAL_MANAGED_PROXY` before you can create other Envoy-based load balancers in the same region of the VPC network.
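For reference, a proxy-only subnet can be created with `gcloud compute networks subnets create`. The following is a minimal sketch; the subnet name, network, region, and address range are placeholder values, not values prescribed by this document:

```
gcloud compute networks subnets create proxy-only-subnet \
    --purpose=REGIONAL_MANAGED_PROXY \
    --role=ACTIVE \
    --region=us-west1 \
    --network=lb-network \
    --range=10.129.0.0/23
```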
Forwarding rules and IP addresses
Forwarding rules route traffic by IP address, port, and protocol to a load balancing configuration consisting of a target proxy, URL map, and one or more backend services.
IP address specification. Each forwarding rule provides a single IP address that can be used in DNS records for your application. No DNS-based load balancing is required. You can either specify the IP address to be used or let Cloud Load Balancing assign one for you.
Port specification. Each forwarding rule for an Application Load Balancer can reference a single port from 1-65535. To support multiple ports, you must configure multiple forwarding rules. You can configure multiple forwarding rules to use the same external IP address (VIP) and to reference the same target HTTP(S) proxy as long as the overall combination of IP address, port, and protocol is unique for each forwarding rule. This way, you can use a single load balancer with a shared URL map as a proxy for multiple applications.
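As an illustration, the following sketch reserves one regional external IP address and creates two forwarding rules that share it, each referencing its own target proxy on a different port. All resource names and the region are placeholders, and the flags shown assume a regional external Application Load Balancer (load balancing scheme `EXTERNAL_MANAGED`):

```
# Reserve a regional external IP address to share between forwarding rules.
gcloud compute addresses create lb-vip \
    --region=us-west1

# Forwarding rule for HTTP traffic on port 80.
gcloud compute forwarding-rules create http-fr \
    --load-balancing-scheme=EXTERNAL_MANAGED \
    --region=us-west1 \
    --network=lb-network \
    --address=lb-vip \
    --ports=80 \
    --target-http-proxy=my-http-proxy \
    --target-http-proxy-region=us-west1

# Forwarding rule for HTTPS traffic on port 443, sharing the same IP address.
gcloud compute forwarding-rules create https-fr \
    --load-balancing-scheme=EXTERNAL_MANAGED \
    --region=us-west1 \
    --network=lb-network \
    --address=lb-vip \
    --ports=443 \
    --target-https-proxy=my-https-proxy \
    --target-https-proxy-region=us-west1
```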
The type of forwarding rule, IP address, and load balancing scheme used by external Application Load Balancers depends on the mode of the load balancer and which Network Service Tier the load balancer is in.
Load balancer mode | Network Service Tier | Forwarding rule, IP address, and load balancing scheme | Routing from the internet to the load balancer frontend |
---|---|---|---|
Regional external Application Load Balancer | Premium Tier | Regional external forwarding rule. Load balancing scheme: `EXTERNAL_MANAGED` | Requests reach Trusted Cloud at the PoP closest to the client. Requests are then routed over Trusted Cloud's premium backbone until they reach Envoy proxies in the same region as the load balancer. |
For the complete list of protocols supported by external Application Load Balancer forwarding rules in each mode, see Load balancer features.
Forwarding rules and VPC networks
This section describes how forwarding rules used by external Application Load Balancers are associated with VPC networks.
Load balancer mode | VPC network association |
---|---|
Regional external Application Load Balancer | The forwarding rule's VPC network is the network where the proxy-only subnet has been created. You specify the network when you create the forwarding rule. Depending on whether you use an IPv4 address or an IPv6 address range, there is always an explicit or implicit VPC network associated with the forwarding rule.
|
Target proxies
Target proxies terminate HTTP(S) connections from clients. One or more forwarding rules direct traffic to the target proxy, and the target proxy consults the URL map to determine how to route traffic to backends.
Do not rely on the proxy to preserve the case of request or response header names. For example, a `Server: Apache/1.0` response header might appear at the client as `server: Apache/1.0`.
The following table specifies the type of target proxy required by external Application Load Balancers.
Load balancer mode | Target proxy types | Proxy-added headers | Custom headers supported |
---|---|---|---|
Regional external Application Load Balancer | Regional HTTP, Regional HTTPS | | Configured in the URL map |
In addition to headers added by the target proxy, the load balancer adjusts other HTTP headers in the following ways:
- Some headers are coalesced. When there are multiple instances of the same header key (for example, `Via`), the load balancer combines their values into a single comma-separated list for a single header key. Only the headers whose values can be represented as a comma-separated list are coalesced. Other headers, such as `Set-Cookie`, are never coalesced.
Host header
When the load balancer makes the HTTP request, the load balancer preserves the Host header of the original request.
X-Forwarded-For header
The load balancer appends two IP addresses to the `X-Forwarded-For` header, separated by a single comma, in the following order:
- The IP address of the client that connects to the load balancer
- The IP address of the load balancer's forwarding rule
If the incoming request does not include an `X-Forwarded-For` header, the resulting header is as follows:
X-Forwarded-For: <client-ip>,<load-balancer-ip>
If the incoming request already includes an `X-Forwarded-For` header, the load balancer appends its values to the existing header:
X-Forwarded-For: <existing-value>,<client-ip>,<load-balancer-ip>
Remove existing header values using a custom request header
It is possible to remove existing header values by using custom request headers on the backend service. The following example uses the `--custom-request-header` flag to recreate the `X-Forwarded-For` header by using the variables `client_ip_address` and `server_ip_address`. This configuration replaces the incoming `X-Forwarded-For` header with only the client and load balancer IP addresses.
```
--custom-request-header=x-forwarded-for:{client_ip_address},{server_ip_address}
```
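For example, the flag can be applied to a regional backend service with a command along the following lines (the backend service name and region are placeholders):

```
gcloud compute backend-services update my-backend-service \
    --region=us-west1 \
    --custom-request-header='X-Forwarded-For:{client_ip_address},{server_ip_address}'
```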
How backend reverse proxy software might modify the `X-Forwarded-For` header
If your load balancer's backends run HTTP reverse proxy software, the software might append one or both of the following IP addresses to the end of the `X-Forwarded-For` header:

- The IP address of the GFE that connected to the backend. GFE IP addresses are in the `130.211.0.0/22` and `35.191.0.0/16` ranges.
- The IP address of the backend system itself.
As a result, an upstream system might see an `X-Forwarded-For` header structured as follows:
<existing-value>,<client-ip>,<load-balancer-ip>,<GFE-ip>,<backend-ip>
Cloud Trace support
Trace is not supported with Application Load Balancers. The global and classic Application Load Balancers add the `X-Cloud-Trace-Context` header if it is not present. The regional external Application Load Balancer does not add this header. If the `X-Cloud-Trace-Context` header is already present, it passes through the load balancers unmodified. However, no traces or spans are exported by the load balancer.
URL maps
URL maps define matching patterns for URL-based routing of requests to the appropriate backend services. The URL map allows you to divide your traffic by examining the URL components to send requests to different sets of backends. A default service is defined to handle any requests that do not match a specified host rule or path matching rule.
URL maps support several advanced traffic management features such as header-based traffic steering, weight-based traffic splitting, and request mirroring. For more information, see the following:
The following table specifies the type of URL map required by external Application Load Balancers in each mode.
Load balancer mode | URL map type |
---|---|
Regional external Application Load Balancer | Regional |
SSL certificates
External Application Load Balancers that use target HTTPS proxies require private keys and SSL certificates as part of the load balancer configuration.
Regional external Application Load Balancers support self-managed Compute Engine SSL certificates.
SSL policies
SSL policies specify the set of SSL features that Trusted Cloud load balancers use when negotiating SSL with clients.
By default, HTTPS Load Balancing uses a set of SSL features that provides good security and wide compatibility. Some applications require more control over which SSL versions and ciphers are used for their HTTPS or SSL connections. You can define an SSL policy to specify the set of SSL features that your load balancer uses when negotiating SSL with clients. In addition, you can apply that SSL policy to your target HTTPS proxy.
The following table specifies the SSL policy support for load balancers in each mode.
Load balancer mode | SSL policies supported |
---|---|
Regional external Application Load Balancer |
Backend services
A backend service provides configuration information to the load balancer so that it can direct requests to its backends—for example, Compute Engine instance groups or network endpoint groups (NEGs). For more information about backend services, see Backend services overview.
Backend service scope
The following table indicates which backend service resource and scope is used by external Application Load Balancers:
Load balancer mode | Backend service resource |
---|---|
Regional external Application Load Balancer | regionBackendServices (regional) |
Protocol to the backends
Backend services for Application Load Balancers must use one of the following protocols to send requests to backends:
- HTTP, which uses HTTP/1.1 and no TLS
- HTTPS, which uses HTTP/1.1 and TLS
- HTTP/2, which uses HTTP/2 and TLS (HTTP/2 without encryption isn't supported.)
- H2C, which uses HTTP/2 over TCP. TLS isn't required. H2C isn't supported for classic Application Load Balancers.
The load balancer only uses the backend service protocol that you specify to communicate with its backends. The load balancer doesn't fall back to a different protocol if it is unable to communicate with backends using the specified backend service protocol.
The backend service protocol doesn't need to match the protocol used by clients to communicate with the load balancer. For example, clients can send requests to the load balancer using HTTP/2, but the load balancer can communicate with backends using HTTP/1.1 (HTTP or HTTPS).
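As a sketch, the backend service protocol is chosen with the `--protocol` flag when you create or update the backend service. The backend service name and region below are placeholders:

```
# Use HTTP/2 with TLS to the backends.
gcloud compute backend-services update my-backend-service \
    --region=us-west1 \
    --protocol=HTTP2

# H2C (HTTP/2 without TLS) is configured the same way, with --protocol=H2C.
```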
Backends
A regional external Application Load Balancer supports the following types of backends:
- Instance groups
- Zonal NEGs
- Internet NEGs
Backends and VPC networks
For regional external Application Load Balancer backends, the following applies:
For instance groups, zonal NEGs, and hybrid connectivity NEGs, all backends must be located in the same project and region as the backend service. However, a load balancer can reference a backend that uses a different VPC network in the same project as the backend service. Connectivity between the load balancer's VPC network and the backend VPC network can be configured using either VPC Network Peering, Cloud VPN tunnels, or Cloud Interconnect VLAN attachments.
Backend network definition
- For zonal NEGs and hybrid NEGs, you explicitly specify the VPC network when you create the NEG.
- For managed instance groups, the VPC network is defined in the instance template.
- For unmanaged instance groups, the instance group's VPC network is set to match the VPC network of the `nic0` interface for the first VM added to the instance group.
Backend network requirements
Your backend's network must satisfy one of the following network requirements:
The backend's VPC network must exactly match the forwarding rule's VPC network.
The backend's VPC network must be connected to the forwarding rule's VPC network using VPC Network Peering. You must configure subnet route exchanges to allow communication between the proxy-only subnet in the forwarding rule's VPC network and the subnets used by the backend instances or endpoints.
For all other backend types, all backends must be located in the same VPC network and region.
Regional external Application Load Balancers also support Shared VPC environments where you can share VPC networks and their associated resources across projects. If you want the regional external Application Load Balancer's backend service and backends to be in a different project from the forwarding rule, you need to configure the load balancer in a Shared VPC environment with cross-project service referencing.
Backends and network interfaces
If you use instance group backends, packets are always delivered to `nic0`. If you want to send packets to non-`nic0` interfaces (either vNICs or Dynamic Network Interfaces), use NEG backends instead.
If you use zonal NEG backends, packets are sent to whatever network interface is represented by the endpoint in the NEG. The NEG endpoints must be in the same VPC network as the NEG's explicitly defined VPC network.
Health checks
Each backend service specifies a health check that periodically monitors the backends' readiness to receive a connection from the load balancer. This reduces the risk that requests might be sent to backends that can't service the request. Health checks do not check if the application itself is working.
For the health check probes, you must create an ingress allow firewall rule that allows health check probes to reach your backend instances. Typically, health check probes originate from Google's centralized health checking mechanism.
Regional external Application Load Balancers that use hybrid NEG backends are an exception to this rule because their health checks originate from the proxy-only subnet instead. For details, see the Hybrid NEGs overview.
Health check protocol
Although it is not required and not always possible, it is a best practice to use a health check whose protocol matches the protocol of the backend service. For example, an HTTP/2 health check most accurately tests HTTP/2 connectivity to backends. In contrast, regional external Application Load Balancers that use hybrid NEG backends do not support gRPC health checks. For the list of supported health check protocols, see Load balancing features.
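For example, a regional HTTP health check might be created as follows; the name, region, port, and request path are placeholders:

```
gcloud compute health-checks create http my-regional-health-check \
    --region=us-west1 \
    --port=80 \
    --request-path=/healthz
```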
The following table specifies the scope of health checks supported by external Application Load Balancers in each mode.
Load balancer mode | Health check type |
---|---|
Regional external Application Load Balancer | Regional |
For more information about health checks, see the following:
Firewall rules
The load balancer requires the following firewall rules:
- For the regional external Application Load Balancer, an ingress allow rule to permit traffic from the proxy-only subnet to reach your backends.
- An ingress allow rule to permit traffic from the health check probes ranges. For more information about health check probes and why it's necessary to allow traffic from them, see Probe IP ranges and firewall rules.
Firewall rules are implemented at the VM instance level, not on GFE proxies. You cannot use Trusted Cloud firewall rules to prevent traffic from reaching the load balancer.
The ports for these firewall rules must be configured as follows:
- Allow traffic to the destination port for each backend service's health check.
- For instance group backends: Determine the ports to be configured by the mapping between the backend service's named port and the port numbers associated with that named port on each instance group. The port numbers can vary among instance groups assigned to the same backend service.
- For `GCE_VM_IP_PORT` NEG backends: Allow traffic to the port numbers of the endpoints.
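A sketch of both ingress allow rules follows. It assumes a network named `lb-network`, backend VMs tagged `lb-backend`, a proxy-only subnet range of `10.129.0.0/23`, and the commonly documented Google probe ranges (`130.211.0.0/22` and `35.191.0.0/16`, the same GFE ranges mentioned earlier); confirm the probe ranges against Probe IP ranges and firewall rules:

```
# Allow connections from the proxy-only subnet to reach the backends.
gcloud compute firewall-rules create allow-proxy-only-subnet \
    --network=lb-network \
    --direction=INGRESS \
    --action=ALLOW \
    --rules=tcp:80,tcp:443 \
    --source-ranges=10.129.0.0/23 \
    --target-tags=lb-backend

# Allow health check probes to reach the backends.
gcloud compute firewall-rules create allow-health-checks \
    --network=lb-network \
    --direction=INGRESS \
    --action=ALLOW \
    --rules=tcp:80 \
    --source-ranges=130.211.0.0/22,35.191.0.0/16 \
    --target-tags=lb-backend
```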
GKE support
GKE uses external Application Load Balancers in the following ways:
- External Gateways created using the GKE Gateway controller can use any mode of an external Application Load Balancer. You control the load balancer's mode by choosing a GatewayClass. The GKE Gateway controller always uses `GCE_VM_IP_PORT` zonal NEG backends.
- You can use `GCE_VM_IP_PORT` zonal NEGs created and managed by GKE Services as backends for any Application Load Balancer or Proxy Network Load Balancer. For more information, see Container-native load balancing through standalone zonal NEGs.
Shared VPC architecture
External Application Load Balancers support networks that use Shared VPC. Shared VPC lets organizations connect resources from multiple projects to a common VPC network so that they can communicate with each other securely and efficiently by using internal IP addresses from that network. If you're not already familiar with Shared VPC, read the Shared VPC overview.
There are many ways to configure an external Application Load Balancer within a Shared VPC network. Regardless of the deployment type, all of the load balancer's components must be in the same organization.
Load balancer | Frontend components | Backend components |
---|---|---|
Regional external Application Load Balancer | Create the required network and proxy-only subnet in the Shared VPC host project. The regional external IP address, the forwarding rule, the target HTTP(S) proxy, and the associated URL map must be defined in the same project. This project can be the host project or a service project. | The backend services and backends can be created in the host project or in a service project. Each backend service must be defined in the same project as the backends it references. Health checks associated with backend services must be defined in the same project as the backend service as well. |
While you can create all the load balancing components and backends in the Shared VPC host project, this type of deployment does not separate network administration and service development responsibilities.
All load balancer components and backends in a service project
The following architecture diagram shows a standard Shared VPC deployment where all load balancer components and backends are in a service project. This deployment type is supported by all Application Load Balancers.
The load balancer components and backends must use the same VPC network.
Cross-project service referencing
Cross-project service referencing is a deployment model where the load balancer's frontend and URL map are in one project and the load balancer's backend service and backends are in a different project.
Cross-project service referencing lets organizations configure one central load balancer and route traffic to hundreds of services distributed across multiple different projects. You can centrally manage all traffic routing rules and policies in one URL map. You can also associate the load balancer with a single set of hostnames and SSL certificates. You can therefore optimize the number of load balancers needed to deploy your application, and lower manageability, operational costs, and quota requirements.
By having different projects for each of your functional teams, you can also achieve separation of roles within your organization. Service owners can focus on building services in service projects, while network teams can provision and maintain load balancers in another project, and both can be connected by using cross-project service referencing.
Service owners can maintain autonomy over the exposure of their services and control which users can access their services by using the load balancer. This is achieved by a special IAM role called the Compute Load Balancer Services User role (`roles/compute.loadBalancerServiceUser`).
Cross-project service referencing support differs based on the type of load balancer:
For global external Application Load Balancers: your load balancer's frontend and URL map can reference backend services or backend buckets from any project within the same organization. No VPC network restrictions apply. While you can use a Shared VPC environment to configure a cross-project deployment as shown in this example, this isn't a requirement.
For regional external Application Load Balancers: you must create the load balancer in a Shared VPC environment. The load balancer's frontend and URL map must be in a host or service project, and the load balancer's backend services and backends can be distributed across host or service projects in the same Shared VPC environment.
To learn how to configure Shared VPC for a regional external Application Load Balancer—with and without cross-project service referencing—see Set up a regional external Application Load Balancer with Shared VPC.
Usage notes for cross-project service referencing
- Trusted Cloud doesn't differentiate between resources (for example, backend services) using the same name across multiple projects. Therefore, when you are using cross-project service referencing, we recommend that you use unique backend service names across projects within your organization.
- If you see an error such as "Cross-project references for this resource are not allowed", make sure that you have the permission to use the resource. An administrator of the project that owns the resource must grant you the Compute Load Balancer Services User role (`roles/compute.loadBalancerServiceUser`). This role can be granted at the project level or at the resource level; a project-level example follows this list. For more details, see Grant permissions to the Compute Load Balancer Admin to use the backend service or backend bucket.
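A minimal sketch of the project-level grant, assuming a project ID of `service-project-b` and a user `lb-admin@example.com` (both placeholders):

```
gcloud projects add-iam-policy-binding service-project-b \
    --member=user:lb-admin@example.com \
    --role=roles/compute.loadBalancerServiceUser
```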
Example 1: Load balancer frontend and backend in different service projects
Here is an example of a Shared VPC deployment where the load balancer's frontend and URL map are created in service project A and the URL map references a backend service in service project B.
In this case, Network Admins or Load Balancer Admins in service project A require access to backend services in service project B. Service project B admins grant the Compute Load Balancer Services User role (`roles/compute.loadBalancerServiceUser`) to Load Balancer Admins in service project A who want to reference the backend service in service project B.
Example 2: Load balancer frontend in the host project and backends in service projects
Here is an example of a Shared VPC deployment where the load balancer's frontend and URL map are created in the host project and the backend services (and backends) are created in service projects.
In this case, Network Admins or Load Balancer Admins in the host project require access to backend services in the service project. Service project admins grant the Compute Load Balancer Services User role (`roles/compute.loadBalancerServiceUser`) to Load Balancer Admins in the host project who want to reference the backend service in the service project.
Example 3: Load balancer frontend and backends in different projects
Here is an example of a deployment where the global external Application Load Balancer's frontend and URL map are created in a different project from the load balancer's backend service and backends. This type of deployment doesn't use Shared VPC and is supported only for global external Application Load Balancers.
To learn more about this setup, see Set up cross-project service referencing with a backend service and a backend bucket.
How connections work
Regional external Application Load Balancer connections
The regional external Application Load Balancer is a managed service implemented on the Envoy proxy.
The regional external Application Load Balancer uses a shared subnet called a proxy-only subnet to provision a set of IP addresses that Google uses to run Envoy proxies on your behalf. The `--purpose` flag for this proxy-only subnet is set to `REGIONAL_MANAGED_PROXY`. All regional Envoy-based load balancers in a particular network and region share this subnet.
Clients use the load balancer's IP address and port to connect to the load balancer. Client requests are directed to the proxy-only subnet in the same region as the load balancer. The load balancer terminates client requests and then opens new connections from the proxy-only subnet to your backends. Therefore, packets sent from the load balancer have source IP addresses from the proxy-only subnet.
Depending on the backend service configuration, the protocol used by Envoy proxies to connect to your backends can be HTTP, HTTPS, or HTTP/2. If HTTP or HTTPS, the HTTP version is HTTP 1.1. HTTP keepalive is enabled by default, as specified in the HTTP 1.1 specification. The Envoy proxy sets both the client HTTP keepalive timeout and the backend keepalive timeout to a default value of 600 seconds each. You can update the client HTTP keepalive timeout but the backend keepalive timeout value is fixed. You can configure the request/response timeout by setting the backend service timeout. For more information, see timeouts and retries.
Client communications with the load balancer
- Clients can communicate with the load balancer by using the HTTP 1.1 or HTTP/2 protocol.
- When HTTPS is used, modern clients default to HTTP/2. This is controlled on the client, not on the HTTPS load balancer.
- You cannot disable HTTP/2 by making a configuration change on the load balancer. However, you can configure some clients to use HTTP 1.1 instead of HTTP/2. For example, with `curl`, use the `--http1.1` parameter.
- External Application Load Balancers support the `HTTP/1.1 100 Continue` response.
For the complete list of protocols supported by external Application Load Balancer forwarding rules in each mode, see Load balancer features.
Source IP addresses for client packets
The source IP address for packets, as seen by the backends, is not the Trusted Cloud external IP address of the load balancer. In other words, there are two TCP connections.
For regional external Application Load Balancers:

Connection 1, from the original client to the load balancer (proxy-only subnet):
- Source IP address: the original client (or external IP address if the client is behind a NAT gateway or a forward proxy).
- Destination IP address: your load balancer's IP address.
Connection 2, from the load balancer (proxy-only subnet) to the backend VM or endpoint:
- Source IP address: an IP address in the proxy-only subnet that is shared among all the Envoy-based load balancers deployed in the same region and network as the load balancer.
- Destination IP address: the internal IP address of the backend VM or container in the VPC network.
Special routing paths
Trusted Cloud uses special routes not defined in your VPC network to route packets for the following types of traffic:
- For health checks, except distributed Envoy health checks. For more information, see Paths for health checks.
Trusted Cloud uses subnet routes for proxy-only subnets to route packets for the following types of traffic:
- When using distributed Envoy health checks.
For regional external Application Load Balancers, Trusted Cloud uses open-source Envoy proxies to terminate client requests to the load balancer. The load balancer terminates the TCP session and opens a new TCP session from the region's proxy-only subnet to your backend. Routes defined within your VPC network facilitate communication from Envoy proxies to your backends and from your backends to the Envoy proxies.
TLS termination
The following table summarizes how TLS termination is handled by external Application Load Balancers.
Load balancer mode | TLS termination |
---|---|
Regional external Application Load Balancer | TLS is terminated on Envoy proxies located in a proxy-only subnet in a region chosen by the user. Use this load balancer mode if you need geographic control over the region where TLS is terminated. |
Timeouts and retries
External Application Load Balancers support the following types of timeouts for HTTP or HTTPS traffic:
Timeout type and description | Default values | Supports custom timeout values |
---|---|---|
Backend service timeout¹: A request and response timeout. Represents the maximum amount of time allowed between the load balancer sending the first byte of a request to the backend and the backend returning the last byte of the HTTP response to the load balancer. If the backend hasn't returned the entire HTTP response to the load balancer within this time limit, the remaining response data is dropped. | 30 seconds | Yes |
Client HTTP keepalive timeout: The maximum amount of time that the TCP connection between a client and the load balancer's proxy can be idle. (The same TCP connection might be used for multiple HTTP requests.) | 610 seconds | Yes |
Backend HTTP keepalive timeout: The maximum amount of time that the TCP connection between the load balancer's proxy and a backend can be idle. (The same TCP connection might be used for multiple HTTP requests.) | 10 minutes (600 seconds) | No |

¹ Not configurable for serverless NEG backends. Not configurable for backend buckets.
Backend service timeout
The configurable backend service timeout represents the maximum amount of time that the load balancer waits for your backend to process an HTTP request and return the corresponding HTTP response. Except for serverless NEGs, the default value for the backend service timeout is 30 seconds.
For example, if you want to download a 500-MB file, and the value of the backend service timeout is 90 seconds, the load balancer expects the backend to deliver the entire 500-MB file within 90 seconds. It is possible to configure the backend service timeout to be insufficient for the backend to send its complete HTTP response. In this situation, if the load balancer has at least received HTTP response headers from the backend, the load balancer returns the complete response headers and as much of the response body as it could obtain within the backend service timeout.
We recommend that you set the backend service timeout to the longest amount of
time that you expect your backend to need in order to process an HTTP response.
If the software running on your backend needs more time to process an HTTP
request and return its entire response, we recommend that you increase the
backend service timeout.
For example, we recommend that you increase the timeout if you see HTTP `408` status code responses with `jsonPayload.statusDetail client_timed_out` errors.
The backend service timeout accepts values between 1 and 2,147,483,647 seconds; however, larger values aren't practical configuration options.
Trusted Cloud also doesn't guarantee that an underlying TCP connection can
remain open for the entirety of the value of the backend service timeout.
Client systems must implement retry logic instead of relying on a TCP
connection to be open for long periods of time.
To configure the backend service timeout, use one of the following methods:
Console
Modify the Timeout field of the load balancer's backend service.
gcloud
Use the `gcloud compute backend-services update` command to modify the `--timeout` parameter of the backend service resource.
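A minimal sketch, assuming a regional backend service named `my-backend-service` in `us-west1` and a 90-second timeout:

```
gcloud compute backend-services update my-backend-service \
    --region=us-west1 \
    --timeout=90
```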
API
For a regional external Application Load Balancer, modify the `timeoutSec` parameter for the `regionBackendServices` resource.
Load balancer mode | Default values | Timeout description for websockets |
---|---|---|
Regional external Application Load Balancer | backend service timeout: 30 seconds | Active websocket connections don't use the backend service timeout of the load balancer. Idle websocket connections are closed after the backend service times out. Trusted Cloud periodically restarts or changes the number of serving Envoy software tasks. The longer the backend service timeout value is, the more likely it is that Envoy tasks restart or terminate TCP connections. |
Regional external Application Load Balancers use the configured `routeActions.timeout` parameter of the URL map and ignore the backend service timeout. When `routeActions.timeout` isn't configured, the value of the backend service timeout is used. When `routeActions.timeout` is supplied, the backend service timeout is ignored, and the value of `routeActions.timeout` is used as the request and response timeout instead.
Client HTTP keepalive timeout
The client HTTP keepalive timeout represents the maximum amount of time that a TCP connection can be idle between the (downstream) client and one of the following types of proxies:
- For a regional external Application Load Balancer: an Envoy proxy
The client HTTP keepalive timeout represents the TCP idle timeout for the underlying TCP connections. The client HTTP keepalive timeout doesn't apply to websockets.
The default value for the client HTTP keepalive timeout is 610 seconds. For global and regional external Application Load Balancers, you can configure the client HTTP keepalive timeout with a value between 5 and 1200 seconds.
To configure the client HTTP keepalive timeout, use one of the following methods:
Console
Modify the HTTP keepalive timeout field of the load balancer's frontend configuration.
gcloud
For global external Application Load Balancers, use the `gcloud compute target-http-proxies update` command or the `gcloud compute target-https-proxies update` command to modify the `--http-keep-alive-timeout-sec` parameter of the target HTTP proxy or the target HTTPS proxy resource.
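For example (global external Application Load Balancers only; the proxy name and timeout value are placeholders):

```
gcloud compute target-https-proxies update my-https-proxy \
    --http-keep-alive-timeout-sec=700
```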
For a regional external Application Load Balancer, you cannot update the keepalive timeout parameter of a regional target HTTP(S) proxy directly. To update the keepalive timeout parameter of a regional target proxy, you need to do the following:
- Create a new target proxy with the intended timeout settings.
- Mirror all other settings from the current target proxy on the new one. For target HTTPS proxies, this includes linking any SSL certificates or certificate maps to the new target proxy.
- Update the forwarding rules to point to the new target proxy.
- Delete the previous target proxy.
API
For global external Application Load Balancers, modify the `httpKeepAliveTimeoutSec` parameter for the `targetHttpProxies` resource or the `targetHttpsProxies` resource.
For a regional external Application Load Balancer, you cannot update the keepalive timeout parameter of a regional target HTTP(S) proxy directly. To update the keepalive timeout parameter of a regional target proxy, you need to do the following:
- Create a new target proxy with the intended timeout settings.
- Mirror all other settings from the current target proxy on the new one. For target HTTPS proxies, this includes linking any SSL certificates or certificate maps to the new target proxy.
- Update the forwarding rules to point to the new target proxy.
- Delete the previous target proxy.
The load balancer's client HTTP keepalive timeout must be greater than the HTTP keepalive (TCP idle) timeout used by downstream clients or proxies. If a downstream client has a greater HTTP keepalive (TCP idle) timeout than the load balancer's client HTTP keepalive timeout, it's possible for a race condition to occur. From the perspective of a downstream client, an established TCP connection is permitted to be idle for longer than permitted by the load balancer. This means that the downstream client can send packets after the load balancer considers the TCP connection to be closed. When that happens, the load balancer responds with a TCP reset (RST) packet.
When the client HTTP keepalive timeout expires, either the GFE or the Envoy proxy sends a TCP FIN to the client to gracefully close the connection.
Backend HTTP keepalive timeout
External Application Load Balancers are proxies that use at least two TCP connections:
- For a regional external Application Load Balancer, a first TCP connection exists between the (downstream) client and an Envoy proxy. The Envoy proxy then opens a second TCP connection to your backends.
The load balancer's secondary TCP connections might not get closed after each request; they can stay open to handle multiple HTTP requests and responses. The backend HTTP keepalive timeout defines the TCP idle timeout between the load balancer and your backends. The backend HTTP keepalive timeout doesn't apply to websockets.
The backend keepalive timeout is fixed at 10 minutes (600 seconds) and cannot be changed. This helps ensure that the load balancer maintains idle connections for at least 10 minutes. After this period, the load balancer can send termination packets to the backend at any time.
The load balancer's backend keepalive timeout must be less than the keepalive timeout used by software running on your backends. This avoids a race condition where the operating system of your backends might close TCP connections with a TCP reset (RST). Because the backend keepalive timeout for the load balancer isn't configurable, you must configure your backend software so that its HTTP keepalive (TCP idle) timeout value is greater than 600 seconds.
When the backend HTTP keepalive timeout expires, either the GFE or the Envoy proxy sends a TCP FIN to the backend VM to gracefully close the connection.
The following table lists the changes necessary to modify keepalive timeout values for common web server software.
Web server software | Parameter | Default setting | Recommended setting |
---|---|---|---|
Apache | KeepAliveTimeout | KeepAliveTimeout 5 |
KeepAliveTimeout 620 |
nginx | keepalive_timeout | keepalive_timeout 75s; |
keepalive_timeout 620s; |
Retries
Support for retry logic depends on the mode of the external Application Load Balancer.
Load balancer mode | Retry logic |
---|---|
Regional external Application Load Balancer | Configurable by using a retry policy in the URL map. The default number of retries (`numRetries`) is 1. Without a retry policy, unsuccessful requests that have no HTTP body (for example, HTTP `GET` requests) that result in HTTP `502`, `503`, or `504` responses are retried once. Retried requests only generate one log entry for the final response. |
The WebSocket protocol is supported with GKE Ingress.
Illegal request and response handling
The load balancer blocks both client requests and backend responses from reaching the backend or the client, respectively, for a number of reasons. Some reasons are strictly for HTTP/1.1 compliance and others are to avoid unexpected data being passed to or from the backends. None of the checks can be disabled.
The load balancer blocks the following requests for HTTP/1.1 compliance:
- It cannot parse the first line of the request.
- A header is missing the colon (`:`) delimiter.
) delimiter. - Headers or the first line contain invalid characters.
- The content length is not a valid number, or there are multiple content length headers.
- There are multiple transfer encoding keys, or there are unrecognized transfer encoding values.
- There's a non-chunked body and no content length specified.
- Body chunks are unparseable. This is the only case where some data reaches the backend. The load balancer closes the connections to the client and backend when it receives an unparseable chunk.
Request handling
The load balancer blocks the request if any of the following are true:
- The total size of request headers and the request URL exceeds the limit for the maximum request header size for external Application Load Balancers.
- The request method doesn't allow a body, but the request has one.
- The request contains an `Upgrade` header, and the `Upgrade` header isn't used to enable WebSocket connections.
header isn't used to enable WebSocket connections. - The HTTP version is unknown.
Response handling
The load balancer blocks the backend's response if any of the following are true:
- The total size of response headers exceeds the limit for maximum response header size for external Application Load Balancers.
- The HTTP version is unknown.
When handling both the request and response, the load balancer might remove or overwrite hop-by-hop headers in HTTP/1.1 before forwarding them to the intended destination.
Traffic distribution
When you add a backend instance group or NEG to a backend service, you specify a balancing mode, which defines a method measuring backend load and a target capacity. External Application Load Balancers support two balancing modes:
- `RATE`, for instance groups or NEGs, is the target maximum number of requests (queries) per second (RPS, QPS). The target maximum RPS/QPS can be exceeded if all backends are at or above capacity.
- `UTILIZATION` is the backend utilization of VMs in an instance group.
How traffic is distributed among backends depends on the mode of the load balancer.
Regional external Application Load Balancer
For regional external Application Load Balancers, traffic distribution is based on the load balancing mode and the load balancing locality policy.
The balancing mode determines the weight and fraction of traffic to send to each group (instance group or NEG). The load balancing locality policy (`LocalityLbPolicy`) determines how backends within the group are load balanced.
When a backend service receives traffic, it first directs traffic to a backend (instance group or NEG) according to the backend's balancing mode. After a backend is selected, traffic is then distributed among instances or endpoints in that backend group according to the load balancing locality policy.
For more information, see the following:
Session affinity
Session affinity provides a best-effort attempt to send requests from a particular client to the same backend for as long as the backend is healthy and has the capacity, according to the configured balancing mode.
When you use session affinity, we recommend the `RATE` balancing mode rather than `UTILIZATION`. Session affinity works best if you set the balancing mode to requests per second (RPS).
External Application Load Balancers offer the following types of session affinity:
- NONE. Session affinity isn't set for the load balancer.
- Client IP affinity
- Generated cookie affinity
- Header field affinity
- HTTP Cookie affinity
- Stateful cookie-based session affinity
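As a sketch, session affinity is set with the `--session-affinity` flag on the backend service. The following example assumes a regional backend service named `my-backend-service` and enables generated cookie affinity:

```
gcloud compute backend-services update my-backend-service \
    --region=us-west1 \
    --session-affinity=GENERATED_COOKIE
```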
The following table summarizes the session affinity options supported by external Application Load Balancers:
Load balancer mode | None | Client IP | Generated cookie | Header field | HTTP cookie | Stateful cookie |
---|---|---|---|---|---|---|
Regional external Application Load Balancer | | | | | | |
High availability and failover
High availability and failover in external Application Load Balancers can be configured at the load balancer level. This is handled by creating backup regional external Application Load Balancers in any region where you require backup.
The following table describes the failover behavior.
Load balancer mode | Failover methods |
---|---|
Regional external Application Load Balancer | Use one of the following methods to ensure a highly available deployment:
|
HTTP/2 support
HTTP/2 is a major revision of the HTTP/1 protocol. There are 2 modes of HTTP/2 support:
- HTTP/2 over TLS
- Cleartext HTTP/2 over TCP
HTTP/2 over TLS
HTTP/2 over TLS is supported for connections between clients and the external Application Load Balancer, and for connections between the load balancer and its backends.
The load balancer automatically negotiates HTTP/2 with clients as part of the TLS handshake by using the ALPN TLS extension. Even if a load balancer is configured to use HTTPS, modern clients default to HTTP/2. This is controlled on the client, not on the load balancer.
If a client doesn't support HTTP/2 and the load balancer is configured to use HTTP/2 between the load balancer and the backend instances, the load balancer might still negotiate an HTTPS connection or accept unsecured HTTP requests. Those HTTPS or HTTP requests are then transformed by the load balancer to proxy the requests over HTTP/2 to the backend instances.
To use HTTP/2 over TLS, you must enable TLS on your backends and set the backend service protocol to `HTTP2`. For more information, see Encryption from the load balancer to the backends.
HTTP/2 max concurrent streams
The HTTP/2 `SETTINGS_MAX_CONCURRENT_STREAMS` setting describes the maximum number of streams that an endpoint accepts, initiated by the peer. The value advertised by an HTTP/2 client to a Trusted Cloud load balancer is effectively meaningless because the load balancer doesn't initiate streams to the client.

In cases where the load balancer uses HTTP/2 to communicate with a server that is running on a VM, the load balancer respects the `SETTINGS_MAX_CONCURRENT_STREAMS` value advertised by the server. If a value of zero is advertised, the load balancer can't forward requests to the server, and this might result in errors.
HTTP/2 limitations
- HTTP/2 between the load balancer and the instance can require significantly more TCP connections to the instance than HTTP or HTTPS. Connection pooling, an optimization that reduces the number of these connections with HTTP or HTTPS, isn't available with HTTP/2. As a result, you might see high backend latencies because backend connections are made more frequently.
- HTTP/2 between the load balancer and the backend doesn't support running the WebSocket Protocol over a single stream of an HTTP/2 connection (RFC 8441).
- HTTP/2 between the load balancer and the backend doesn't support server push.
- The gRPC error rate and request volume aren't visible in the Trusted Cloud API or the Trusted Cloud console. If the gRPC endpoint returns an error, the load balancer logs and the monitoring data report the `200 OK` HTTP status code.
Cleartext HTTP/2 over TCP (H2C)
Cleartext HTTP/2 over TCP, also known as H2C, lets you use HTTP/2 without TLS. H2C is supported for both of the following connections:
- Connections between clients and the load balancer. No special configuration is required.
- Connections between the load balancer and its backends. To configure H2C for connections between the load balancer and its backends, set the backend service protocol to `H2C`.
H2C support is also available for load balancers created using the GKE Gateway controller and Cloud Service Mesh.
H2C isn't supported for classic Application Load Balancers.
WebSocket support
Trusted Cloud HTTP(S)-based load balancers support the websocket protocol when you use HTTP or HTTPS as the protocol to the backend. The load balancer doesn't require any configuration to proxy websocket connections.
The websocket protocol provides a full-duplex communication channel between clients and the load balancer. For more information, see RFC 6455.
The websocket protocol works as follows:
- The load balancer recognizes a websocket `Upgrade` request from an HTTP or HTTPS client. The request contains the `Connection: Upgrade` and `Upgrade: websocket` headers, followed by other relevant websocket-related request headers.
- The backend sends a websocket `Upgrade` response. The backend instance sends a `101 Switching Protocols` response with the `Connection: Upgrade` and `Upgrade: websocket` headers and other websocket-related response headers.
- The load balancer proxies bidirectional traffic for the duration of the current connection.
If the backend instance returns a status code `426` or `502`, the load balancer closes the connection.
Session affinity for websockets works the same as for any other request. For more information, see Session affinity.
gRPC support
gRPC is an open-source framework for remote procedure calls. It is based on the HTTP/2 standard. Use cases for gRPC include the following:
- Low-latency, highly scalable, distributed systems
- Developing mobile clients that communicate with a cloud server
- Designing new protocols that must be accurate, efficient, and language-independent
- Layered design to enable extension, authentication, and logging
To use gRPC with your Trusted Cloud applications, you must proxy requests end-to-end over HTTP/2. To do this, you create an Application Load Balancer with one of the following configurations:
- For end-to-end unencrypted traffic (without TLS): you create an HTTP load balancer (configured with a target HTTP proxy). Additionally, you configure the load balancer to use HTTP/2 for unencrypted connections between the load balancer and its backends by setting the backend service protocol to `H2C`.
- For end-to-end encrypted traffic (with TLS): you create an HTTPS load balancer (configured with a target HTTPS proxy and SSL certificate). The load balancer negotiates HTTP/2 with clients as part of the SSL handshake by using the ALPN TLS extension. Additionally, you must make sure that the backends can handle TLS traffic and configure the load balancer to use HTTP/2 for encrypted connections between the load balancer and its backends by setting the backend service protocol to `HTTP2`. The load balancer might still negotiate HTTPS with some clients or accept unsecured HTTP requests on a load balancer that is configured to use HTTP/2 between the load balancer and the backend instances. Those HTTP or HTTPS requests are transformed by the load balancer to proxy the requests over HTTP/2 to the backend instances.
If you want to configure an Application Load Balancer by using HTTP/2 with Google Kubernetes Engine Ingress or by using gRPC and HTTP/2 with Ingress, see HTTP/2 for load balancing with Ingress.
If you want to configure an external Application Load Balancer by using HTTP/2 with Cloud Run, see Use HTTP/2 behind a load balancer.
For information about troubleshooting problems with HTTP/2, see Troubleshooting issues with HTTP/2 to the backends.
For information about HTTP/2 limitations, see HTTP/2 limitations.
TLS support
By default, an HTTPS target proxy accepts only TLS 1.0, 1.1, 1.2, and 1.3 when terminating client SSL requests.
When the global external Application Load Balancer or the regional external Application Load Balancer uses HTTPS as the backend service protocol, it can negotiate TLS 1.2 or 1.3 to the backend. When the classic Application Load Balancer uses HTTPS as the backend service protocol, it can negotiate TLS 1.0, 1.1, 1.2, or 1.3 to the backend.
Mutual TLS support
Mutual TLS, or mTLS, is an industry standard protocol for mutual authentication between a client and a server. mTLS helps ensure that both the client and server authenticate each other by verifying that each holds a valid certificate issued by a trusted certificate authority (CA). Unlike standard TLS, where only the server is authenticated, mTLS requires both the client and server to present certificates, confirming the identities of both parties before communication is established.
All of the Application Load Balancers support mTLS. With mTLS, the load balancer requests that the client send a certificate to authenticate itself during the TLS handshake with the load balancer. You can configure a Certificate Manager trust store that the load balancer then uses to validate the client certificate's chain of trust.
For more information about mTLS, see Mutual TLS authentication.
TLS 1.3 early data support
TLS 1.3 early data is supported on the target HTTPS proxy of the following external Application Load Balancers for both HTTPS over TCP (HTTP/1.1, HTTP/2) and HTTP/3 over QUIC:
- Global external Application Load Balancers
- Classic Application Load Balancers
TLS 1.3 was defined in RFC 8446 and introduces the concept of early data, also known as zero-round-trip time (0-RTT) data, which can improve application performance for resumed connections by 30 to 50%.
With TLS 1.2, two round trips are required before data can be securely
transmitted. TLS 1.3 reduces this to one round trip (1-RTT) for new connections,
allowing clients to send application data immediately after the first server
response. Additionally, TLS 1.3 introduces the concept of early data for resumed sessions, enabling clients to send application data with the initial `ClientHello`, thereby reducing the effective round-trip time to zero (0-RTT).
TLS 1.3 early data allows the backend server to begin processing client data
before the handshake process with the client is complete, thereby reducing
latency; however, care must be taken to mitigate replay risks.
Because early data is sent before the handshake is complete, an attacker can attempt to capture and replay requests. To mitigate this, the backend server must carefully control early data usage, limiting its use to idempotent requests. HTTP methods that are intended to be idempotent but which might trigger nonidempotent changes (such as a GET request that modifies a database) must not accept early data. In such cases, the backend server must reject requests with the HTTP `Early-Data: 1` header by returning an HTTP `425 Too Early` status code.
Requests with early data have the HTTP `Early-Data` header set to a value of `1`, which indicates to the backend server that the request has been conveyed in TLS early data. It also indicates that the client understands the HTTP `425 Too Early` status code.
TLS early data (0-RTT) modes
You can configure TLS early data using one of the following modes on the target HTTPS proxy resource of the load balancer.
- `STRICT`. This enables TLS 1.3 early data for requests with safe HTTP methods (GET, HEAD, OPTIONS, TRACE) and HTTP requests that don't have query parameters. Requests that send early data containing nonidempotent HTTP methods (such as POST or PUT) or with query parameters are rejected with an HTTP `425` status code.
- `PERMISSIVE`. This enables TLS 1.3 early data for requests with safe HTTP methods (GET, HEAD, OPTIONS, TRACE). This mode doesn't deny requests that include query parameters. The application owner must ensure that early data is safe to use for each request path, particularly for endpoints where request replay might cause unintended side effects, such as logging or database updates triggered by GET requests.
- `DISABLED`. TLS 1.3 early data isn't advertised, and any (invalid) attempts to send early data are rejected. If your applications aren't equipped to handle early data requests safely, you must disable early data. TLS 1.3 early data is disabled by default.
- `UNRESTRICTED` (not recommended for most workloads). This enables TLS 1.3 early data for requests with any HTTP method, including nonidempotent methods such as POST. This mode doesn't enforce any other limitations. This mode can be valuable for gRPC use cases. However, we don't recommend this method unless you have evaluated your security posture and mitigated the risk of replay attacks using other mechanisms.
Configure TLS early data
To explicitly enable or disable TLS early data, do the following:
Console
- In the Trusted Cloud console, go to the Load balancing page.
- Select the load balancer for which you need to enable early data.
- Click Edit.
- Click Frontend configuration.
- Select the frontend IP address and port that you want to edit. To enable TLS early data, the protocol must be HTTPS.
- In the Early data (0-RTT) list, select a TLS early data mode.
- Click Done.
- To update the load balancer, click Update.
gcloud
Configure the TLS early data mode on the target HTTPS proxy of an Application Load Balancer.
```
gcloud compute target-https-proxies update TARGET_HTTPS_PROXY \
    --tls-early-data=TLS_EARLY_DATA_MODE
```
Replace the following:
- `TARGET_HTTPS_PROXY`: the target HTTPS proxy of your load balancer
- `TLS_EARLY_DATA_MODE`: `STRICT`, `PERMISSIVE`, `DISABLED`, or `UNRESTRICTED`
API
```
PATCH https://compute.googleapis.com/compute/v1/projects/{project}/global/targetHttpsProxies/TARGET_HTTPS_PROXY
{
    "tlsEarlyData": "TLS_EARLY_DATA_MODE",
    "fingerprint": "FINGERPRINT"
}
```
Replace the following:
- `TARGET_HTTPS_PROXY`: the target HTTPS proxy of your load balancer
- `TLS_EARLY_DATA_MODE`: `STRICT`, `PERMISSIVE`, `DISABLED`, or `UNRESTRICTED`
- `FINGERPRINT`: a Base64-encoded string. An up-to-date fingerprint must be provided in order to patch the target HTTPS proxy; otherwise, the request fails with an HTTP `412 Precondition Failed` status code.
After you have configured TLS early data, you can issue requests from an HTTP client that supports TLS early data. You can observe lower latency for resumed requests.
If a non-RFC-compliant client sends a request with a nonidempotent method or with query parameters, the request is denied. You see an HTTP `425 Early` status code in the load balancer logs and the following HTTP response:
```
HTTP/1.1 425 Too Early
Content-Type: text/html; charset=UTF-8
Referrer-Policy: no-referrer
Content-Length: 1558
Date: Thu, 03 Aug 2024 02:45:14 GMT
Connection: close

<!DOCTYPE html>
<html lang=en>
<title>Error 425 (Too Early)</title>
<p>The request was sent to the server too early, please retry. That's all we know.</p>
</html>
```
Limitations
- HTTPS load balancers don't send a `close_notify` closure alert when terminating SSL connections. That is, the load balancer closes the TCP connection instead of performing an SSL shutdown.
- HTTPS load balancers support only lowercase characters in domains in a common name (`CN`) attribute or a subject alternative name (`SAN`) attribute of the certificate. Certificates with uppercase characters in domains are returned only when set as the primary certificate in the target proxy.
- HTTPS load balancers don't use the Server Name Indication (SNI) extension when connecting to the backend, except for load balancers with Internet NEG backends. For more information, see Encryption from the load balancer to the backends.
- Trusted Cloud doesn't guarantee that an underlying TCP connection can remain open for the entirety of the value of the backend service timeout. Client systems must implement retry logic instead of relying on a TCP connection to be open for long periods of time.
- You can't create a regional external Application Load Balancer in Premium Tier using the Trusted Cloud console. Instead, use either the gcloud CLI or the API.
What's next
- To learn how to deploy a regional external Application Load Balancer, see Setting up a regional external Application Load Balancer with a Compute Engine backend.
- To learn how to configure advanced traffic management capabilities available with the regional external Application Load Balancer, see Traffic management overview for regional external Application Load Balancers.