gcloud container ai profiles

INFORMATION
gcloud container ai profiles is supported in your configured universe domain; however, some of the values used in the help text may not be available there. Command examples may not work as-is and may require changes before execution.
NAME
gcloud container ai profiles - quickstart engine for GKE AI workloads
SYNOPSIS
gcloud container ai profiles GROUP | COMMAND [GCLOUD_WIDE_FLAG]
DESCRIPTION
The GKE Inference Quickstart simplifies deploying AI inference workloads on Google Kubernetes Engine (GKE). It provides tailored profiles based on Google's internal benchmarks. You supply inputs such as your preferred open-source model (e.g., Llama, Gemma, or Mistral) and your application's performance target; from these, the quickstart generates accelerator choices with performance metrics, plus detailed, ready-to-deploy profiles for compute, load balancing, and autoscaling. The profiles are delivered as standard Kubernetes YAML manifests, which you can deploy as-is or modify.
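A typical end-to-end session might look like the following sketch. The model name, model server, and accelerator values are illustrative placeholders, and flag names may differ between gcloud releases; run each command with --help to confirm what your installed version accepts.

```shell
# 1. Discover which open-source models have benchmarked profiles.
gcloud container ai profiles models list

# 2. List compatible accelerator profiles for a chosen model
#    (model name below is a placeholder).
gcloud container ai profiles list \
    --model=meta-llama/Llama-3.1-8B-Instruct

# 3. Generate ready-to-deploy Kubernetes manifests for a selected
#    profile (flag names are assumptions; verify with --help).
gcloud container ai profiles manifests create \
    --model=meta-llama/Llama-3.1-8B-Instruct \
    --model-server=vllm \
    --accelerator-type=nvidia-l4 > manifests.yaml

# 4. Review and apply the manifests to your GKE cluster.
kubectl apply -f manifests.yaml
```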
GCLOUD WIDE FLAGS
These flags are available to all commands: --help.

Run $ gcloud help for details.

GROUPS
GROUP is one of the following:
benchmarks
Manage benchmarks for GKE Inference Quickstart.
manifests
Generate optimized Kubernetes manifests.
model-server-versions
Manage supported model server versions for GKE Inference Quickstart.
model-servers
Manage supported model servers for GKE Inference Quickstart.
models
Manage supported models for GKE Inference Quickstart.
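The discovery groups above are designed to be chained: each group narrows the options for the next. A hedged sketch, assuming the list subcommands accept the filter flags shown (the model and server names are placeholders; confirm flags with --help):

```shell
# Which model servers are supported for a given model?
gcloud container ai profiles model-servers list \
    --model=google/gemma-2-9b-it

# Which versions of that model server are supported?
gcloud container ai profiles model-server-versions list \
    --model=google/gemma-2-9b-it \
    --model-server=vllm
```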
COMMANDS
COMMAND is one of the following:
list
List compatible accelerator profiles.
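The list command can also take a performance target, so the returned accelerator profiles are ranked against your latency goal. The target flag below is an assumption based on the quickstart's performance-target inputs and may be named differently in your release:

```shell
# List accelerator profiles for a model, constrained by a latency
# target (normalized time per output token, in milliseconds).
# Both the model name and the target flag are illustrative.
gcloud container ai profiles list \
    --model=meta-llama/Llama-3.1-8B-Instruct \
    --target-ntpot-milliseconds=200
```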
NOTES
This variant is also available:
gcloud alpha container ai profiles