Google Cloud Gke Recommender V1 Client - Class GenerateOptimizedManifestRequest (0.1.0)

Reference documentation and code samples for the Google Cloud Gke Recommender V1 Client class GenerateOptimizedManifestRequest.

Request message for GkeInferenceQuickstart.GenerateOptimizedManifest.

Generated from protobuf message google.cloud.gkerecommender.v1.GenerateOptimizedManifestRequest

Namespace

Google \ Cloud \ GkeRecommender \ V1

Methods

__construct

Constructor.

Parameters
Name Description
data array

Optional. Data for populating the Message object.

↳ model_server_info ModelServerInfo

Required. The model server configuration to generate the manifest for. Use GkeInferenceQuickstart.FetchProfiles to find valid configurations.

↳ accelerator_type string

Required. The accelerator type. Use GkeInferenceQuickstart.FetchProfiles to find valid accelerators for a given model_server_info.

↳ kubernetes_namespace string

Optional. The Kubernetes namespace to deploy the manifests in.

↳ performance_requirements PerformanceRequirements

Optional. The performance requirements to use for generating Horizontal Pod Autoscaler (HPA) resources. If provided, the manifest includes HPA resources that adjust the model server replica count to maintain the specified targets, such as normalized time per output token (NTPOT) or time to first token (TTFT), at P50 latency. Cost targets are not currently supported for HPA generation. If the specified targets are not achievable, the HPA manifest is not generated.

↳ storage_config StorageConfig

Optional. The storage configuration for the model. If not provided, the model is loaded from Hugging Face.
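
As a minimal sketch of the constructor, the request can be populated from an array keyed by the field names listed above. It assumes the client library is installed through Composer and autoloaded from vendor/autoload.php. The ModelServerInfo keys (model, model_server) and every value shown are illustrative assumptions, not values taken from this reference; real values should come from GkeInferenceQuickstart.FetchProfiles.

```php
<?php
// Assumption: the client library was installed with Composer.
require __DIR__ . '/vendor/autoload.php';

use Google\Cloud\GkeRecommender\V1\GenerateOptimizedManifestRequest;
use Google\Cloud\GkeRecommender\V1\ModelServerInfo;

// Assumption: ModelServerInfo accepts 'model' and 'model_server' keys.
// Check the ModelServerInfo class reference before relying on them.
$modelServerInfo = new ModelServerInfo([
    'model' => 'example-org/example-model', // placeholder model name
    'model_server' => 'vllm',               // placeholder model server
]);

$request = new GenerateOptimizedManifestRequest([
    'model_server_info' => $modelServerInfo,  // required
    'accelerator_type' => 'nvidia-l4',        // required, placeholder value
    'kubernetes_namespace' => 'inference',    // optional, placeholder value
]);

// Read the scalar fields back through the generated getters.
echo $request->getAcceleratorType(), PHP_EOL;
echo $request->getKubernetesNamespace(), PHP_EOL;
```

The optional performance_requirements and storage_config keys are left out here, so the corresponding getters return null.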

getModelServerInfo

Required. The model server configuration to generate the manifest for. Use GkeInferenceQuickstart.FetchProfiles to find valid configurations.

Returns
Type Description
ModelServerInfo|null

hasModelServerInfo

clearModelServerInfo

setModelServerInfo

Required. The model server configuration to generate the manifest for. Use GkeInferenceQuickstart.FetchProfiles to find valid configurations.

Parameter
Name Description
var ModelServerInfo

Returns
Type Description
$this
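
The get, has, clear, and set methods for a message-typed field work together. The sketch below shows that interplay for model_server_info, assuming a Composer install of the client library. The ModelServerInfo object is left empty on purpose, because only field presence is being demonstrated.

```php
<?php
// Assumption: the client library was installed with Composer.
require __DIR__ . '/vendor/autoload.php';

use Google\Cloud\GkeRecommender\V1\GenerateOptimizedManifestRequest;
use Google\Cloud\GkeRecommender\V1\ModelServerInfo;

// The setter returns $this, so it can be chained off the constructor.
$request = (new GenerateOptimizedManifestRequest())
    ->setModelServerInfo(new ModelServerInfo());

// The field is present once it has been set, even to an empty message.
var_dump($request->hasModelServerInfo());

// Clearing removes the field; the getter then returns null.
$request->clearModelServerInfo();
var_dump($request->hasModelServerInfo());
var_dump($request->getModelServerInfo());
```

Checking hasModelServerInfo before reading the field avoids a null dereference when the request was built without it.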

getAcceleratorType

Required. The accelerator type. Use GkeInferenceQuickstart.FetchProfiles to find valid accelerators for a given model_server_info.

Returns
Type Description
string

setAcceleratorType

Required. The accelerator type. Use GkeInferenceQuickstart.FetchProfiles to find valid accelerators for a given model_server_info.

Parameter
Name Description
var string

Returns
Type Description
$this

getKubernetesNamespace

Optional. The Kubernetes namespace to deploy the manifests in.

Returns
Type Description
string

setKubernetesNamespace

Optional. The Kubernetes namespace to deploy the manifests in.

Parameter
Name Description
var string

Returns
Type Description
$this

getPerformanceRequirements

Optional. The performance requirements to use for generating Horizontal Pod Autoscaler (HPA) resources. If provided, the manifest includes HPA resources that adjust the model server replica count to maintain the specified targets, such as normalized time per output token (NTPOT) or time to first token (TTFT), at P50 latency. Cost targets are not currently supported for HPA generation. If the specified targets are not achievable, the HPA manifest is not generated.

Returns
Type Description
PerformanceRequirements|null

hasPerformanceRequirements

clearPerformanceRequirements

setPerformanceRequirements

Optional. The performance requirements to use for generating Horizontal Pod Autoscaler (HPA) resources. If provided, the manifest includes HPA resources that adjust the model server replica count to maintain the specified targets, such as normalized time per output token (NTPOT) or time to first token (TTFT), at P50 latency. Cost targets are not currently supported for HPA generation. If the specified targets are not achievable, the HPA manifest is not generated.

Parameter
Name Description
var PerformanceRequirements

Returns
Type Description
$this
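
The following sketch attaches latency targets so that the generated manifest can include HPA resources. It assumes a Composer install of the client library. The PerformanceRequirements keys target_ntpot_milliseconds and target_ttft_milliseconds are assumptions that this page does not confirm, and the numeric targets are placeholders; verify both against the PerformanceRequirements class reference.

```php
<?php
// Assumption: the client library was installed with Composer.
require __DIR__ . '/vendor/autoload.php';

use Google\Cloud\GkeRecommender\V1\GenerateOptimizedManifestRequest;
use Google\Cloud\GkeRecommender\V1\PerformanceRequirements;

// Assumption: these key names exist on PerformanceRequirements.
// The constructor throws if a key does not match a real field.
$requirements = new PerformanceRequirements([
    'target_ntpot_milliseconds' => 50,  // placeholder NTPOT target
    'target_ttft_milliseconds' => 500,  // placeholder TTFT target
]);

$request = (new GenerateOptimizedManifestRequest())
    ->setAcceleratorType('nvidia-l4')   // placeholder value
    ->setPerformanceRequirements($requirements);

// Presence can be checked before the request is sent.
echo $request->hasPerformanceRequirements() ? 'HPA targets set' : 'no HPA targets', PHP_EOL;
```

Because the HPA manifest is omitted when the targets are not achievable, callers may want to inspect the response rather than assume that HPA resources are present.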

getStorageConfig

Optional. The storage configuration for the model. If not provided, the model is loaded from Hugging Face.

Returns
Type Description
StorageConfig|null

hasStorageConfig

clearStorageConfig

setStorageConfig

Optional. The storage configuration for the model. If not provided, the model is loaded from Hugging Face.

Parameter
Name Description
var StorageConfig

Returns
Type Description
$this