Google Cloud Gke Recommender V1 Client - Class GenerateOptimizedManifestRequest (0.2.0)

Reference documentation and code samples for the Google Cloud Gke Recommender V1 Client class GenerateOptimizedManifestRequest.

Request message for GkeInferenceQuickstart.GenerateOptimizedManifest.

Generated from protobuf message google.cloud.gkerecommender.v1.GenerateOptimizedManifestRequest

Namespace

Google \ Cloud \ GkeRecommender \ V1

Methods

__construct

Constructor.

Parameters
Name	Description
`data`	`array` Optional. Data for populating the Message object.
`↳ model_server_info`	`ModelServerInfo` Required. The model server configuration to generate the manifest for. Use GkeInferenceQuickstart.FetchProfiles to find valid configurations.
`↳ accelerator_type`	`string` Required. The accelerator type. Use GkeInferenceQuickstart.FetchProfiles to find valid accelerators for a given `model_server_info`.
`↳ kubernetes_namespace`	`string` Optional. The Kubernetes namespace to deploy the manifests in.
`↳ performance_requirements`	`PerformanceRequirements` Optional. The performance requirements to use for generating Horizontal Pod Autoscaler (HPA) resources. If provided, the manifest includes HPA resources that adjust the model server replica count to maintain the specified targets (e.g., NTPOT, TTFT) at P50 latency. Cost targets are not currently supported for HPA generation. If the specified targets are not achievable, the HPA manifest is not generated.
`↳ storage_config`	`StorageConfig` Optional. The storage configuration for the model. If not provided, the model is loaded from Hugging Face.
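
As a minimal sketch of how the constructor's `data` array maps onto the fields listed above, the following builds a request in a single call. It assumes the Composer package that provides this namespace is installed and autoloaded. The `model` and `model_server` keys on `ModelServerInfo` and every value shown are illustrative assumptions, not confirmed by this page; real values should come from GkeInferenceQuickstart.FetchProfiles.

```php
<?php
// Assumption: Composer's autoloader is available at this path.
require 'vendor/autoload.php';

use Google\Cloud\GkeRecommender\V1\GenerateOptimizedManifestRequest;
use Google\Cloud\GkeRecommender\V1\ModelServerInfo;

// Placeholder values (assumptions). The 'model' and 'model_server' keys are
// assumed field names; check the ModelServerInfo reference before relying on them.
$modelServerInfo = new ModelServerInfo([
    'model' => 'example-org/example-model',
    'model_server' => 'vllm',
]);

// Keys in the data array match the field names in the parameter table.
$request = new GenerateOptimizedManifestRequest([
    'model_server_info' => $modelServerInfo,
    'accelerator_type' => 'nvidia-l4',
    'kubernetes_namespace' => 'inference',
]);

echo $request->getAcceleratorType(), PHP_EOL;
```

Fields omitted from the array, such as `performance_requirements` and `storage_config` here, are simply left unset.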

getModelServerInfo

Required. The model server configuration to generate the manifest for. Use GkeInferenceQuickstart.FetchProfiles to find valid configurations.

Returns
Type	Description
`ModelServerInfo|null`

hasModelServerInfo

clearModelServerInfo
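
The `has` and `clear` methods carry no description on this page. The sketch below shows how they are expected to behave for a message-typed field in generated PHP protobuf classes: `has` reports whether the field is set, and `clear` unsets it so that the getter returns `null` again. It assumes the library is installed via Composer, and it is an illustration rather than a statement of the library's contract.

```php
<?php
// Assumption: Composer's autoloader is available at this path.
require 'vendor/autoload.php';

use Google\Cloud\GkeRecommender\V1\GenerateOptimizedManifestRequest;
use Google\Cloud\GkeRecommender\V1\ModelServerInfo;

$request = new GenerateOptimizedManifestRequest();

// Nothing has been assigned yet, so the field should report as unset.
$before = $request->hasModelServerInfo();

// Assigning any ModelServerInfo instance, even an empty one, sets the field.
$request->setModelServerInfo(new ModelServerInfo());
$afterSet = $request->hasModelServerInfo();

// Clearing should return the field to its unset state.
$request->clearModelServerInfo();
$afterClear = $request->getModelServerInfo();

var_dump($before, $afterSet, $afterClear);
```

The same pattern applies to the other message-typed fields, `performance_requirements` and `storage_config`, which is why their getters are documented as nullable.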

setModelServerInfo

Required. The model server configuration to generate the manifest for. Use GkeInferenceQuickstart.FetchProfiles to find valid configurations.

Parameter
Name	Description
`var`	`ModelServerInfo`

Returns
Type	Description
`$this`

getAcceleratorType

Required. The accelerator type. Use GkeInferenceQuickstart.FetchProfiles to find valid accelerators for a given `model_server_info`.

Returns
Type	Description
`string`

setAcceleratorType

Required. The accelerator type. Use GkeInferenceQuickstart.FetchProfiles to find valid accelerators for a given `model_server_info`.

Parameter
Name	Description
`var`	`string`

Returns
Type	Description
`$this`

getKubernetesNamespace

Optional. The Kubernetes namespace to deploy the manifests in.

Returns
Type	Description
`string`

setKubernetesNamespace

Optional. The Kubernetes namespace to deploy the manifests in.

Parameter
Name	Description
`var`	`string`

Returns
Type	Description
`$this`

getPerformanceRequirements

Optional. The performance requirements to use for generating Horizontal Pod Autoscaler (HPA) resources. If provided, the manifest includes HPA resources that adjust the model server replica count to maintain the specified targets (e.g., NTPOT, TTFT) at P50 latency. Cost targets are not currently supported for HPA generation. If the specified targets are not achievable, the HPA manifest is not generated.

Returns
Type	Description
`PerformanceRequirements|null`

hasPerformanceRequirements

clearPerformanceRequirements

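setPerformanceRequirements

Optional. The performance requirements to use for generating Horizontal Pod Autoscaler (HPA) resources. If provided, the manifest includes HPA resources that adjust the model server replica count to maintain the specified targets (e.g., NTPOT, TTFT) at P50 latency. Cost targets are not currently supported for HPA generation. If the specified targets are not achievable, the HPA manifest is not generated.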

Parameter
Name	Description
`var`	`PerformanceRequirements`

Returns
Type	Description
`$this`

getStorageConfig

Optional. The storage configuration for the model. If not provided, the model is loaded from Hugging Face.

Returns
Type	Description
`StorageConfig|null`

hasStorageConfig

clearStorageConfig

setStorageConfig

Optional. The storage configuration for the model. If not provided, the model is loaded from Hugging Face.

Parameter
Name	Description
`var`	`StorageConfig`

Returns
Type	Description
`$this`
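
Because every setter documented above returns `$this`, calls can be chained. The following is a rough sketch under the same assumptions as before: the library is installed via Composer, and the accelerator and namespace values are placeholders rather than values known to be valid. `serializeToJsonString()` comes from the protobuf PHP runtime's base message class and is used here only to inspect the result.

```php
<?php
// Assumption: Composer's autoloader is available at this path.
require 'vendor/autoload.php';

use Google\Cloud\GkeRecommender\V1\GenerateOptimizedManifestRequest;
use Google\Cloud\GkeRecommender\V1\ModelServerInfo;

// Each setter returns the request itself, so the calls compose fluently.
$request = (new GenerateOptimizedManifestRequest())
    ->setModelServerInfo(new ModelServerInfo()) // left empty purely for illustration
    ->setAcceleratorType('nvidia-l4')           // placeholder value (assumption)
    ->setKubernetesNamespace('inference');      // placeholder value (assumption)

// Print the populated fields as JSON for a quick visual check.
echo $request->serializeToJsonString(), PHP_EOL;
```

In practice the `ModelServerInfo` would be populated with a configuration returned by GkeInferenceQuickstart.FetchProfiles before the request is sent.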