Reference documentation and code samples for the Google Cloud Gke Recommender V1 Client class GenerateOptimizedManifestRequest.
Request message for GkeInferenceQuickstart.GenerateOptimizedManifest.
Generated from protobuf message google.cloud.gkerecommender.v1.GenerateOptimizedManifestRequest
Namespace
Google \ Cloud \ GkeRecommender \ V1
Methods
__construct
Constructor.
| Parameters | |
|---|---|
| Name | Description |
| data | array: Optional. Data for populating the Message object. |
| ↳ model_server_info | ModelServerInfo: Required. The model server configuration to generate the manifest for. Use GkeInferenceQuickstart.FetchProfiles to find valid configurations. |
| ↳ accelerator_type | string: Required. The accelerator type. Use GkeInferenceQuickstart.FetchProfiles to find valid accelerators for a given model_server_info. |
| ↳ kubernetes_namespace | string: Optional. The Kubernetes namespace to deploy the manifests in. |
| ↳ performance_requirements | PerformanceRequirements: Optional. The performance requirements to use for generating Horizontal Pod Autoscaler (HPA) resources. If provided, the manifest includes HPA resources to adjust the model server replica count to maintain the specified targets (e.g., NTPOT, TTFT) at a P50 latency. Cost targets are not currently supported for HPA generation. If the specified targets are not achievable, the HPA manifest will not be generated. |
| ↳ storage_config | StorageConfig: Optional. The storage configuration for the model. If not provided, the model is loaded from Hugging Face. |
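As a minimal sketch of the constructor, the request can be populated from a data array keyed by the field names in the table above. The sketch assumes the client library is installed through Composer and autoloaded from vendor/autoload.php, and that ModelServerInfo lives in the same Google\Cloud\GkeRecommender\V1 namespace. The accelerator and namespace strings are hypothetical placeholders; real values should come from GkeInferenceQuickstart.FetchProfiles.

```php
<?php
// Assumption: dependencies were installed with Composer.
require 'vendor/autoload.php';

use Google\Cloud\GkeRecommender\V1\GenerateOptimizedManifestRequest;
use Google\Cloud\GkeRecommender\V1\ModelServerInfo;

// Left empty on purpose: the fields of ModelServerInfo are not documented
// on this page. Populate it with a configuration returned by FetchProfiles.
$modelServerInfo = new ModelServerInfo();

// Keys match the "↳" entries of the constructor's data parameter.
$request = new GenerateOptimizedManifestRequest([
    'model_server_info'    => $modelServerInfo,
    'accelerator_type'     => 'nvidia-l4',  // hypothetical value
    'kubernetes_namespace' => 'inference',  // hypothetical value
]);

echo $request->getAcceleratorType(), PHP_EOL;
echo $request->getKubernetesNamespace(), PHP_EOL;
```

The optional performance_requirements and storage_config keys are omitted here; they accept PerformanceRequirements and StorageConfig objects in the same way.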
getModelServerInfo
Required. The model server configuration to generate the manifest for. Use GkeInferenceQuickstart.FetchProfiles to find valid configurations.
| Returns | |
|---|---|
| Type | Description |
| ModelServerInfo\|null | |
hasModelServerInfo
clearModelServerInfo
setModelServerInfo
Required. The model server configuration to generate the manifest for. Use GkeInferenceQuickstart.FetchProfiles to find valid configurations.
| Parameter | |
|---|---|
| Name | Description |
| var | ModelServerInfo |

| Returns | |
|---|---|
| Type | Description |
| $this | |
getAcceleratorType
Required. The accelerator type. Use GkeInferenceQuickstart.FetchProfiles to find valid accelerators for a given model_server_info.
| Returns | |
|---|---|
| Type | Description |
| string | |
setAcceleratorType
Required. The accelerator type. Use GkeInferenceQuickstart.FetchProfiles to find valid accelerators for a given model_server_info.
| Parameter | |
|---|---|
| Name | Description |
| var | string |

| Returns | |
|---|---|
| Type | Description |
| $this | |
getKubernetesNamespace
Optional. The Kubernetes namespace to deploy the manifests in.
| Returns | |
|---|---|
| Type | Description |
| string | |
setKubernetesNamespace
Optional. The Kubernetes namespace to deploy the manifests in.
| Parameter | |
|---|---|
| Name | Description |
| var | string |

| Returns | |
|---|---|
| Type | Description |
| $this | |
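Because each setter returns $this, the scalar and message setters can be chained. The following is a sketch under the same assumptions as any Composer-based setup (vendor/autoload.php present, ModelServerInfo in the same namespace); the string values are hypothetical and would normally be taken from GkeInferenceQuickstart.FetchProfiles.

```php
<?php
// Assumption: dependencies were installed with Composer.
require 'vendor/autoload.php';

use Google\Cloud\GkeRecommender\V1\GenerateOptimizedManifestRequest;
use Google\Cloud\GkeRecommender\V1\ModelServerInfo;

// Start from an empty message and fill it through the fluent setters.
$request = (new GenerateOptimizedManifestRequest())
    ->setModelServerInfo(new ModelServerInfo()) // populate from FetchProfiles
    ->setAcceleratorType('nvidia-l4')           // hypothetical value
    ->setKubernetesNamespace('inference');      // hypothetical value

// Each getter returns what the matching setter stored.
echo $request->getAcceleratorType(), PHP_EOL;
echo $request->getKubernetesNamespace(), PHP_EOL;
```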
getPerformanceRequirements
Optional. The performance requirements to use for generating Horizontal Pod Autoscaler (HPA) resources. If provided, the manifest includes HPA resources to adjust the model server replica count to maintain the specified targets (e.g., NTPOT, TTFT) at a P50 latency. Cost targets are not currently supported for HPA generation. If the specified targets are not achievable, the HPA manifest will not be generated.
| Returns | |
|---|---|
| Type | Description |
| PerformanceRequirements\|null | |
hasPerformanceRequirements
clearPerformanceRequirements
setPerformanceRequirements
Optional. The performance requirements to use for generating Horizontal Pod Autoscaler (HPA) resources. If provided, the manifest includes HPA resources to adjust the model server replica count to maintain the specified targets (e.g., NTPOT, TTFT) at a P50 latency. Cost targets are not currently supported for HPA generation. If the specified targets are not achievable, the HPA manifest will not be generated.
| Parameter | |
|---|---|
| Name | Description |
| var | PerformanceRequirements |

| Returns | |
|---|---|
| Type | Description |
| $this | |
getStorageConfig
Optional. The storage configuration for the model. If not provided, the model is loaded from Hugging Face.
| Returns | |
|---|---|
| Type | Description |
| StorageConfig\|null | |
hasStorageConfig
clearStorageConfig
setStorageConfig
Optional. The storage configuration for the model. If not provided, the model is loaded from Hugging Face.
| Parameter | |
|---|---|
| Name | Description |
| var | StorageConfig |

| Returns | |
|---|---|
| Type | Description |
| $this | |
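The has and clear methods listed for the message-typed fields carry no description on this page. The sketch below assumes they follow the usual behaviour of generated protobuf classes in PHP: has reports whether the field is set, and clear unsets it so that the getter returns null again. StorageConfig is left empty because its own fields are not covered here.

```php
<?php
// Assumption: dependencies were installed with Composer.
require 'vendor/autoload.php';

use Google\Cloud\GkeRecommender\V1\GenerateOptimizedManifestRequest;
use Google\Cloud\GkeRecommender\V1\StorageConfig;

$request = new GenerateOptimizedManifestRequest();

// Nothing has been set yet, so the getter is expected to return null.
$before = $request->getStorageConfig();

// Attach an (empty) StorageConfig and check that the field is now present.
$request->setStorageConfig(new StorageConfig());
$isSet = $request->hasStorageConfig();

// Remove it again; per this page, the model is then loaded from Hugging Face.
$request->clearStorageConfig();
$after = $request->getStorageConfig();

var_dump($before === null, $isSet, $after === null);
```

The same pattern should apply to the model_server_info and performance_requirements fields through their own has and clear methods.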