Reference documentation and code samples for the GKE Recommender V1 API class Google::Cloud::GkeRecommender::V1::GenerateOptimizedManifestRequest.
Request message for GkeInferenceQuickstart.GenerateOptimizedManifest.
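For orientation, here is a minimal sketch of building this request and sending it with the generated GkeInferenceQuickstart client. It assumes the standard gapic conventions for this gem (the require path and a generate_optimized_manifest client method); all field values are illustrative, and valid model_server_info / accelerator_type combinations must come from GkeInferenceQuickstart.FetchProfiles.

require "google/cloud/gke_recommender/v1"

# Illustrative values only; discover valid combinations via FetchProfiles.
request = Google::Cloud::GkeRecommender::V1::GenerateOptimizedManifestRequest.new(
  model_server_info:    Google::Cloud::GkeRecommender::V1::ModelServerInfo.new, # populate from a fetched profile
  accelerator_type:     "nvidia-l4",   # example accelerator type string
  kubernetes_namespace: "inference"    # optional
)

client   = Google::Cloud::GkeRecommender::V1::GkeInferenceQuickstart::Client.new
response = client.generate_optimized_manifest request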
Inherits
- Object
Extended By
- Google::Protobuf::MessageExts::ClassMethods
Includes
- Google::Protobuf::MessageExts
Methods
#accelerator_type
def accelerator_type() -> ::String
Returns
- (::String) — Required. The accelerator type. Use GkeInferenceQuickstart.FetchProfiles to find valid accelerators for a given model_server_info.
#accelerator_type=
def accelerator_type=(value) -> ::String
Parameter
- value (::String) — Required. The accelerator type. Use GkeInferenceQuickstart.FetchProfiles to find valid accelerators for a given model_server_info.
Returns
- (::String) — Required. The accelerator type. Use GkeInferenceQuickstart.FetchProfiles to find valid accelerators for a given model_server_info.
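As a hedged sketch of the discovery step referenced above: the fetch_profiles call name follows the RPC, but the request and response field names used below (model_server_info on the request, profiles and accelerator_type on the response) are assumptions for illustration; consult the FetchProfiles reference for the actual shapes.

client = Google::Cloud::GkeRecommender::V1::GkeInferenceQuickstart::Client.new
model_server_info = Google::Cloud::GkeRecommender::V1::ModelServerInfo.new # the configuration you plan to deploy

# Assumed request/response shapes; verify against the FetchProfiles reference.
response = client.fetch_profiles(
  Google::Cloud::GkeRecommender::V1::FetchProfilesRequest.new(
    model_server_info: model_server_info
  )
)
response.profiles.each { |profile| puts profile.accelerator_type } # list valid accelerators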
#kubernetes_namespace
def kubernetes_namespace() -> ::String
Returns
- (::String) — Optional. The Kubernetes namespace to deploy the manifests in.
#kubernetes_namespace=
def kubernetes_namespace=(value) -> ::String
Parameter
- value (::String) — Optional. The Kubernetes namespace to deploy the manifests in.
Returns
- (::String) — Optional. The Kubernetes namespace to deploy the manifests in.
#model_server_info
def model_server_info() -> ::Google::Cloud::GkeRecommender::V1::ModelServerInfo
Returns
- (::Google::Cloud::GkeRecommender::V1::ModelServerInfo) — Required. The model server configuration to generate the manifest for. Use GkeInferenceQuickstart.FetchProfiles to find valid configurations.
#model_server_info=
def model_server_info=(value) -> ::Google::Cloud::GkeRecommender::V1::ModelServerInfo
Parameter
- value (::Google::Cloud::GkeRecommender::V1::ModelServerInfo) — Required. The model server configuration to generate the manifest for. Use GkeInferenceQuickstart.FetchProfiles to find valid configurations.
Returns
- (::Google::Cloud::GkeRecommender::V1::ModelServerInfo) — Required. The model server configuration to generate the manifest for. Use GkeInferenceQuickstart.FetchProfiles to find valid configurations.
#performance_requirements
def performance_requirements() -> ::Google::Cloud::GkeRecommender::V1::PerformanceRequirements
Returns
- (::Google::Cloud::GkeRecommender::V1::PerformanceRequirements) — Optional. The performance requirements to use for generating Horizontal Pod Autoscaler (HPA) resources. If provided, the manifest includes HPA resources to adjust the model server replica count to maintain the specified targets (e.g., NTPOT, TTFT) at a P50 latency. Cost targets are not currently supported for HPA generation. If the specified targets are not achievable, the HPA manifest will not be generated.
#performance_requirements=
def performance_requirements=(value) -> ::Google::Cloud::GkeRecommender::V1::PerformanceRequirements
Parameter
- value (::Google::Cloud::GkeRecommender::V1::PerformanceRequirements) — Optional. The performance requirements to use for generating Horizontal Pod Autoscaler (HPA) resources. If provided, the manifest includes HPA resources to adjust the model server replica count to maintain the specified targets (e.g., NTPOT, TTFT) at a P50 latency. Cost targets are not currently supported for HPA generation. If the specified targets are not achievable, the HPA manifest will not be generated.
Returns
- (::Google::Cloud::GkeRecommender::V1::PerformanceRequirements) — Optional. The performance requirements to use for generating Horizontal Pod Autoscaler (HPA) resources. If provided, the manifest includes HPA resources to adjust the model server replica count to maintain the specified targets (e.g., NTPOT, TTFT) at a P50 latency. Cost targets are not currently supported for HPA generation. If the specified targets are not achievable, the HPA manifest will not be generated.
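For context, NTPOT is normalized time per output token and TTFT is time to first token. A hedged sketch of attaching latency targets follows; the target field names on PerformanceRequirements are hypothetical here, so check that message's reference for the real schema.

# Hypothetical field names; see the PerformanceRequirements reference.
request.performance_requirements =
  Google::Cloud::GkeRecommender::V1::PerformanceRequirements.new(
    target_ntpot_milliseconds: 200, # hypothetical: P50 NTPOT target
    target_ttft_milliseconds:  500  # hypothetical: P50 TTFT target
  )
# If the targets are achievable, the generated manifest includes HPA resources.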
#storage_config
def storage_config() -> ::Google::Cloud::GkeRecommender::V1::StorageConfig
Returns
- (::Google::Cloud::GkeRecommender::V1::StorageConfig) — Optional. The storage configuration for the model. If not provided, the model is loaded from Hugging Face.
#storage_config=
def storage_config=(value) -> ::Google::Cloud::GkeRecommender::V1::StorageConfig
Parameter
- value (::Google::Cloud::GkeRecommender::V1::StorageConfig) — Optional. The storage configuration for the model. If not provided, the model is loaded from Hugging Face.
Returns
- (::Google::Cloud::GkeRecommender::V1::StorageConfig) — Optional. The storage configuration for the model. If not provided, the model is loaded from Hugging Face.
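A hedged sketch of pointing the manifest at pre-staged model weights instead of Hugging Face; the model_bucket_uri field name and the bucket path are hypothetical, so consult the StorageConfig reference for the actual fields.

# Hypothetical field name and URI; see the StorageConfig reference.
request.storage_config = Google::Cloud::GkeRecommender::V1::StorageConfig.new(
  model_bucket_uri: "gs://example-bucket/models/my-model"
)
# Omitting storage_config means the model is loaded from Hugging Face at startup.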