Reference documentation and code samples for the GKE Recommender V1 API class Google::Cloud::GkeRecommender::V1::GenerateOptimizedManifestRequest.
Request message for GkeInferenceQuickstart.GenerateOptimizedManifest.
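For orientation, here is a minimal sketch of building this request and sending it with the generated GkeInferenceQuickstart client. It assumes the standard gapic conventions for this gem (the require path and a generate_optimized_manifest client method); all field values are illustrative, and valid model_server_info / accelerator_type combinations must come from GkeInferenceQuickstart.FetchProfiles.

require "google/cloud/gke_recommender/v1"

# Illustrative values only; discover valid combinations via FetchProfiles.
request = Google::Cloud::GkeRecommender::V1::GenerateOptimizedManifestRequest.new(
  model_server_info:    Google::Cloud::GkeRecommender::V1::ModelServerInfo.new, # populate from a fetched profile
  accelerator_type:     "nvidia-l4",   # example accelerator type string
  kubernetes_namespace: "inference"    # optional
)

client   = Google::Cloud::GkeRecommender::V1::GkeInferenceQuickstart::Client.new
response = client.generate_optimized_manifest request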
Inherits
- Object
Extended By
- Google::Protobuf::MessageExts::ClassMethods
Includes
- Google::Protobuf::MessageExts
Methods
#accelerator_type
def accelerator_type() -> ::String
Returns
- (::String) — Required. The accelerator type. Use GkeInferenceQuickstart.FetchProfiles to find valid accelerators for a given model_server_info.
#accelerator_type=
def accelerator_type=(value) -> ::String
Parameter
- value (::String) — Required. The accelerator type. Use GkeInferenceQuickstart.FetchProfiles to find valid accelerators for a given model_server_info.
Returns
- (::String) — Required. The accelerator type. Use GkeInferenceQuickstart.FetchProfiles to find valid accelerators for a given model_server_info.
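As a hedged sketch of the discovery step referenced above: the fetch_profiles call name follows the RPC, but the request and response field names used below (model_server_info on the request, profiles and accelerator_type on the response) are assumptions for illustration; consult the FetchProfiles reference for the actual shapes.

client = Google::Cloud::GkeRecommender::V1::GkeInferenceQuickstart::Client.new
model_server_info = Google::Cloud::GkeRecommender::V1::ModelServerInfo.new # the configuration you plan to deploy

# Assumed request/response shapes; verify against the FetchProfiles reference.
response = client.fetch_profiles(
  Google::Cloud::GkeRecommender::V1::FetchProfilesRequest.new(
    model_server_info: model_server_info
  )
)
response.profiles.each { |profile| puts profile.accelerator_type } # list valid accelerators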
#kubernetes_namespace
def kubernetes_namespace() -> ::String
Returns
- (::String) — Optional. The Kubernetes namespace to deploy the manifests in.
#kubernetes_namespace=
def kubernetes_namespace=(value) -> ::String
Parameter
- value (::String) — Optional. The Kubernetes namespace to deploy the manifests in.
Returns
- (::String) — Optional. The Kubernetes namespace to deploy the manifests in.
#model_server_info
def model_server_info() -> ::Google::Cloud::GkeRecommender::V1::ModelServerInfo
Returns
- (::Google::Cloud::GkeRecommender::V1::ModelServerInfo) — Required. The model server configuration to generate the manifest for. Use GkeInferenceQuickstart.FetchProfiles to find valid configurations.
#model_server_info=
def model_server_info=(value) -> ::Google::Cloud::GkeRecommender::V1::ModelServerInfo
Parameter
- value (::Google::Cloud::GkeRecommender::V1::ModelServerInfo) — Required. The model server configuration to generate the manifest for. Use GkeInferenceQuickstart.FetchProfiles to find valid configurations.
Returns
- (::Google::Cloud::GkeRecommender::V1::ModelServerInfo) — Required. The model server configuration to generate the manifest for. Use GkeInferenceQuickstart.FetchProfiles to find valid configurations.
#performance_requirements
def performance_requirements() -> ::Google::Cloud::GkeRecommender::V1::PerformanceRequirements
Returns
- (::Google::Cloud::GkeRecommender::V1::PerformanceRequirements) — Optional. The performance requirements to use for generating Horizontal Pod Autoscaler (HPA) resources. If provided, the manifest includes HPA resources to adjust the model server replica count to maintain the specified targets (e.g., NTPOT, TTFT) at a P50 latency. Cost targets are not currently supported for HPA generation. If the specified targets are not achievable, the HPA manifest will not be generated.
#performance_requirements=
def performance_requirements=(value) -> ::Google::Cloud::GkeRecommender::V1::PerformanceRequirements
Parameter
- value (::Google::Cloud::GkeRecommender::V1::PerformanceRequirements) — Optional. The performance requirements to use for generating Horizontal Pod Autoscaler (HPA) resources. If provided, the manifest includes HPA resources to adjust the model server replica count to maintain the specified targets (e.g., NTPOT, TTFT) at a P50 latency. Cost targets are not currently supported for HPA generation. If the specified targets are not achievable, the HPA manifest will not be generated.
Returns
- (::Google::Cloud::GkeRecommender::V1::PerformanceRequirements) — Optional. The performance requirements to use for generating Horizontal Pod Autoscaler (HPA) resources. If provided, the manifest includes HPA resources to adjust the model server replica count to maintain the specified targets (e.g., NTPOT, TTFT) at a P50 latency. Cost targets are not currently supported for HPA generation. If the specified targets are not achievable, the HPA manifest will not be generated.
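For context, NTPOT is normalized time per output token and TTFT is time to first token. A hedged sketch of attaching latency targets follows; the target field names on PerformanceRequirements are hypothetical here, so check that message's reference for the real schema.

# Hypothetical field names; see the PerformanceRequirements reference.
request.performance_requirements =
  Google::Cloud::GkeRecommender::V1::PerformanceRequirements.new(
    target_ntpot_milliseconds: 200, # hypothetical: P50 NTPOT target
    target_ttft_milliseconds:  500  # hypothetical: P50 TTFT target
  )
# If the targets are achievable, the generated manifest includes HPA resources.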
#storage_config
def storage_config() -> ::Google::Cloud::GkeRecommender::V1::StorageConfig
Returns
- (::Google::Cloud::GkeRecommender::V1::StorageConfig) — Optional. The storage configuration for the model. If not provided, the model is loaded from Hugging Face.
#storage_config=
def storage_config=(value) -> ::Google::Cloud::GkeRecommender::V1::StorageConfig
Parameter
- value (::Google::Cloud::GkeRecommender::V1::StorageConfig) — Optional. The storage configuration for the model. If not provided, the model is loaded from Hugging Face.
Returns
- (::Google::Cloud::GkeRecommender::V1::StorageConfig) — Optional. The storage configuration for the model. If not provided, the model is loaded from Hugging Face.
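A hedged sketch of pointing the manifest at pre-staged model weights instead of Hugging Face; the model_bucket_uri field name and the bucket path are hypothetical, so consult the StorageConfig reference for the actual fields.

# Hypothetical field name and URI; see the StorageConfig reference.
request.storage_config = Google::Cloud::GkeRecommender::V1::StorageConfig.new(
  model_bucket_uri: "gs://example-bucket/models/my-model"
)
# Omitting storage_config means the model is loaded from Hugging Face at startup.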