Some or all of the information on this page might not apply to Cloud de Confiance by S3NS. See Differences from Google Cloud for more details.

GKE Recommender V1 API - Class Google::Cloud::GkeRecommender::V1::StorageConfig (v0.1.1)

Reference documentation and code samples for the GKE Recommender V1 API class Google::Cloud::GkeRecommender::V1::StorageConfig.

Storage configuration for a model deployment.

Inherits

def model_bucket_uri() -> ::String

Returns

(::String) — Optional. The Google Cloud Storage bucket URI to load the model from. This URI must point to the directory containing the model's config file (config.json) and model weights. A tuned GCSFuse setup can improve LLM Pod startup time by more than 7x. Expected format: gs://<bucket-name>/<path-to-model>.

def model_bucket_uri=(value) -> ::String

Parameter

value (::String) — Optional. The Google Cloud Storage bucket URI to load the model from. This URI must point to the directory containing the model's config file (config.json) and model weights. A tuned GCSFuse setup can improve LLM Pod startup time by more than 7x. Expected format: gs://<bucket-name>/<path-to-model>.

Returns

(::String) — Optional. The Google Cloud Storage bucket URI to load the model from. This URI must point to the directory containing the model's config file (config.json) and model weights. A tuned GCSFuse setup can improve LLM Pod startup time by more than 7x. Expected format: gs://<bucket-name>/<path-to-model>.

def xla_cache_bucket_uri() -> ::String

Returns

(::String) — Optional. The URI for the GCS bucket containing the XLA compilation cache. If using TPUs, the XLA cache will be written to the same path as model_bucket_uri. This can speed up vLLM model preparation for repeated deployments.

def xla_cache_bucket_uri=(value) -> ::String

Parameter

value (::String) — Optional. The URI for the GCS bucket containing the XLA compilation cache. If using TPUs, the XLA cache will be written to the same path as model_bucket_uri. This can speed up vLLM model preparation for repeated deployments.

Returns

(::String) — Optional. The URI for the GCS bucket containing the XLA compilation cache. If using TPUs, the XLA cache will be written to the same path as model_bucket_uri. This can speed up vLLM model preparation for repeated deployments.