Reference documentation and code samples for the GKE Recommender V1 API class Google::Cloud::GkeRecommender::V1::StorageConfig.
Storage configuration for a model deployment.
Inherits
- Object
Extended By
- Google::Protobuf::MessageExts::ClassMethods
Includes
- Google::Protobuf::MessageExts
Methods
#model_bucket_uri
def model_bucket_uri() -> ::String
Returns
-
(::String) — Optional. The Google Cloud Storage bucket URI to load the model from. This
URI must point to the directory containing the model's config file (config.json) and model weights. A tuned GCSFuse setup can improve LLM Pod startup time by more than 7x. Expected format: gs://<bucket-name>/<path-to-model>.
#model_bucket_uri=
def model_bucket_uri=(value) -> ::String
Parameter
-
value (::String) — Optional. The Google Cloud Storage bucket URI to load the model from. This
URI must point to the directory containing the model's config file (config.json) and model weights. A tuned GCSFuse setup can improve LLM Pod startup time by more than 7x. Expected format: gs://<bucket-name>/<path-to-model>.
Returns
-
(::String) — Optional. The Google Cloud Storage bucket URI to load the model from. This
URI must point to the directory containing the model's config file (config.json) and model weights. A tuned GCSFuse setup can improve LLM Pod startup time by more than 7x. Expected format: gs://<bucket-name>/<path-to-model>.
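The expected format above can be checked before a value is assigned. The following is a minimal sketch, not part of the client library: the helper name gcs_model_uri? and the example bucket and path are hypothetical, and the pattern only checks the overall gs://<bucket-name>/<path-to-model> shape, not Cloud Storage bucket naming rules or whether the object exists.

```ruby
# Hypothetical helper (not provided by the gem): returns true when the value
# is a String with a gs:// scheme, a bucket name, and a non-empty path.
def gcs_model_uri?(uri)
  uri.is_a?(String) && uri.match?(%r{\Ags://[^/\s]+/\S+\z})
end

# Example values are placeholders, not real buckets.
puts gcs_model_uri?("gs://example-bucket/models/example-model")
puts gcs_model_uri?("gs://example-bucket")
puts gcs_model_uri?("https://example.com/model")
```

Only the first call matches the documented shape; the second has no path and the third uses a different scheme.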
#xla_cache_bucket_uri
def xla_cache_bucket_uri() -> ::String
Returns
-
(::String) — Optional. The URI for the GCS bucket containing the XLA compilation cache.
If using TPUs, the XLA cache will be written to the same path as model_bucket_uri. This can speed up vLLM model preparation for repeated deployments.
#xla_cache_bucket_uri=
def xla_cache_bucket_uri=(value) -> ::String
Parameter
-
value (::String) — Optional. The URI for the GCS bucket containing the XLA compilation cache.
If using TPUs, the XLA cache will be written to the same path as model_bucket_uri. This can speed up vLLM model preparation for repeated deployments.
Returns
-
(::String) — Optional. The URI for the GCS bucket containing the XLA compilation cache.
If using TPUs, the XLA cache will be written to the same path as model_bucket_uri. This can speed up vLLM model preparation for repeated deployments.
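Because both fields are optional, it can help to gather only the values that are actually set before building the message. The sketch below is an illustration under stated assumptions: storage_config_fields is a hypothetical helper, the URIs are placeholders, and the snippet runs without the client gem. The commented line shows how the resulting hash could be handed to the message constructor once the gem is loaded.

```ruby
# Hypothetical helper (not provided by the gem): collects the two optional
# StorageConfig fields into a hash, dropping any that are nil or empty.
def storage_config_fields(model_bucket_uri: nil, xla_cache_bucket_uri: nil)
  fields = {
    model_bucket_uri: model_bucket_uri,
    xla_cache_bucket_uri: xla_cache_bucket_uri
  }
  fields.reject { |_name, uri| uri.nil? || uri.empty? }
end

# Placeholder URIs for illustration only.
fields = storage_config_fields(
  model_bucket_uri: "gs://example-bucket/models/example-model",
  xla_cache_bucket_uri: "gs://example-bucket/xla-cache"
)

# With the client gem loaded, the hash could be passed to the constructor:
#   config = Google::Cloud::GkeRecommender::V1::StorageConfig.new(**fields)
puts fields.inspect
```

Leaving an unset field out of the hash keeps the message at its default for that field rather than assigning an empty placeholder explicitly.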