GKE Recommender V1 API - Class Google::Cloud::GkeRecommender::V1::PerformanceRange (v0.1.1)

Reference documentation and code samples for the GKE Recommender V1 API class Google::Cloud::GkeRecommender::V1::PerformanceRange.

Performance range for a model deployment.

Inherits

Object

Extended By

Google::Protobuf::MessageExts::ClassMethods

Includes

Google::Protobuf::MessageExts

Methods

#ntpot_range

def ntpot_range() -> ::Google::Cloud::GkeRecommender::V1::MillisecondRange

Returns

(::Google::Cloud::GkeRecommender::V1::MillisecondRange) — Output only. The range of NTPOT (Normalized Time Per Output Token) in milliseconds. NTPOT is the request latency normalized by the number of output tokens, measured as request_latency / total_output_tokens.
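
The NTPOT formula above can be illustrated with a small worked calculation. This is only a sketch: `ntpot_ms` is a hypothetical helper and not part of the client library, and the input numbers are made up for illustration.

```ruby
# Hypothetical helper, not part of the GKE Recommender client library.
# Computes NTPOT in milliseconds as request_latency / total_output_tokens.
def ntpot_ms(request_latency_ms, total_output_tokens)
  raise ArgumentError, "total_output_tokens must be positive" unless total_output_tokens.positive?

  request_latency_ms.to_f / total_output_tokens
end

# Illustrative values: a 2000 ms request that produced 100 output tokens.
puts ntpot_ms(2_000, 100) # prints 20.0
```

A value reported in `ntpot_range` would be read on the same scale: milliseconds of request latency per output token.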

#throughput_output_range

def throughput_output_range() -> ::Google::Cloud::GkeRecommender::V1::TokensPerSecondRange

Returns

(::Google::Cloud::GkeRecommender::V1::TokensPerSecondRange) — Output only. The range of throughput in output tokens per second. This is measured as total_output_tokens_generated_by_server / elapsed_time_in_seconds.
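
As a rough illustration of the throughput formula above, the following sketch divides a token count by an elapsed time. `output_tokens_per_second` is a hypothetical helper, not a library method, and the numbers are invented for the example.

```ruby
# Hypothetical helper, not part of the GKE Recommender client library.
# Computes throughput as total_output_tokens_generated_by_server / elapsed_time_in_seconds.
def output_tokens_per_second(total_output_tokens, elapsed_seconds)
  raise ArgumentError, "elapsed_seconds must be positive" unless elapsed_seconds.positive?

  total_output_tokens / elapsed_seconds.to_f
end

# Illustrative values: 12,000 output tokens generated over 60 seconds.
puts output_tokens_per_second(12_000, 60) # prints 200.0
```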

#ttft_range

def ttft_range() -> ::Google::Cloud::GkeRecommender::V1::MillisecondRange

Returns

(::Google::Cloud::GkeRecommender::V1::MillisecondRange) — Output only. The range of TTFT (Time To First Token) in milliseconds. TTFT is the time it takes to generate the first token for a request.
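
The three accessors above might be used together to check whether a deployment's reported ranges fit a latency budget. The sketch below runs without the gem by using stand-in `Struct` types. The `min` and `max` field names on the range objects are an assumption, since this page does not list the fields of `MillisecondRange` or `TokensPerSecondRange`. `meets_latency_targets?` and all numeric values are hypothetical.

```ruby
# Stand-in types so the sketch runs without the client library.
# ASSUMPTION: the real range messages expose `min` and `max`; this page does not confirm that.
RangeStub = Struct.new(:min, :max, keyword_init: true)
PerformanceRangeStub = Struct.new(
  :throughput_output_range, :ttft_range, :ntpot_range,
  keyword_init: true
)

# Hypothetical helper: true when the upper bound of each latency range
# stays within the caller's targets (all values in milliseconds).
def meets_latency_targets?(perf, max_ttft_ms:, max_ntpot_ms:)
  perf.ttft_range.max <= max_ttft_ms && perf.ntpot_range.max <= max_ntpot_ms
end

# Illustrative values only, not real benchmark data.
perf = PerformanceRangeStub.new(
  throughput_output_range: RangeStub.new(min: 500, max: 4_000),
  ttft_range: RangeStub.new(min: 80, max: 900),
  ntpot_range: RangeStub.new(min: 15, max: 60)
)

# TTFT upper bound (900) fits under 1000, but NTPOT upper bound (60) exceeds 50.
puts meets_latency_targets?(perf, max_ttft_ms: 1_000, max_ntpot_ms: 50) # prints false
```

Because all three fields are output only, code like this would only read them from a response and never set them on a request.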

Last updated 2025-10-30 UTC.