Vertex AI V1 API - Class Google::Cloud::AIPlatform::V1::SpeculativeDecodingSpec::NgramSpeculation (v1.36.0)

Reference documentation and code samples for the Vertex AI V1 API class Google::Cloud::AIPlatform::V1::SpeculativeDecodingSpec::NgramSpeculation.

N-Gram speculation works by finding matching tokens in the previous prompt sequence and using them as speculation for generating new tokens.
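The idea can be illustrated with a minimal, self-contained Ruby sketch. It is not the Vertex AI serving implementation: the method name `propose_draft`, the `max_draft` parameter, and the choice to take the most recent earlier match are all assumptions made for illustration. Only the meaning of `ngram_size` and its default of 3 come from this page.

```ruby
# Illustrative sketch only; not the Vertex AI serving implementation.
# propose_draft and max_draft are hypothetical names used for this example.
DEFAULT_NGRAM_SIZE = 3 # mirrors the documented default for ngram_size

# Looks for the most recent earlier occurrence of the last `ngram_size`
# tokens and returns up to `max_draft` tokens that followed that occurrence.
# Returns an empty array when there is no earlier occurrence.
def propose_draft(tokens, ngram_size: DEFAULT_NGRAM_SIZE, max_draft: 4)
  return [] if ngram_size < 1 || tokens.length <= ngram_size

  key = tokens.last(ngram_size)
  # Latest index at which an earlier match may start, so that at least one
  # token follows the match and the match is not the key's own position.
  last_start = tokens.length - ngram_size - 1

  last_start.downto(0) do |start|
    next unless tokens[start, ngram_size] == key

    return tokens[start + ngram_size, max_draft]
  end

  [] # nothing matched, so there is nothing to speculate
end

prompt = %w[the cat sat on the mat and the cat sat]
draft = propose_draft(prompt)
puts draft.inspect
```

In this sketch the last three tokens (`the cat sat`) also appear at the start of the sequence, so the tokens that followed them there become the speculated continuation. A larger `ngram_size` demands a longer exact match before anything is proposed; a smaller one matches more often but with less context.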

Inherits

Object

Extended By

Google::Protobuf::MessageExts::ClassMethods

Includes

Google::Protobuf::MessageExts

Methods

#ngram_size

def ngram_size() -> ::Integer

Returns

(::Integer) — The number of most recent input tokens used as the n-gram to match against the previous prompt sequence. This is the N in N-Gram. Defaults to 3 if not specified.

#ngram_size=

def ngram_size=(value) -> ::Integer

Parameter

value (::Integer) — The number of most recent input tokens used as the n-gram to match against the previous prompt sequence. This is the N in N-Gram. Defaults to 3 if not specified.

Returns

(::Integer) — The number of most recent input tokens used as the n-gram to match against the previous prompt sequence. This is the N in N-Gram. Defaults to 3 if not specified.

Last updated 2026-02-26 UTC.