Reference documentation and code samples for the Vertex AI V1 API class Google::Cloud::AIPlatform::V1::SpeculativeDecodingSpec::NgramSpeculation.
N-gram speculation works by searching the previous prompt sequence for tokens that match the most recent input and using those matches as speculative candidates when generating new tokens.
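The matching step described above can be illustrated with a small, self-contained sketch. This is not the service's implementation. It assumes tokens are plain integer IDs, that the most recent match wins, and that the tokens following the match become the draft. The names `propose_draft_tokens` and `max_draft` are hypothetical and exist only for this illustration.

```ruby
# Minimal sketch of n-gram speculation over integer token IDs.
# Assumption: the draft is whatever followed the most recent earlier
# occurrence of the last `ngram_size` tokens. `propose_draft_tokens` and
# `max_draft` are hypothetical names, not part of the Vertex AI API.
def propose_draft_tokens(tokens, ngram_size: 3, max_draft: 5)
  # An earlier occurrence needs at least one token before the trailing n-gram.
  return [] if tokens.length <= ngram_size

  suffix = tokens.last(ngram_size)

  # Scan candidate start positions from newest to oldest, skipping the
  # trailing n-gram itself.
  (tokens.length - ngram_size - 1).downto(0) do |start|
    next unless tokens[start, ngram_size] == suffix

    # Propose up to `max_draft` tokens that followed the matched n-gram.
    return tokens[start + ngram_size, max_draft]
  end

  [] # No earlier occurrence found, so nothing is proposed.
end

history = [1, 2, 3, 4, 5, 1, 2, 3]
draft = propose_draft_tokens(history, ngram_size: 3, max_draft: 2)
# draft => [4, 5]
```

In this sketch, a larger `ngram_size` demands a longer exact match before anything is proposed, so drafts are rarer but tend to be more specific. A smaller value matches more often.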
Inherits
- Object
Extended By
- Google::Protobuf::MessageExts::ClassMethods
Includes
- Google::Protobuf::MessageExts
Methods
#ngram_size
def ngram_size() -> ::Integer
Returns
- (::Integer) — The number of trailing input tokens (the N in N-gram) used as the n-gram to search for and match against the previous prompt sequence. Defaults to 3 if not specified.
#ngram_size=
def ngram_size=(value) -> ::Integer
Parameter
- value (::Integer) — The number of trailing input tokens (the N in N-gram) used as the n-gram to search for and match against the previous prompt sequence. Defaults to 3 if not specified.
Returns
- (::Integer) — The number of trailing input tokens (the N in N-gram) used as the n-gram to search for and match against the previous prompt sequence. Defaults to 3 if not specified.
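As a rough usage sketch, the accessor above can be set through the message constructor. The helper `build_ngram_speculation` and its `message_class:` parameter are hypothetical conveniences, not part of the library. The positive-integer check is an assumption made for this example rather than documented server behavior. By default the helper loads the real class, which assumes the `google-cloud-ai_platform-v1` gem is installed.

```ruby
# Hypothetical helper that builds an NgramSpeculation message.
# `message_class:` lets a stand-in class be substituted when the
# google-cloud-ai_platform-v1 gem is not available.
def build_ngram_speculation(ngram_size, message_class: nil)
  # Assumption for this sketch: only positive integers are meaningful sizes.
  unless ngram_size.is_a?(Integer) && ngram_size.positive?
    raise ArgumentError, "ngram_size must be a positive Integer"
  end

  # Load the real message class lazily, only when no stand-in is supplied.
  message_class ||= begin
    require "google/cloud/ai_platform/v1"
    Google::Cloud::AIPlatform::V1::SpeculativeDecodingSpec::NgramSpeculation
  end

  message_class.new ngram_size: ngram_size
end

# Stand-in with the same keyword interface, used here so the sketch runs
# without the gem installed.
FakeNgramSpeculation = Struct.new(:ngram_size, keyword_init: true)

spec = build_ngram_speculation(4, message_class: FakeNgramSpeculation)
puts spec.ngram_size
```

With the gem installed, omitting `message_class:` would build the real message, and `spec.ngram_size = 5` would go through the `#ngram_size=` setter documented above.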