Interface InferenceParameterOrBuilder (4.68.0)

public interface InferenceParameterOrBuilder extends MessageOrBuilder

Implements

MessageOrBuilder

Methods

getMaxOutputTokens()

public abstract int getMaxOutputTokens()

Optional. Maximum number of output tokens for the generator.

optional int32 max_output_tokens = 1 [(.google.api.field_behavior) = OPTIONAL];

Returns
Type	Description
`int`	The maxOutputTokens.

getTemperature()

public abstract double getTemperature()

Optional. Controls the randomness of LLM predictions. A low temperature produces less random output; a high temperature produces more random output. If unset (or 0), the default value of 0 is used.

optional double temperature = 2 [(.google.api.field_behavior) = OPTIONAL];

Returns
Type	Description
`double`	The temperature.
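
The effect of temperature on randomness can be illustrated with a temperature-scaled softmax. This is a minimal sketch of the general technique, not the service's documented implementation: the class `TemperatureSketch`, the method `softmax`, and the choice to treat a temperature of 0 as greedy selection are all assumptions made for illustration.

```java
import java.util.Arrays;

final class TemperatureSketch {
    /** Converts raw scores (logits) into probabilities, scaled by temperature. */
    static double[] softmax(double[] logits, double temperature) {
        if (logits.length == 0) {
            return new double[0];
        }
        if (temperature <= 0.0) {
            // Assumption: a temperature of 0 puts all probability on the top score.
            double[] oneHot = new double[logits.length];
            int best = 0;
            for (int i = 1; i < logits.length; i++) {
                if (logits[i] > logits[best]) {
                    best = i;
                }
            }
            oneHot[best] = 1.0;
            return oneHot;
        }
        // Subtract the maximum before exponentiating for numerical stability.
        double max = Arrays.stream(logits).max().getAsDouble();
        double[] probs = new double[logits.length];
        double sum = 0.0;
        for (int i = 0; i < logits.length; i++) {
            probs[i] = Math.exp((logits[i] - max) / temperature);
            sum += probs[i];
        }
        for (int i = 0; i < probs.length; i++) {
            probs[i] /= sum;
        }
        return probs;
    }
}
```

With the same scores, a lower temperature concentrates more probability on the highest-scoring token, and a higher temperature spreads it more evenly across tokens.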

getTopK()

public abstract int getTopK()

Optional. Top-k changes how the model selects tokens for output. A top-k of 1 means the selected token is the most probable among all tokens in the model's vocabulary (also called greedy decoding), while a top-k of 3 means that the next token is selected from among the 3 most probable tokens (using temperature). At each token selection step, the K tokens with the highest probabilities are sampled. These tokens are then further filtered based on topP, and the final token is selected using temperature sampling. Specify a lower value for less random responses and a higher value for more random responses. Acceptable values are in the range [1, 40]; the default is 40.

optional int32 top_k = 3 [(.google.api.field_behavior) = OPTIONAL];

Returns
Type	Description
`int`	The topK.
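
The top-k filtering step described for this field can be sketched as follows. This is an illustrative stand-in, not code from the library: the class `TopKSketch` and the method `topKIndices` are hypothetical, and the sketch only shows which candidates survive the filter, not the sampling that follows.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class TopKSketch {
    /** Returns the indices of the k most probable tokens, most probable first. */
    static List<Integer> topKIndices(double[] probs, int k) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < probs.length; i++) {
            indices.add(i);
        }
        // Sort by descending probability; the sort is stable, so ties keep index order.
        indices.sort(Comparator.comparingDouble((Integer i) -> probs[i]).reversed());
        // Clamp k to the valid range for this input before truncating.
        int keep = Math.min(Math.max(k, 0), indices.size());
        return new ArrayList<>(indices.subList(0, keep));
    }
}
```

With a top-k of 1 only the single most probable index is returned, which corresponds to greedy decoding; larger values leave more candidates for the later top-p and temperature steps.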

getTopP()

public abstract double getTopP()

Optional. Top-p changes how the model selects tokens for output. Tokens are selected from the most probable to the least probable (among the top K; see the topK parameter) until the sum of their probabilities equals the top-p value. For example, if tokens A, B, and C have probabilities of 0.3, 0.2, and 0.1 and the top-p value is 0.5, then the model selects either A or B as the next token (using temperature) and doesn't consider C. Specify a lower value for less random responses and a higher value for more random responses. Acceptable values are in the range [0.0, 1.0]; the default is 0.95.

optional double top_p = 4 [(.google.api.field_behavior) = OPTIONAL];

Returns
Type	Description
`double`	The topP.
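
The A, B, C example in the top-p description can be worked through in code. This is a minimal sketch under stated assumptions: the class `TopPSketch` and the method `nucleus` are hypothetical names, the small tolerance on the comparison is an assumption added to absorb floating-point rounding, and the service's actual filtering may differ in detail.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class TopPSketch {
    /** Returns the tokens kept by top-p filtering, most probable first. */
    static List<String> nucleus(Map<String, Double> probs, double topP) {
        List<Map.Entry<String, Double>> sorted = new ArrayList<>(probs.entrySet());
        // Order candidates from most probable to least probable.
        sorted.sort((a, b) -> Double.compare(b.getValue(), a.getValue()));

        List<String> kept = new ArrayList<>();
        double cumulative = 0.0;
        for (Map.Entry<String, Double> entry : sorted) {
            kept.add(entry.getKey());
            cumulative += entry.getValue();
            // Stop once the running sum reaches the top-p value.
            if (cumulative >= topP - 1e-9) {
                break;
            }
        }
        return kept;
    }
}
```

Called with A = 0.3, B = 0.2, C = 0.1 and a top-p of 0.5, the sketch keeps A and B and drops C, matching the example in the description. It always keeps at least one token when the input is non-empty, so a top-p of 0.0 still leaves the most probable candidate.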

hasMaxOutputTokens()

public abstract boolean hasMaxOutputTokens()

Optional. Maximum number of output tokens for the generator.

optional int32 max_output_tokens = 1 [(.google.api.field_behavior) = OPTIONAL];

Returns
Type	Description
`boolean`	Whether the maxOutputTokens field is set.

hasTemperature()

public abstract boolean hasTemperature()

Optional. Controls the randomness of LLM predictions. A low temperature produces less random output; a high temperature produces more random output. If unset (or 0), the default value of 0 is used.

optional double temperature = 2 [(.google.api.field_behavior) = OPTIONAL];

Returns
Type	Description
`boolean`	Whether the temperature field is set.

hasTopK()

public abstract boolean hasTopK()

optional int32 top_k = 3 [(.google.api.field_behavior) = OPTIONAL];

Returns
Type	Description
`boolean`	Whether the topK field is set.

hasTopP()

public abstract boolean hasTopP()

optional double top_p = 4 [(.google.api.field_behavior) = OPTIONAL];

Returns
Type	Description
`boolean`	Whether the topP field is set.
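
Because these fields are declared `optional`, a getter on an unset field returns the type's zero value, and the matching `has` method is the way to tell "unset" apart from an explicit 0. The sketch below shows one way a caller might combine the two. It is self-contained and does not use the library: the nested interface `Params` is a hypothetical stand-in that mirrors only the topK and topP accessors, the factory `of` exists purely for the example, and the fallback values 40 and 0.95 are the defaults stated in the field descriptions. Resolving defaults on the client is an assumption for illustration; the service applies its own defaults.

```java
final class PresenceSketch {
    /** Hypothetical stand-in mirroring the topK and topP accessors. */
    interface Params {
        boolean hasTopK();
        int getTopK();
        boolean hasTopP();
        double getTopP();
    }

    /** Returns the explicit topK if set, otherwise the documented default of 40. */
    static int effectiveTopK(Params params) {
        return params.hasTopK() ? params.getTopK() : 40;
    }

    /** Returns the explicit topP if set, otherwise the documented default of 0.95. */
    static double effectiveTopP(Params params) {
        return params.hasTopP() ? params.getTopP() : 0.95;
    }

    /** Builds a stand-in where a null argument means the field is unset. */
    static Params of(Integer topK, Double topP) {
        return new Params() {
            @Override public boolean hasTopK() { return topK != null; }
            // Like a protobuf getter, return the zero value when the field is unset.
            @Override public int getTopK() { return topK != null ? topK : 0; }
            @Override public boolean hasTopP() { return topP != null; }
            @Override public double getTopP() { return topP != null ? topP : 0.0; }
        };
    }
}
```

Checking `hasTopK()` before reading `getTopK()` avoids mistaking the zero value of an unset field for a caller-supplied value, which matters here because 0 lies outside the acceptable topK range of [1, 40].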