Interface PredictionServiceGrpc.AsyncService (3.77.0)
public static interface PredictionServiceGrpc.AsyncService
A service for online predictions and explanations.
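Because every method on this interface is declared `default`, an implementation only needs to override the calls it actually serves. The sketch below illustrates that pattern without depending on gRPC: `ReplyObserver` is a hypothetical stand-in for `io.grpc.stub.StreamObserver`, `SketchAsyncService` is a hypothetical two-method mirror of the interface with `String` and `Integer` in place of the real request and response messages, and the default bodies that report an error are an assumption modeled on how grpc-java generated stubs usually signal an unimplemented method.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for io.grpc.stub.StreamObserver so the sketch compiles without gRPC.
interface ReplyObserver<T> {
    void onNext(T value);
    void onError(Throwable t);
    void onCompleted();
}

// Hypothetical, simplified mirror of the service interface: every method has a default body.
// Assumption: an unoverridden method reports an "unimplemented" error to the caller.
interface SketchAsyncService {
    default void predict(String request, ReplyObserver<String> responseObserver) {
        responseObserver.onError(new UnsupportedOperationException("UNIMPLEMENTED: predict"));
    }

    default void countTokens(String request, ReplyObserver<Integer> responseObserver) {
        responseObserver.onError(new UnsupportedOperationException("UNIMPLEMENTED: countTokens"));
    }
}

// Overrides only predict; countTokens keeps its default body.
final class EchoPredictionService implements SketchAsyncService {
    @Override
    public void predict(String request, ReplyObserver<String> responseObserver) {
        // Unary call: exactly one response, then completion.
        responseObserver.onNext("prediction for " + request);
        responseObserver.onCompleted();
    }
}

// Records every callback so the call sequence can be inspected.
final class RecordingReplyObserver<T> implements ReplyObserver<T> {
    final List<String> events = new ArrayList<>();

    @Override
    public void onNext(T value) { events.add("next:" + value); }

    @Override
    public void onError(Throwable t) { events.add("error:" + t.getMessage()); }

    @Override
    public void onCompleted() { events.add("completed"); }
}
```

Calling `predict` on `EchoPredictionService` produces one `onNext` followed by `onCompleted`, while `countTokens` falls through to the default and reaches the observer's `onError`.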
Methods
chatCompletions(ChatCompletionsRequest request, StreamObserver<HttpBody> responseObserver)
public default void chatCompletions(ChatCompletionsRequest request, StreamObserver<HttpBody> responseObserver)
Exposes an OpenAI-compatible endpoint for chat completions.
Parameters
Name | Description
request | ChatCompletionsRequest
responseObserver | io.grpc.stub.StreamObserver<com.google.api.HttpBody>
public default void countTokens(CountTokensRequest request, StreamObserver<CountTokensResponse> responseObserver)
Perform token counting.
public default void directPredict(DirectPredictRequest request, StreamObserver<DirectPredictResponse> responseObserver)
Perform a unary online prediction request to a gRPC model server for
Vertex first-party products and frameworks.
public default void directRawPredict(DirectRawPredictRequest request, StreamObserver<DirectRawPredictResponse> responseObserver)
Perform a unary online prediction request to a gRPC model server for
custom containers.
public default void explain(ExplainRequest request, StreamObserver<ExplainResponse> responseObserver)
Perform an online explanation.
If deployed_model_id is specified, the corresponding DeployedModel must have explanation_spec populated. If deployed_model_id is not specified, all DeployedModels must have explanation_spec populated.
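The precondition above can be read as a small check over an endpoint's deployed models. The following is a minimal sketch of that reading, not the service's actual validation: `DeployedModelInfo` and `ExplainPrecondition` are hypothetical names, a null or empty ID is taken to mean "not specified", and treating an unknown ID as a failure is an assumption the text does not state.

```java
import java.util.List;

// Hypothetical, minimal view of a deployed model: only the two fields the rule mentions.
final class DeployedModelInfo {
    final String id;
    final boolean hasExplanationSpec;

    DeployedModelInfo(String id, boolean hasExplanationSpec) {
        this.id = id;
        this.hasExplanationSpec = hasExplanationSpec;
    }
}

final class ExplainPrecondition {
    private ExplainPrecondition() {}

    // Returns true when the explain precondition holds for the given models.
    // A null or empty deployedModelId means "not specified".
    static boolean isSatisfied(List<DeployedModelInfo> models, String deployedModelId) {
        if (deployedModelId == null || deployedModelId.isEmpty()) {
            // Not specified: every deployed model needs explanation_spec
            // (vacuously true for an empty list).
            for (DeployedModelInfo model : models) {
                if (!model.hasExplanationSpec) {
                    return false;
                }
            }
            return true;
        }
        // Specified: only the matching deployed model is checked.
        for (DeployedModelInfo model : models) {
            if (model.id.equals(deployedModelId)) {
                return model.hasExplanationSpec;
            }
        }
        // Assumption: an ID that matches no deployed model fails the check.
        return false;
    }
}
```

With one model that has an explanation spec and one that does not, the check passes only when the request names the first model; leaving the ID unset fails because the second model lacks the spec.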
generateContent(GenerateContentRequest request, StreamObserver<GenerateContentResponse> responseObserver)
public default void generateContent(GenerateContentRequest request, StreamObserver<GenerateContentResponse> responseObserver)
Generate content with multimodal inputs.
public default void predict(PredictRequest request, StreamObserver<PredictResponse> responseObserver)
Perform an online prediction.
rawPredict(RawPredictRequest request, StreamObserver<HttpBody> responseObserver)
public default void rawPredict(RawPredictRequest request, StreamObserver<HttpBody> responseObserver)
Perform an online prediction with an arbitrary HTTP payload.
The response includes the following HTTP headers:
X-Vertex-AI-Endpoint-Id: ID of the Endpoint that served this prediction.
X-Vertex-AI-Deployed-Model-Id: ID of the Endpoint's DeployedModel that served this prediction.
Parameters
Name | Description
request | RawPredictRequest
responseObserver | io.grpc.stub.StreamObserver<com.google.api.HttpBody>
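The two response headers listed for rawPredict identify which Endpoint and DeployedModel served a prediction. As a rough sketch, assuming the caller has already collected the response headers into a `Map<String, String>` (how they are surfaced depends on the transport and is not covered here), a small helper can look them up case-insensitively, since HTTP header names are not case-sensitive. `VertexResponseHeaders` is a hypothetical class, not part of the library.

```java
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

// Hypothetical helper: wraps already-collected response headers and exposes the two IDs.
final class VertexResponseHeaders {
    static final String ENDPOINT_ID = "X-Vertex-AI-Endpoint-Id";
    static final String DEPLOYED_MODEL_ID = "X-Vertex-AI-Deployed-Model-Id";

    // Case-insensitive keys, because HTTP header names ignore case.
    private final Map<String, String> headers = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);

    VertexResponseHeaders(Map<String, String> rawHeaders) {
        headers.putAll(rawHeaders);
    }

    // ID of the Endpoint that served the prediction, if the header is present.
    Optional<String> endpointId() {
        return Optional.ofNullable(headers.get(ENDPOINT_ID));
    }

    // ID of the DeployedModel that served the prediction, if the header is present.
    Optional<String> deployedModelId() {
        return Optional.ofNullable(headers.get(DEPLOYED_MODEL_ID));
    }
}
```

Returning `Optional` keeps a missing header distinct from an empty value, so callers can decide whether its absence is an error.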
public default void serverStreamingPredict(StreamingPredictRequest request, StreamObserver<StreamingPredictResponse> responseObserver)
Perform a server-side streaming online prediction request for Vertex
LLM streaming.
public default StreamObserver<StreamDirectPredictRequest> streamDirectPredict(StreamObserver<StreamDirectPredictResponse> responseObserver)
Perform a streaming online prediction request to a gRPC model server for
Vertex first-party products and frameworks.
public default StreamObserver<StreamDirectRawPredictRequest> streamDirectRawPredict(StreamObserver<StreamDirectRawPredictResponse> responseObserver)
Perform a streaming online prediction request to a gRPC model server for
custom containers.
streamGenerateContent(GenerateContentRequest request, StreamObserver<GenerateContentResponse> responseObserver)
public default void streamGenerateContent(GenerateContentRequest request, StreamObserver<GenerateContentResponse> responseObserver)
Generate content with multimodal inputs with streaming support.
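Server-streaming methods such as this one take a single request and may call `onNext` on the response observer many times before `onCompleted`. The sketch below shows only that call sequence: `ChunkObserver` is a hypothetical stand-in for `io.grpc.stub.StreamObserver`, and splitting a prompt into words is a placeholder for real content generation, with `String` standing in for the request and response messages.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for io.grpc.stub.StreamObserver so the sketch compiles without gRPC.
interface ChunkObserver<T> {
    void onNext(T value);
    void onError(Throwable t);
    void onCompleted();
}

final class ServerStreamingSketch {
    private ServerStreamingSketch() {}

    // One request in, zero or more responses out, then a single completion signal.
    // Placeholder logic: each whitespace-separated word becomes one streamed chunk.
    static void streamWords(String prompt, ChunkObserver<String> responseObserver) {
        for (String word : prompt.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                responseObserver.onNext(word);
            }
        }
        responseObserver.onCompleted();
    }
}

// Collects streamed chunks and remembers whether the stream completed.
final class CollectingChunkObserver implements ChunkObserver<String> {
    final List<String> chunks = new ArrayList<>();
    boolean completed = false;
    Throwable error = null;

    @Override
    public void onNext(String value) { chunks.add(value); }

    @Override
    public void onError(Throwable t) { error = t; }

    @Override
    public void onCompleted() { completed = true; }
}
```

A caller passes a `CollectingChunkObserver` to `streamWords` and reads `chunks` once `completed` is true; a blank prompt yields no chunks but still completes.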
streamRawPredict(StreamRawPredictRequest request, StreamObserver<HttpBody> responseObserver)
public default void streamRawPredict(StreamRawPredictRequest request, StreamObserver<HttpBody> responseObserver)
Perform a streaming online prediction with an arbitrary HTTP payload.
Parameters
Name | Description
request | StreamRawPredictRequest
responseObserver | io.grpc.stub.StreamObserver<com.google.api.HttpBody>
public default StreamObserver<StreamingPredictRequest> streamingPredict(StreamObserver<StreamingPredictResponse> responseObserver)
Perform a streaming online prediction request for Vertex first-party
products and frameworks.
public default StreamObserver<StreamingRawPredictRequest> streamingRawPredict(StreamObserver<StreamingRawPredictResponse> responseObserver)
Perform a streaming online prediction request through gRPC.
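The streaming methods that return a `StreamObserver` (streamDirectPredict, streamDirectRawPredict, streamingPredict, streamingRawPredict) have a bidirectional shape: the implementation receives the observer for responses and hands back an observer on which the caller pushes requests. The sketch below illustrates that shape only. `DuplexObserver` is a hypothetical stand-in for `io.grpc.stub.StreamObserver`, and upper-casing each request is a placeholder for real prediction logic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for io.grpc.stub.StreamObserver so the sketch compiles without gRPC.
interface DuplexObserver<T> {
    void onNext(T value);
    void onError(Throwable t);
    void onCompleted();
}

final class BidiStreamingSketch {
    private BidiStreamingSketch() {}

    // Takes the observer for responses and returns the observer for requests.
    static DuplexObserver<String> upperCaseStream(DuplexObserver<String> responseObserver) {
        return new DuplexObserver<String>() {
            @Override
            public void onNext(String request) {
                // Placeholder logic: answer each request as soon as it arrives.
                responseObserver.onNext(request.toUpperCase(Locale.ROOT));
            }

            @Override
            public void onError(Throwable t) {
                // The caller's stream failed; this sketch holds no resources to release.
            }

            @Override
            public void onCompleted() {
                // The caller finished sending, so close the response stream too.
                responseObserver.onCompleted();
            }
        };
    }
}

// Collects responses and remembers whether the response stream completed.
final class CollectingDuplexObserver implements DuplexObserver<String> {
    final List<String> responses = new ArrayList<>();
    boolean completed = false;

    @Override
    public void onNext(String value) { responses.add(value); }

    @Override
    public void onError(Throwable t) { }

    @Override
    public void onCompleted() { completed = true; }
}
```

Responding inside `onNext` keeps the two streams interleaved; an implementation could equally buffer requests and reply only in `onCompleted`, depending on what the model server needs.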
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-10-11 UTC.