Evals(api_client_: google.genai._api_client.BaseApiClient)

API documentation for the Evals class.
Methods
batch_evaluate
batch_evaluate(
    *,
    dataset: typing.Union[
        vertexai._genai.types.EvaluationDataset,
        vertexai._genai.types.EvaluationDatasetDict,
    ],
    metrics: list[
        typing.Union[vertexai._genai.types.Metric, vertexai._genai.types.MetricDict]
    ],
    dest: str,
    config: typing.Optional[
        typing.Union[
            vertexai._genai.types.EvaluateDatasetConfig,
            vertexai._genai.types.EvaluateDatasetConfigDict,
        ]
    ] = None
) -> vertexai._genai.types.EvaluateDatasetOperation

Evaluates a dataset based on a set of given metrics.
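For illustration, a minimal sketch of a batch evaluation call, assuming the preview vertexai.Client entry point; the bucket paths, the metric member, and the gcs_source field on EvaluationDataset are assumptions used only as placeholders:

import vertexai
from vertexai import types

client = vertexai.Client(project="my-project", location="us-central1")

# Dataset stored in Cloud Storage (gcs_source field and URIs are assumptions).
dataset = types.EvaluationDataset(
    gcs_source=types.GcsSource(uris=["gs://my-bucket/eval_data.jsonl"])
)

# Start the long-running batch evaluation; results are written under dest.
operation = client.evals.batch_evaluate(
    dataset=dataset,
    metrics=[types.PrebuiltMetric.TEXT_QUALITY],  # assumed prebuilt metric name
    dest="gs://my-bucket/eval_output/",
)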
create_evaluation_run
create_evaluation_run(
    *,
    name: str,
    display_name: typing.Optional[str] = None,
    data_source: vertexai._genai.types.EvaluationRunDataSource,
    dest: str,
    config: typing.Optional[
        typing.Union[
            vertexai._genai.types.CreateEvaluationRunConfig,
            vertexai._genai.types.CreateEvaluationRunConfigDict,
        ]
    ] = None
) -> vertexai._genai.types.EvaluationRun

Creates an EvaluationRun.
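A hedged sketch of creating an evaluation run; the evaluation_set field on EvaluationRunDataSource and the resource names are illustrative assumptions, not values taken from this reference:

import vertexai
from vertexai import types

client = vertexai.Client(project="my-project", location="us-central1")

# Point the run at an existing evaluation set (field name is an assumption).
data_source = types.EvaluationRunDataSource(
    evaluation_set="projects/my-project/locations/us-central1/evaluationSets/12345"
)

evaluation_run = client.evals.create_evaluation_run(
    name="my-eval-run",
    display_name="My evaluation run",
    data_source=data_source,
    dest="gs://my-bucket/eval_run_output/",
)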
evaluate
evaluate(
    *,
    dataset: typing.Union[
        vertexai._genai.types.EvaluationDataset,
        vertexai._genai.types.EvaluationDatasetDict,
        list[
            typing.Union[
                vertexai._genai.types.EvaluationDataset,
                vertexai._genai.types.EvaluationDatasetDict,
            ]
        ],
    ],
    metrics: typing.Optional[
        list[
            typing.Union[vertexai._genai.types.Metric, vertexai._genai.types.MetricDict]
        ]
    ] = None,
    config: typing.Optional[
        typing.Union[
            vertexai._genai.types.EvaluateMethodConfig,
            vertexai._genai.types.EvaluateMethodConfigDict,
        ]
    ] = None
) -> vertexai._genai.types.EvaluationResult

Evaluates candidate responses in the provided dataset(s) using the specified metrics.
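A minimal end-to-end sketch: generate candidate responses with run_inference, then score them with evaluate. The model name and the metric member are assumptions; any prebuilt or custom Metric can be passed:

import pandas as pd
import vertexai
from vertexai import types

client = vertexai.Client(project="my-project", location="us-central1")

prompts_df = pd.DataFrame({"prompt": ["Explain quantum computing in one sentence."]})

# Generate candidate responses for the prompts (model name is an example).
eval_dataset = client.evals.run_inference(model="gemini-2.0-flash", src=prompts_df)

# Score the responses against one or more metrics (metric name is an assumption).
eval_result = client.evals.evaluate(
    dataset=eval_dataset,
    metrics=[types.PrebuiltMetric.TEXT_QUALITY],
)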
evaluate_instances
evaluate_instances(
    *, metric_config: vertexai._genai.types._EvaluateInstancesRequestParameters
) -> vertexai._genai.types.EvaluateInstancesResponseEvaluates an instance of a model.
generate_rubrics
generate_rubrics(
    *,
    src: typing.Union[str, pd.DataFrame, vertexai._genai.types.EvaluationDataset],
    rubric_group_name: str,
    prompt_template: typing.Optional[str] = None,
    generator_model_config: typing.Optional[genai_types.AutoraterConfigOrDict] = None,
    rubric_content_type: typing.Optional[types.RubricContentType] = None,
    rubric_type_ontology: typing.Optional[list[str]] = None,
    predefined_spec_name: typing.Optional[
        typing.Union[str, types.PrebuiltMetric]
    ] = None,
    metric_spec_parameters: typing.Optional[dict[str, typing.Any]] = None,
    config: typing.Optional[
        typing.Union[
            vertexai._genai.types.RubricGenerationConfig,
            vertexai._genai.types.RubricGenerationConfigDict,
        ]
    ] = None
) -> vertexai._genai.types.EvaluationDataset

Generates rubrics for each prompt in the source and adds them as a new column structured as a dictionary.
You can generate rubrics by providing either:
- A predefined_spec_name to use a Vertex AI backend recipe.
- A prompt_template along with other configuration parameters (generator_model_config, rubric_content_type, rubric_type_ontology) for custom rubric generation.
These two modes are mutually exclusive.
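A sketch of the custom prompt_template mode; the template text, the {prompt} placeholder syntax, and the group name are illustrative assumptions:

import pandas as pd
import vertexai
from vertexai import types

client = vertexai.Client(project="my-project", location="us-central1")

prompts_df = pd.DataFrame({"prompt": ["Summarize the plot of Hamlet."]})

# Custom rubric generation driven by a prompt template ({prompt} placeholder is an assumption).
rubric_dataset = client.evals.generate_rubrics(
    src=prompts_df,
    rubric_group_name="custom_quality_rubrics",
    prompt_template="Write evaluation rubrics for judging a response to: {prompt}",
)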
get_evaluation_item
get_evaluation_item(
    *,
    name: str,
    config: typing.Optional[
        typing.Union[
            vertexai._genai.types.GetEvaluationItemConfig,
            vertexai._genai.types.GetEvaluationItemConfigDict,
        ]
    ] = None
) -> vertexai._genai.types.EvaluationItem

Retrieves an EvaluationItem from the resource name.
get_evaluation_run
get_evaluation_run(
    *,
    name: str,
    config: typing.Optional[
        typing.Union[
            vertexai._genai.types.GetEvaluationRunConfig,
            vertexai._genai.types.GetEvaluationRunConfigDict,
        ]
    ] = None
) -> vertexai._genai.types.EvaluationRun

Retrieves an EvaluationRun from the resource name.
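As a sketch (the resource name is a placeholder; get_evaluation_item above and get_evaluation_set below follow the same pattern with their respective resource names):

import vertexai

client = vertexai.Client(project="my-project", location="us-central1")

# Fetch an existing evaluation run by its full resource name.
evaluation_run = client.evals.get_evaluation_run(
    name="projects/my-project/locations/us-central1/evaluationRuns/6789"
)
print(evaluation_run)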
get_evaluation_set
get_evaluation_set(
    *,
    name: str,
    config: typing.Optional[
        typing.Union[
            vertexai._genai.types.GetEvaluationSetConfig,
            vertexai._genai.types.GetEvaluationSetConfigDict,
        ]
    ] = None
) -> vertexai._genai.types.EvaluationSet

Retrieves an EvaluationSet from the resource name.
run
run() -> vertexai._genai.types.EvaluateInstancesResponse

Evaluates an instance of a model.
This should eventually call _evaluate_instances().
run_inference
run_inference(
    *,
    model: typing.Union[str, typing.Callable[[typing.Any], typing.Any]],
    src: typing.Union[
        str, pandas.core.frame.DataFrame, vertexai._genai.types.EvaluationDataset
    ],
    config: typing.Optional[
        typing.Union[
            vertexai._genai.types.EvalRunInferenceConfig,
            vertexai._genai.types.EvalRunInferenceConfigDict,
        ]
    ] = None
) -> vertexai._genai.types.EvaluationDataset

Runs inference on a dataset for evaluation.
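A hedged sketch of running inference from a Cloud Storage source; the file paths, model name, and the dest field on EvalRunInferenceConfig are assumptions for illustration:

import vertexai
from vertexai import types

client = vertexai.Client(project="my-project", location="us-central1")

# Generate a response for every prompt in the source file and write the
# resulting dataset to the configured destination (dest field is an assumption).
eval_dataset = client.evals.run_inference(
    model="gemini-2.0-flash",
    src="gs://my-bucket/prompts.jsonl",
    config=types.EvalRunInferenceConfig(dest="gs://my-bucket/inference_output/"),
)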