Module ai (2.22.0)

This module integrates BigQuery built-in AI functions for use with Series/DataFrame objects, such as AI.GENERATE_BOOL: https://cloud.google.com/bigquery/docs/reference/standard-sql/bigqueryml-syntax-ai-generate-bool

Modules Functions

generate_bool

generate_bool(
    prompt: typing.Union[
        bigframes.series.Series,
        pandas.core.series.Series,
        typing.List[
            typing.Union[str, bigframes.series.Series, pandas.core.series.Series]
        ],
        typing.Tuple[
            typing.Union[str, bigframes.series.Series, pandas.core.series.Series], ...
        ],
    ],
    *,
    connection_id: str | None = None,
    endpoint: str | None = None,
    request_type: typing.Literal["dedicated", "shared", "unspecified"] = "unspecified",
    model_params: typing.Optional[typing.Mapping[typing.Any, typing.Any]] = None
) -> bigframes.series.Series

Sends the prompt, which can be any combination of text and unstructured data, to a Gemini model and returns the model's analysis as a boolean result.

Examples:

>>> import bigframes.pandas as bpd
>>> import bigframes.bigquery as bbq
>>> bpd.options.display.progress_bar = None
>>> df = bpd.DataFrame({
...     "col_1": ["apple", "bear", "pear"],
...     "col_2": ["fruit", "animal", "animal"]
... })
>>> bbq.ai.generate_bool((df["col_1"], " is a ", df["col_2"]))
0    {'result': True, 'full_response': '{"candidate...
1    {'result': True, 'full_response': '{"candidate...
2    {'result': False, 'full_response': '{"candidat...
dtype: struct<result: bool, full_response: extension<dbjson<JSONArrowType>>, status: string>[pyarrow]

>>> bbq.ai.generate_bool((df["col_1"], " is a ", df["col_2"])).struct.field("result")
0     True
1     True
2    False
Name: result, dtype: boolean

Parameters
Name Description
prompt Series | List[str|Series] | Tuple[str|Series, ...]

A mixture of Series and string literals that specifies the prompt to send to the model. The Series can be BigFrames Series or pandas Series.
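
To illustrate how a tuple prompt mixes string literals with Series, the sketch below emulates the row-wise composition with plain Python lists. It assumes that literals repeat on every row and that pieces are joined in order, which matches the example above; `compose_prompts` is a hypothetical helper for illustration, not part of bigframes.

```python
from typing import List, Sequence, Union

# A prompt piece is either a string literal or a column-like sequence of strings.
Piece = Union[str, Sequence[str]]


def compose_prompts(pieces: Sequence[Piece]) -> List[str]:
    """Join prompt pieces row by row (hypothetical helper, not a bigframes API)."""
    # Column-like pieces set the row count; string literals repeat on every row.
    columns = [p for p in pieces if not isinstance(p, str)]
    if not columns:
        return ["".join(pieces)]
    n_rows = len(columns[0])
    if any(len(c) != n_rows for c in columns):
        raise ValueError("all column pieces must have the same length")
    return [
        "".join(p if isinstance(p, str) else p[i] for p in pieces)
        for i in range(n_rows)
    ]


col_1 = ["apple", "bear", "pear"]
col_2 = ["fruit", "animal", "animal"]
prompts = compose_prompts((col_1, " is a ", col_2))
print(prompts)
```

Each resulting string corresponds to the statement that the model evaluates for that row.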

connection_id str, optional

Specifies the connection to use to communicate with the model. For example, myproject.us.myconnection. If not provided, the connection from the current session will be used.

endpoint str, optional

Specifies the Vertex AI endpoint to use for the model, for example "gemini-2.5-flash". You can specify any generally available or preview Gemini model. If you specify the model name, BigQuery ML automatically identifies and uses the full endpoint of the model. If you don't specify an endpoint value, BigQuery ML selects a recent stable version of Gemini to use.

request_type Literal["dedicated", "shared", "unspecified"]

Specifies the type of inference request to send to the Gemini model. The request type determines what quota the request uses.
* "dedicated": the function only uses Provisioned Throughput quota. The function returns the error "Provisioned throughput is not purchased or is not active" if Provisioned Throughput quota isn't available.
* "shared": the function only uses dynamic shared quota (DSQ), even if you have purchased Provisioned Throughput quota.
* "unspecified": if you haven't purchased Provisioned Throughput quota, the function uses DSQ. If you have purchased Provisioned Throughput quota, the function uses the Provisioned Throughput quota first. If requests exceed the Provisioned Throughput quota, the overflow traffic uses DSQ.
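
The quota rules for the three request types can be summarized as a small decision function. This is only a sketch of the documented behavior: `quota_order` and the quota labels are hypothetical names, and the error that BigQuery returns is modeled here as a `RuntimeError`.

```python
from typing import List


def quota_order(request_type: str, has_provisioned_throughput: bool) -> List[str]:
    """Return the quotas a request may draw from, in order (hypothetical helper)."""
    if request_type == "dedicated":
        # Dedicated requests never fall back to shared quota.
        if not has_provisioned_throughput:
            raise RuntimeError(
                "Provisioned throughput is not purchased or is not active"
            )
        return ["provisioned_throughput"]
    if request_type == "shared":
        # Shared requests ignore any purchased Provisioned Throughput.
        return ["dsq"]
    if request_type == "unspecified":
        # Provisioned Throughput is used first; overflow goes to DSQ.
        if has_provisioned_throughput:
            return ["provisioned_throughput", "dsq"]
        return ["dsq"]
    raise ValueError(f"unknown request_type: {request_type!r}")


print(quota_order("unspecified", True))
```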

model_params Mapping[Any, Any], optional

Provides additional parameters to the model. The model_params value must conform to the generateContent request body format.
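
As a minimal sketch, a `model_params` mapping might set generation options under `generationConfig`, assuming the usual `temperature` and `topP` fields of the generateContent request body. The specific values are illustrative, not recommendations.

```python
import json

# Illustrative values only; the keys assume the generateContent request body format.
model_params = {
    "generationConfig": {
        "temperature": 0.0,
        "topP": 0.95,
    }
}

# The mapping must be JSON-serializable, so a round trip is a cheap sanity check.
encoded = json.dumps(model_params, sort_keys=True)
print(encoded)

# It would then be passed by keyword, for example:
# bbq.ai.generate_bool(prompt, model_params=model_params)
```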

Returns
Type Description
bigframes.series.Series A new struct Series with the result data. The struct contains these fields:
* "result": a BOOL value containing the model's response to the prompt. The result is None if the request fails or is filtered by responsible AI.
* "full_response": a JSON value containing the response from the projects.locations.endpoints.generateContent call to the model. The generated text is in the text element.
* "status": a STRING value that contains the API response status for the corresponding row. This value is empty if the operation was successful.
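
Because a failed row carries a None result and a non-empty status, it can help to separate successes from failures before using the values. The sketch below works on plain dictionaries that mimic the struct fields; the rows and the error text are made up for illustration, and `split_by_status` is a hypothetical helper.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]


def split_by_status(rows: List[Row]) -> Tuple[List[Row], List[Row]]:
    """Separate rows with an empty status (success) from the rest."""
    succeeded = [row for row in rows if not row["status"]]
    failed = [row for row in rows if row["status"]]
    return succeeded, failed


# Made-up rows shaped like the struct fields described above.
rows = [
    {"result": True, "full_response": "{}", "status": ""},
    {"result": None, "full_response": "{}", "status": "hypothetical error"},
    {"result": False, "full_response": "{}", "status": ""},
]

succeeded, failed = split_by_status(rows)
print([row["result"] for row in succeeded])
print([row["status"] for row in failed])
```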

generate_int

generate_int(
    prompt: typing.Union[
        bigframes.series.Series,
        pandas.core.series.Series,
        typing.List[
            typing.Union[str, bigframes.series.Series, pandas.core.series.Series]
        ],
        typing.Tuple[
            typing.Union[str, bigframes.series.Series, pandas.core.series.Series], ...
        ],
    ],
    *,
    connection_id: str | None = None,
    endpoint: str | None = None,
    request_type: typing.Literal["dedicated", "shared", "unspecified"] = "unspecified",
    model_params: typing.Optional[typing.Mapping[typing.Any, typing.Any]] = None
) -> bigframes.series.Series

Sends the prompt, which can be any combination of text and unstructured data, to a Gemini model and returns the model's analysis as an integer result.

Examples:

>>> import bigframes.pandas as bpd
>>> import bigframes.bigquery as bbq
>>> bpd.options.display.progress_bar = None
>>> animal = bpd.Series(["Kangaroo", "Rabbit", "Spider"])
>>> bbq.ai.generate_int(("How many legs does a ", animal, " have?"))
0    {'result': 2, 'full_response': '{"candidates":...
1    {'result': 4, 'full_response': '{"candidates":...
2    {'result': 8, 'full_response': '{"candidates":...
dtype: struct<result: int64, full_response: extension<dbjson<JSONArrowType>>, status: string>[pyarrow]

>>> bbq.ai.generate_int(("How many legs does a ", animal, " have?")).struct.field("result")
0    2
1    4
2    8
Name: result, dtype: Int64

Parameters
Name Description
prompt Series | List[str|Series] | Tuple[str|Series, ...]

A mixture of Series and string literals that specifies the prompt to send to the model. The Series can be BigFrames Series or pandas Series.

connection_id str, optional

Specifies the connection to use to communicate with the model. For example, myproject.us.myconnection. If not provided, the connection from the current session will be used.

endpoint str, optional

Specifies the Vertex AI endpoint to use for the model, for example "gemini-2.5-flash". You can specify any generally available or preview Gemini model. If you specify the model name, BigQuery ML automatically identifies and uses the full endpoint of the model. If you don't specify an endpoint value, BigQuery ML selects a recent stable version of Gemini to use.

request_type Literal["dedicated", "shared", "unspecified"]

Specifies the type of inference request to send to the Gemini model. The request type determines what quota the request uses.
* "dedicated": the function only uses Provisioned Throughput quota. The function returns the error "Provisioned throughput is not purchased or is not active" if Provisioned Throughput quota isn't available.
* "shared": the function only uses dynamic shared quota (DSQ), even if you have purchased Provisioned Throughput quota.
* "unspecified": if you haven't purchased Provisioned Throughput quota, the function uses DSQ. If you have purchased Provisioned Throughput quota, the function uses the Provisioned Throughput quota first. If requests exceed the Provisioned Throughput quota, the overflow traffic uses DSQ.

model_params Mapping[Any, Any], optional

Provides additional parameters to the model. The model_params value must conform to the generateContent request body format.

Returns
Type Description
bigframes.series.Series A new struct Series with the result data. The struct contains these fields:
* "result": an integer (INT64) value containing the model's response to the prompt. The result is None if the request fails or is filtered by responsible AI.
* "full_response": a JSON value containing the response from the projects.locations.endpoints.generateContent call to the model. The generated text is in the text element.
* "status": a STRING value that contains the API response status for the corresponding row. This value is empty if the operation was successful.
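
When the raw model output is needed, the generated text can be read from the "full_response" JSON. The sketch below assumes the usual generateContent response layout, in which the text sits under candidates, content, parts; the sample payload is a heavily abbreviated, made-up response, and `generated_text` is a hypothetical helper.

```python
import json
from typing import Optional


def generated_text(full_response: str) -> Optional[str]:
    """Return the first text part in a generateContent-style response, if any."""
    payload = json.loads(full_response)
    for candidate in payload.get("candidates", []):
        for part in candidate.get("content", {}).get("parts", []):
            if "text" in part:
                return part["text"]
    return None


# Abbreviated, made-up payload for illustration only.
sample = '{"candidates": [{"content": {"parts": [{"text": "4"}]}}]}'
print(generated_text(sample))
```

Returning None when no text part is present mirrors the way the "result" field is None for failed or filtered requests.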