Handle quota errors by calling ML.GENERATE_EMBEDDING iteratively
This tutorial shows you how to use the BigQuery
bqutil.procedure.bqml_generate_embeddings public stored procedure to iterate
through calls to the
ML.GENERATE_EMBEDDING function.
Calling the function iteratively lets you address any retryable errors that
occur due to exceeding the
quotas and limits that apply to
the function.
To review the source code for the bqutil.procedure.bqml_generate_embeddings
stored procedure in GitHub, see
bqml_generate_embeddings.sqlx.
For more information about the stored procedure parameters and usage, see the
README file.
This tutorial guides you through the following tasks:
Iterating through calls to the ML.GENERATE_EMBEDDING function, using the
remote model and the bigquery-public-data.bbc_news.fulltext public data
table with the bqutil.procedure.bqml_generate_embeddings stored procedure.
Required permissions
To run this tutorial, you need the following Identity and Access Management (IAM)
roles:
Create and use BigQuery datasets, connections, and models:
BigQuery Admin (roles/bigquery.admin).
Grant permissions to the connection's service account: Project IAM Admin
(roles/resourcemanager.projectIamAdmin).
These predefined roles contain the permissions required to perform the tasks in
this document. To see the exact permissions that are required, expand the
Required permissions section:
Required permissions
Create a dataset: bigquery.datasets.create
Create, delegate, and use a connection:
bigquery.connections.*
Set the default connection: bigquery.config.*
Set service account permissions:
resourcemanager.projects.getIamPolicy and
resourcemanager.projects.setIamPolicy
The query takes several seconds to complete, after which the
embedding model appears in the sample dataset in the Explorer pane.
Because the query uses a CREATE MODEL statement to create a model, there
are no query results.
Run the stored procedure
Run the bqutil.procedure.bqml_generate_embeddings stored procedure, which
iterates through calls to the ML.GENERATE_EMBEDDING function
using the target_dataset.embedding_model model and the
bigquery-public-data.bbc_news.fulltext public data table:
In the Trusted Cloud console, go to the BigQuery page.
CALL`bqutil.procedure.bqml_generate_embeddings`("bigquery-public-data.bbc_news.fulltext",-- source table"PROJECT_ID.target_dataset.news_body_embeddings",-- destination table"PROJECT_ID.target_dataset.embedding_model",-- model"body",-- content column["filename"],-- key columns'{}'-- optional arguments encoded as a JSON string);
Replace PROJECT_ID with the project ID of the
project you are using for this tutorial.
The stored procedure creates a target_dataset.news_body_embeddings table
to contain the output of the ML.GENERATE_EMBEDDING function.
When the query is finished running, confirm that there are no rows
in the target_dataset.news_body_embeddings table that contain a retryable
error. In the query editor, run the following statement:
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Missing the information I need","missingTheInformationINeed","thumb-down"],["Too complicated / too many steps","tooComplicatedTooManySteps","thumb-down"],["Out of date","outOfDate","thumb-down"],["Samples / code issue","samplesCodeIssue","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated 2025-08-29 UTC."],[[["\u003cp\u003eThis tutorial demonstrates how to use the \u003ccode\u003ebqutil.procedure.bqml_generate_embeddings\u003c/code\u003e stored procedure to iteratively call the \u003ccode\u003eML.GENERATE_EMBEDDING\u003c/code\u003e function in BigQuery, helping to handle errors caused by exceeding usage quotas.\u003c/p\u003e\n"],["\u003cp\u003eThe tutorial guides you through creating a remote model over a \u003ccode\u003etext-embedding-005\u003c/code\u003e model and utilizing the \u003ccode\u003ebqutil.procedure.bqml_generate_embeddings\u003c/code\u003e stored procedure with a public dataset to generate embeddings.\u003c/p\u003e\n"],["\u003cp\u003eKey steps involve creating a dataset and a connection, enabling the necessary APIs (BigQuery, BigQuery Connection, and Vertex AI), granting permissions to the connection's service account, and running the stored procedure to manage the iterative process of the model.\u003c/p\u003e\n"],["\u003cp\u003eThe document outlines the costs associated with using BigQuery ML and Vertex AI, as well as the necessary setup and permissions required to complete the tutorial.\u003c/p\u003e\n"]]],[],null,["Handle quota errors by calling ML.GENERATE_EMBEDDING iteratively\n\nThis tutorial shows you how to use the BigQuery\n`bqutil.procedure.bqml_generate_embeddings` public stored procedure to iterate\nthrough calls to the\n[`ML.GENERATE_EMBEDDING` function](/bigquery/docs/reference/standard-sql/bigqueryml-syntax-generate-embedding).\nCalling the function iteratively lets you address any retryable errors that\noccur due to exceeding the\n[quotas and limits](/bigquery/quotas#cloud_ai_service_functions) that apply to\nthe function.\n\nTo review the source code for the `bqutil.procedure.bqml_generate_embeddings`\nstored procedure in GitHub, see\n[`bqml_generate_embeddings.sqlx`](https://github.com/GoogleCloudPlatform/bigquery-utils/blob/master/stored_procedures/definitions/bqml_generate_embeddings.sqlx).\nFor more information about the stored procedure parameters and usage, see the\n[README file](https://github.com/GoogleCloudPlatform/bigquery-utils/blob/master/stored_procedures/README.md#bqml_generate_embeddings-source_table-string-target_table-string-ml_model-string-content_column-string-key_columns-array-options_string-string).\n\nThis tutorial guides you through the following tasks:\n\n- Creating a [remote model over a `text-embedding-005` model](/bigquery/docs/reference/standard-sql/bigqueryml-syntax-create-remote-model).\n- Iterating through calls to the `ML.GENERATE_EMBEDDING` function, using the remote model and the `bigquery-public-data.bbc_news.fulltext` public data table with the `bqutil.procedure.bqml_generate_embeddings` stored procedure.\n\nRequired permissions\n\nTo run this tutorial, you need the following Identity and Access Management (IAM)\nroles:\n\n- Create and use BigQuery datasets, connections, and models: BigQuery Admin (`roles/bigquery.admin`).\n- Grant permissions to the connection's service account: Project IAM Admin (`roles/resourcemanager.projectIamAdmin`).\n\nThese predefined roles contain the permissions required to perform the tasks in\nthis document. To see the exact permissions that are required, expand the\n**Required permissions** section: \n\nRequired permissions\n\n- Create a dataset: `bigquery.datasets.create`\n- Create, delegate, and use a connection: `bigquery.connections.*`\n- Set the default connection: `bigquery.config.*`\n- Set service account permissions: `resourcemanager.projects.getIamPolicy` and `resourcemanager.projects.setIamPolicy`\n- Create a model and run inference:\n - `bigquery.jobs.create`\n - `bigquery.models.create`\n - `bigquery.models.getData`\n - `bigquery.models.updateData`\n - `bigquery.models.updateMetadata`\n\nYou might also be able to get these permissions with\n[custom roles](/iam/docs/creating-custom-roles) or other\n[predefined roles](/iam/docs/understanding-roles).\n\nCosts\n\n\nIn this document, you use the following billable components of Google Cloud:\n\n\n- **BigQuery ML**: You incur costs for the data that you process in BigQuery.\n- **Vertex AI**: You incur costs for calls to the Vertex AI model.\n\n\nTo generate a cost estimate based on your projected usage,\nuse the [pricing calculator](/products/calculator). \nNew Google Cloud users might be eligible for a [free trial](/free). \n\n\u003cbr /\u003e\n\nFor more information about BigQuery pricing, see\n[BigQuery pricing](/bigquery/pricing).\n\nFor more information about Vertex AI pricing, see\n[Vertex AI pricing](/vertex-ai/generative-ai/pricing).\n\nBefore you begin\n\n1. In the Google Cloud console, on the project selector page,\n select or create a Google Cloud project.\n\n | **Note**: If you don't plan to keep the resources that you create in this procedure, create a project instead of selecting an existing project. After you finish these steps, you can delete the project, removing all resources associated with the project.\n\n [Go to project selector](https://console.cloud.google.com/projectselector2/home/dashboard)\n2.\n [Verify that billing is enabled for your Google Cloud project](/billing/docs/how-to/verify-billing-enabled#confirm_billing_is_enabled_on_a_project).\n\n3.\n\n\n Enable the BigQuery, BigQuery Connection, and Vertex AI APIs.\n\n\n [Enable the APIs](https://console.cloud.google.com/flows/enableapi?apiid=bigquery.googleapis.com,bigqueryconnection.googleapis.com,aiplatform.googleapis.com)\n\nCreate a dataset\n\nCreate a BigQuery dataset to store your models and sample data:\n\n1. In the Google Cloud console, go to the **BigQuery** page.\n\n [Go to the **BigQuery** page](https://console.cloud.google.com/bigquery)\n2. In the **Explorer** pane, click your project name.\n\n3. Click more_vert **View actions \\\u003e Create dataset**.\n\n4. On the **Create dataset** page, do the following:\n\n 1. For **Dataset ID** , enter `target_dataset`.\n\n 2. For **Location type** , select **Multi-region** , and then select\n **US (multiple regions in United States)**.\n\n 3. Leave the remaining default settings as they are, and click\n **Create dataset**.\n\nCreate the text embedding generation model\n\nCreate a remote model that represents a hosted Vertex AI\n`text-embedding-005` model:\n\n1. In the Google Cloud console, go to the **BigQuery** page.\n\n [Go to BigQuery](https://console.cloud.google.com/bigquery)\n2. In the query editor, run the following statement:\n\n ```googlesql\n CREATE OR REPLACE MODEL `target_dataset.embedding_model`\n REMOTE WITH CONNECTION DEFAULT\n OPTIONS (ENDPOINT = 'text-embedding-005');\n ```\n\n The query takes several seconds to complete, after which the\n `embedding` model appears in the `sample` dataset in the **Explorer** pane.\n Because the query uses a `CREATE MODEL` statement to create a model, there\n are no query results.\n\nRun the stored procedure\n\nRun the `bqutil.procedure.bqml_generate_embeddings` stored procedure, which\niterates through calls to the `ML.GENERATE_EMBEDDING` function\nusing the `target_dataset.embedding_model` model and the\n`bigquery-public-data.bbc_news.fulltext` public data table:\n\n1. In the Google Cloud console, go to the **BigQuery** page.\n\n [Go to BigQuery](https://console.cloud.google.com/bigquery)\n2. In the query editor, run the following statement:\n\n ```googlesql\n CALL `bqutil.procedure.bqml_generate_embeddings`(\n \"bigquery-public-data.bbc_news.fulltext\", -- source table\n \"\u003cvar translate=\"no\"\u003ePROJECT_ID\u003c/var\u003e.target_dataset.news_body_embeddings\", -- destination table\n \"\u003cvar translate=\"no\"\u003ePROJECT_ID\u003c/var\u003e.target_dataset.embedding_model\", -- model\n \"body\", -- content column\n [\"filename\"], -- key columns\n '{}' -- optional arguments encoded as a JSON string\n );\n ```\n\n Replace \u003cvar translate=\"no\"\u003ePROJECT_ID\u003c/var\u003e with the project ID of the\n project you are using for this tutorial.\n\n The stored procedure creates a `target_dataset.news_body_embeddings` table\n to contain the output of the `ML.GENERATE_EMBEDDING` function.\n3. When the query is finished running, confirm that there are no rows\n in the `target_dataset.news_body_embeddings` table that contain a retryable\n error. In the query editor, run the following statement:\n\n ```googlesql\n SELECT *\n FROM `target_dataset.news_body_embeddings`\n WHERE ml_generate_embedding_status LIKE '%A retryable error occurred%';\n ```\n\n The query returns the message `No data to display`.\n\nClean up\n\n| **Caution** : Deleting a project has the following effects:\n|\n| - **Everything in the project is deleted.** If you used an existing project for the tasks in this document, when you delete it, you also delete any other work you've done in the project.\n| - **Custom project IDs are lost.** When you created this project, you might have created a custom project ID that you want to use in the future. To preserve the URLs that use the project ID, such as an `appspot.com` URL, delete selected resources inside the project instead of deleting the whole project.\n|\n|\n| If you plan to explore multiple architectures, tutorials, or quickstarts, reusing projects\n| can help you avoid exceeding project quota limits.\n1. In the Google Cloud console, go to the **Manage resources** page.\n\n [Go to Manage resources](https://console.cloud.google.com/iam-admin/projects)\n2. In the project list, select the project that you want to delete, and then click **Delete**.\n3. In the dialog, type the project ID, and then click **Shut down** to delete the project.\n\n\u003cbr /\u003e"]]