You can create and query Apache Iceberg tables in a
BigQuery notebook using open-source engines, such as
Apache Spark. These tables are regular
Iceberg tables with metadata stored in BigLake metastore, so the
same table can be queried from both BigQuery and
Spark.
Before you begin
- Create an Iceberg table while using Spark in a
BigQuery notebook. The table schema is stored in
BigLake metastore. For example, you can create the table with
Dataproc, Dataproc Serverless, or a
stored procedure.
View and query a table
After creating your BigQuery resources in
Spark, you can view and query them in the
Trusted Cloud console. The following example shows you the general
steps to query a metastore table using interactive Spark:
Use the custom Iceberg catalog:
USE `CATALOG_NAME`;
Replace the following:
CATALOG_NAME: the name of the Spark catalog that you're using with your
SQL job.
Create a namespace:
CREATE NAMESPACE IF NOT EXISTS NAMESPACE_NAME;
Replace the following:
NAMESPACE_NAME: the namespace name that references your Spark table.
Use the created namespace:
USE NAMESPACE_NAME;
Create an Iceberg table:
CREATE TABLE TABLE_NAME (id int, data string) USING ICEBERG;
Replace the following:
TABLE_NAME: a name for your Iceberg table.
Insert a table row:
INSERT INTO TABLE_NAME VALUES (1, 'first row');
Use the Trusted Cloud console to do one of the following:
- View the table metadata
- Query the table:
SELECT * FROM `TABLE_NAME`;
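The statements in the steps above can also be assembled and run in order from a notebook cell. The following is a minimal sketch, not an official sample: the helper names `quote_identifier`, `build_statements`, and `run_statements` are hypothetical, and the catalog, namespace, and table names are placeholders. It assumes you pass in a callable such as PySpark's `spark.sql` from a session whose catalog is already configured, and it runs the final SELECT in Spark rather than in the console.

```python
from typing import Callable, List


def quote_identifier(name: str) -> str:
    """Wrap a Spark SQL identifier in backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def build_statements(catalog: str, namespace: str, table: str) -> List[str]:
    """Return the statements from the steps above, in execution order.

    Trailing semicolons are omitted because each statement is meant to be
    passed individually to a single-statement runner.
    """
    cat, ns, tbl = (quote_identifier(n) for n in (catalog, namespace, table))
    return [
        f"USE {cat}",                                               # select the catalog
        f"CREATE NAMESPACE IF NOT EXISTS {ns}",                     # create the namespace
        f"USE {ns}",                                                # switch to it
        f"CREATE TABLE {tbl} (id int, data string) USING ICEBERG",  # create the table
        f"INSERT INTO {tbl} VALUES (1, 'first row')",               # insert one row
        f"SELECT * FROM {tbl}",                                     # read the row back
    ]


def run_statements(run_sql: Callable[[str], object], statements: List[str]) -> object:
    """Run each statement in order and return the result of the last one."""
    result = None
    for statement in statements:
        result = run_sql(statement)
    return result


if __name__ == "__main__":
    # Placeholder names (assumptions); replace them with your own values.
    for statement in build_statements("my_catalog", "my_namespace", "my_table"):
        print(statement)
```

With a live session, `run_statements(spark.sql, build_statements(...))` would pass each statement to Spark in turn and return the DataFrame produced by the final SELECT. Because the CREATE TABLE statement has no IF NOT EXISTS clause, running the sequence a second time against the same table name is expected to fail at that step.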
What's next
- Set up additional BigLake metastore features.
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-08-29 UTC.