BigQuery ML supports different input feature types for different model types.
Supported input feature types are listed in the following table:
Note: Matrix Factorization and ARIMA_PLUS models have special input feature types. The input types listed for ARIMA_PLUS_XREG are only for external regressors.
BigQuery ML supports ARRAY<numerical> as dense vector input during model training. The embedding feature is a special type of dense vector. See the ML.GENERATE_EMBEDDING function for more information.
BigQuery ML supports ARRAY<STRUCT> as sparse input during model training. Each struct contains an INT64 value that represents a zero-based index, and a value of a numeric type that holds the corresponding entry at that index.
Below is an example of a sparse tensor input for the integer array [0,1,0,0,0,0,1]:
ARRAY<STRUCT<k INT64, v INT64>>[(1, 1), (6, 1)] AS f1
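The relationship between the integer array and its (index, value) pairs can be sketched in plain Python. This is only an illustration of the encoding, not BigQuery ML code: the helper names `dense_to_sparse` and `sparse_to_dense` are hypothetical, and the sketch assumes that entries omitted from the sparse form are zero. Because the pairs do not record the array length, the reverse conversion takes it as an explicit argument.

```python
from typing import List, Tuple


def dense_to_sparse(dense: List[int]) -> List[Tuple[int, int]]:
    """Return (zero-based index, value) pairs for the non-zero entries."""
    return [(k, v) for k, v in enumerate(dense) if v != 0]


def sparse_to_dense(pairs: List[Tuple[int, int]], length: int) -> List[int]:
    """Rebuild the dense array; positions without a pair are assumed to be 0."""
    dense = [0] * length
    for k, v in pairs:
        dense[k] = v
    return dense


if __name__ == "__main__":
    dense = [0, 1, 0, 0, 0, 0, 1]
    pairs = dense_to_sparse(dense)
    print(pairs)                       # [(1, 1), (6, 1)]
    print(sparse_to_dense(pairs, 7))   # [0, 1, 0, 0, 0, 0, 1]
```

The pairs (1, 1) and (6, 1) correspond to the k and v fields of the structs in the example above.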
Last updated 2025-08-25 UTC.