Regression overview
A common use case for machine learning is predicting the value of a numerical
metric for new data by using a model trained on similar historical data.
For example, you might want to predict a house's expected sale price. By using
the house's location and characteristics as features, you can compare this house
to similar houses that have already sold, and use their sale prices to estimate
the house's sale price.
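You can use any of the following models in combination with the ML.PREDICT
function to perform regression:
- Linear regression models: use linear regression by setting the MODEL_TYPE
  option to LINEAR_REG.
- Boosted tree models: use a gradient boosted decision tree by setting the
  MODEL_TYPE option to BOOSTED_TREE_REGRESSOR.
- Random forest models: use a random forest by setting the MODEL_TYPE option
  to RANDOM_FOREST_REGRESSOR.
- Deep neural network (DNN) models: use a neural network by setting the
  MODEL_TYPE option to DNN_REGRESSOR.
- Wide & Deep models: use wide & deep learning by setting the MODEL_TYPE
  option to DNN_LINEAR_COMBINED_REGRESSOR.
- AutoML models: use an AutoML regression model by setting the MODEL_TYPE
  option to AUTOML_REGRESSOR.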
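As a rough illustration of the house-price scenario, the following sketch trains a linear regression model and then scores new rows with ML.PREDICT. The dataset, table, model, and column names (mydataset, sold_houses, new_listings, house_price_model, sale_price, zip_code, square_feet, bedrooms) are hypothetical placeholders, not names from this page. Only the model type and the label column are set, so every other training option keeps its default.

```sql
-- Train a regression model on historical sales.
-- All dataset, table, and column names here are hypothetical.
CREATE OR REPLACE MODEL `mydataset.house_price_model`
OPTIONS (
  model_type = 'LINEAR_REG',          -- one of the regression model types
  input_label_cols = ['sale_price']   -- the numerical value to predict
) AS
SELECT
  sale_price,    -- label
  zip_code,      -- location feature
  square_feet,   -- characteristic feature
  bedrooms       -- characteristic feature
FROM `mydataset.sold_houses`;

-- Predict the sale price for houses that have not sold yet.
SELECT *
FROM ML.PREDICT(
  MODEL `mydataset.house_price_model`,
  (
    SELECT zip_code, square_feet, bedrooms
    FROM `mydataset.new_listings`
  )
);
```

For a regression model, the ML.PREDICT output includes the input columns plus a column named predicted_ followed by the label column name, which would be predicted_sale_price in this sketch. Trying a different model family should only require changing the model_type value, for example to BOOSTED_TREE_REGRESSOR.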
Recommended knowledge
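By using the default settings in the CREATE MODEL statements and the
ML.PREDICT function, you can create and use a regression model even
without much ML knowledge. However, basic knowledge of ML development
helps you optimize both your data and your model to deliver better results.
We recommend the following resources for developing familiarity with ML
techniques and processes:
- Machine Learning Crash Course
  (https://developers.google.com/machine-learning/crash-course)
- Intro to Machine Learning
  (https://www.kaggle.com/learn/intro-to-machine-learning)
- Intermediate Machine Learning
  (https://www.kaggle.com/learn/intermediate-machine-learning)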
Last updated 2025-08-25 UTC.