
Anomaly detection overview

Anomaly detection is a data mining technique that you can use to identify data deviations in a given dataset. For example, if the return rate for a given product increases substantially from the baseline for that product, that might indicate a product defect or potential fraud. You can use anomaly detection to detect critical incidents, such as technical issues, or opportunities, such as changes in consumer behavior.

It can be challenging to determine what counts as anomalous data. If you aren't certain, or you don't have labeled data to train a model on, you can use unsupervised machine learning to perform anomaly detection. Use the `AI.DETECT_ANOMALIES` function or the `ML.DETECT_ANOMALIES` function with one of the following models to detect anomalies in training data or new serving data:

Data type	Model type	Function	What the function does
Time series	`TimesFM`	`AI.DETECT_ANOMALIES`	Detect the anomalies in the time series.
Time series	`ARIMA_PLUS`	`ML.DETECT_ANOMALIES`	Detect the anomalies in the time series.
Time series	`ARIMA_PLUS_XREG`	`ML.DETECT_ANOMALIES`	Detect the anomalies in the time series with external regressors.
Independent and identically distributed random variables (IID)	K-means	`ML.DETECT_ANOMALIES`	Detect anomalies based on the shortest distance among the normalized distances from the input data to each cluster centroid. For a definition of normalized distances, see the k-means model output for the `ML.DETECT_ANOMALIES` function.
Independent and identically distributed random variables (IID)	Autoencoder	`ML.DETECT_ANOMALIES`	Detect anomalies based on the reconstruction loss in terms of mean squared error. For more information, see `ML.RECONSTRUCTION_LOSS`. The `ML.RECONSTRUCTION_LOSS` function can retrieve all types of reconstruction loss.
Independent and identically distributed random variables (IID)	PCA	`ML.DETECT_ANOMALIES`	Detect anomalies based upon the reconstruction loss in terms of mean squared error.
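
The two scoring ideas used for IID data in the table can be illustrated with a small, self-contained Python sketch. This is not how BigQuery ML implements these functions; it only shows the shape of the calculation. The centroids, per-cluster radii, threshold, and function names are all hypothetical, and dividing the distance by a per-cluster radius is an assumed form of normalization (see the k-means model output for the `ML.DETECT_ANOMALIES` function for the actual definition).

```python
import math

# Hypothetical cluster centroids and per-cluster radii (illustrative values only).
CENTROIDS = [(0.0, 0.0), (10.0, 10.0)]
RADII = [1.0, 2.0]


def nearest_normalized_distance(point, centroids, radii):
    """Return (cluster_index, distance) for the smallest normalized distance.

    Assumption: normalized distance = Euclidean distance / cluster radius.
    """
    best = None
    for index, (centroid, radius) in enumerate(zip(centroids, radii)):
        normalized = math.dist(point, centroid) / radius
        if best is None or normalized < best[1]:
            best = (index, normalized)
    return best


def mean_squared_error(original, reconstructed):
    """Reconstruction loss as the mean squared error between two rows."""
    squared = [(o - r) ** 2 for o, r in zip(original, reconstructed)]
    return sum(squared) / len(squared)


def is_anomaly(score, threshold):
    """Flag a row when its score exceeds a caller-chosen (hypothetical) threshold."""
    return score > threshold


if __name__ == "__main__":
    for point in [(0.5, 0.0), (10.0, 12.0), (5.0, 5.0)]:
        cluster, distance = nearest_normalized_distance(point, CENTROIDS, RADII)
        print(point, cluster, round(distance, 3), is_anomaly(distance, 3.0))

    # A row that an autoencoder or PCA model reconstructs poorly gets a high loss.
    print(mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

In both cases a larger score means the row is less like the data the model was trained on: a point far from every centroid, or a row the model cannot reconstruct well, is the one most likely to be flagged.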

If you already have labeled data that identifies anomalies, you can perform anomaly detection by using the ML.PREDICT function with one of the following supervised machine learning models:

Recommended knowledge

By using the default settings in the CREATE MODEL statements and the inference functions, you can create and use an anomaly detection model even without much ML knowledge. However, having basic knowledge about ML development helps you optimize both your data and your model to deliver better results. We recommend using the following resources to develop familiarity with ML techniques and processes: