Some or all of the information on this page might not apply to Trusted Cloud by S3NS.
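Anomaly detection overview
Anomaly detection is a data mining technique that identifies deviations in a given dataset. For example, if the return rate for a product rises substantially above that product's baseline, that might indicate a product defect or potential fraud. You can use anomaly detection to detect critical incidents, such as technical issues, or opportunities, such as changes in consumer behavior.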
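One challenge when you use anomaly detection is determining what counts as anomalous data. If you have labeled data that identifies anomalies, you can perform anomaly detection by using the ML.PREDICT function with one of the following supervised machine learning models: linear and logistic regression, boosted trees, random forest, deep neural network (DNN), Wide & Deep, or AutoML.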
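As a rough illustration of the supervised path, the following sketch trains a logistic regression model on labeled rows and then scores new rows with ML.PREDICT. The dataset, table, and column names (`mydataset`, `labeled_transactions`, `new_transactions`, `transaction_id`, `amount`, `merchant_category`, `is_anomaly`) are hypothetical placeholders, not names from this page; substitute your own.

```sql
-- All dataset, table, and column names below are hypothetical.
-- Train a logistic regression classifier on rows that are already
-- labeled as anomalous or not (label column: is_anomaly).
CREATE OR REPLACE MODEL `mydataset.anomaly_classifier`
OPTIONS (
  model_type = 'LOGISTIC_REG',
  input_label_cols = ['is_anomaly']
) AS
SELECT
  amount,
  merchant_category,
  is_anomaly
FROM `mydataset.labeled_transactions`;

-- Score new, unlabeled rows. ML.PREDICT returns the input columns
-- plus prediction columns named after the label, such as
-- predicted_is_anomaly.
SELECT
  transaction_id,
  predicted_is_anomaly
FROM ML.PREDICT(
  MODEL `mydataset.anomaly_classifier`,
  (
    SELECT transaction_id, amount, merchant_category
    FROM `mydataset.new_transactions`
  )
);
```

Any of the other supervised model types could be swapped in by changing `model_type`; logistic regression is used here only because it needs the fewest options.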
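If you aren't certain what counts as anomalous data, or you don't have labeled data to train a model on, you can use unsupervised machine learning to perform anomaly detection. Use the ML.DETECT_ANOMALIES function with one of the following models to detect anomalies in training data or new serving data: ARIMA_PLUS or ARIMA_PLUS_XREG for time series data, or K-means, autoencoder, or PCA for independent and identically distributed data.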
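The unsupervised path might look like the following sketch, which shows one K-means model for tabular data and one ARIMA_PLUS model for a time series. The dataset, table, and column names are hypothetical, and the `num_clusters`, `contamination`, and `anomaly_prob_threshold` values are illustrative assumptions rather than recommendations.

```sql
-- All dataset, table, and column names below are hypothetical.

-- 1. K-means on tabular data: no label column is needed.
CREATE OR REPLACE MODEL `mydataset.kmeans_anomaly_model`
OPTIONS (
  model_type = 'KMEANS',
  num_clusters = 4
) AS
SELECT amount, item_count
FROM `mydataset.transactions`;

-- contamination = 0.02 assumes about 2% of the training data is
-- anomalous; that proportion determines the cutoff behind the
-- is_anomaly output column.
SELECT *
FROM ML.DETECT_ANOMALIES(
  MODEL `mydataset.kmeans_anomaly_model`,
  STRUCT(0.02 AS contamination),
  TABLE `mydataset.new_transactions`
);

-- 2. ARIMA_PLUS on a time series, for example daily return counts.
CREATE OR REPLACE MODEL `mydataset.daily_returns_arima`
OPTIONS (
  model_type = 'ARIMA_PLUS',
  time_series_timestamp_col = 'return_date',
  time_series_data_col = 'return_count'
) AS
SELECT return_date, return_count
FROM `mydataset.daily_returns`;

-- With no input table, anomalies are detected in the training data.
-- A point is flagged when its anomaly probability exceeds the threshold.
SELECT *
FROM ML.DETECT_ANOMALIES(
  MODEL `mydataset.daily_returns_arima`,
  STRUCT(0.95 AS anomaly_prob_threshold)
);
```

In both cases the result includes an `is_anomaly` column, so a `WHERE is_anomaly` filter is one way to keep only the flagged rows.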
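Recommended knowledge
By using the default settings in the CREATE MODEL statements and the inference functions, you can create and use an anomaly detection model even without much ML knowledge. However, basic knowledge of ML development helps you optimize both your data and your model to deliver better results. We recommend the following resources for developing familiarity with ML techniques and processes: Machine Learning Crash Course (https://developers.google.com/machine-learning/crash-course), Intro to Machine Learning (https://www.kaggle.com/learn/intro-to-machine-learning), and Intermediate Machine Learning (https://www.kaggle.com/learn/intermediate-machine-learning).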
Last updated (UTC): 2025-08-17.