Some or all of the information on this page might not apply to Trusted Cloud by S3NS.
About GPUs on Trusted Cloud by S3NS
Trusted Cloud by S3NS is focused on delivering world-class artificial intelligence (AI) infrastructure to power your most demanding GPU-accelerated workloads across a wide range of segments. You can use GPUs on Trusted Cloud by S3NS to run AI, machine learning (ML), scientific, analytics, engineering, consumer, and enterprise applications.
Through its partnership with NVIDIA, Trusted Cloud by S3NS delivers the latest GPUs while optimizing the software stack with a wide array of storage and networking options. For a full list of available GPUs, see GPU platforms.
The following sections outline the benefits of GPUs on Trusted Cloud by S3NS.
GPU-accelerated VMs
On Trusted Cloud by S3NS, you can access and provision GPUs in the way that best suits your needs. A specialized accelerator-optimized machine family is available, with pre-attached GPUs and networking capabilities that are ideal for maximizing performance. These are available in the A4X, A4, A3, A2, and G2 machine series.
Multiple provisioning options
You can provision clusters by using the accelerator-optimized machine family with any of the following open-source or Trusted Cloud by S3NS products.
Vertex AI
Vertex AI is a fully managed machine learning (ML) platform that you can use to train and deploy ML models and AI applications. In Vertex AI applications, you can use GPU-accelerated VMs to improve performance in the following ways:

- Use GPU-enabled VMs (/vertex-ai/docs/training/configure-compute) in custom training worker pools.
- Use open-source LLM models from the Vertex AI Model Garden (/vertex-ai/generative-ai/docs/open-models/use-open-models).
- Reduce prediction latency (/vertex-ai/docs/predictions/configure-compute#gpus).
- Improve the performance of Vertex AI Workbench notebook code (/vertex-ai/docs/workbench/instances/change-machine-type).
- Improve the performance of a Colab Enterprise runtime (/colab/docs/create-runtime-template).
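As an illustration of the custom training option, a GPU-attached worker pool can be requested through the gcloud CLI. This is a hedged sketch, not a definitive recipe: the region, machine type, accelerator type, and IMAGE_URI are placeholders, and the accelerator types actually available depend on your environment.

```shell
# Sketch: submit a Vertex AI custom training job whose worker pool
# has one NVIDIA T4 GPU attached. All values are illustrative;
# replace IMAGE_URI with your training container image.
gcloud ai custom-jobs create \
    --region=europe-west4 \
    --display-name=gpu-training-job \
    --worker-pool-spec=machine-type=n1-standard-8,replica-count=1,accelerator-type=NVIDIA_TESLA_T4,accelerator-count=1,container-image-uri=IMAGE_URI
```

The `--worker-pool-spec` flag pairs the machine type with the accelerator request, so the GPU is provisioned for the lifetime of the training job only.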
Cluster Director
Cluster Director (formerly known as Hypercompute Cluster) is a set of features and services designed to let you deploy and manage large numbers, up to tens of thousands, of accelerator and networking resources that function as a single homogeneous unit. This option is ideal for creating a densely allocated, performance-optimized infrastructure that has integrations for Google Kubernetes Engine (GKE) and Slurm schedulers. Cluster Director helps you build infrastructure that is specifically designed for running AI, ML, and HPC workloads. For more information, see Cluster Director.
To get started with Cluster Director, see Choose a deployment strategy.
Compute Engine
You can also create and manage individual VMs, or small clusters of VMs, with attached GPUs on Compute Engine. This method is mostly used for running graphics-intensive workloads, simulation workloads, or small-scale ML model training.
The following table shows the methods that you can use to create VMs that have GPUs attached:
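One of those methods is the gcloud CLI. The following is a minimal sketch, assuming a Debian image and an N1 machine type with a T4 GPU; the VM name, zone, and machine type are placeholders, and GPU availability varies by zone.

```shell
# Sketch: create a single VM with one NVIDIA T4 GPU attached.
# GPU VMs cannot live-migrate, so host maintenance must be set
# to TERMINATE.
gcloud compute instances create my-gpu-vm \
    --zone=europe-west4-b \
    --machine-type=n1-standard-8 \
    --accelerator=type=nvidia-tesla-t4,count=1 \
    --image-family=debian-12 \
    --image-project=debian-cloud \
    --maintenance-policy=TERMINATE
```

After the VM starts, you still need to install the NVIDIA driver on it before the GPU is usable by applications.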
Cloud Run
You can configure GPUs for your Cloud Run instances. GPUs are ideal for running AI inference workloads that use large language models on Cloud Run.
On Cloud Run, consult these resources for running AI workloads on GPUs:

- Configure GPUs for a Cloud Run service (/run/docs/configuring/services/gpu)
- Load large ML models on Cloud Run with GPUs (/run/docs/configuring/services/gpu-best-practices#model-loading-recommendations)
- Tutorial: Run LLM inference on Cloud Run GPUs with Ollama (/run/docs/tutorials/gpu-gemma2-with-ollama)
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated (UTC): 2025-08-18.