NVIDIA DGX Cloud

  • Fully Managed AI Platform: Access to NVIDIA’s high-performance AI infrastructure without the need for on-premises hardware.

  • Advanced GPU Clusters: Each instance includes 8 NVIDIA H100 or A100 80GB Tensor Core GPUs, totaling 640GB of GPU memory, suitable for training large-scale AI models.

  • High-Speed Networking: Utilizes NVIDIA NVLink and NVSwitch technologies for high-bandwidth, low-latency interconnects between GPUs, enhancing multi-node training performance.

  • Optimized Storage Solutions: Equipped with high-performance NVMe storage to support demanding AI workloads.

  • DGX Cloud Create: A Kubernetes-based platform for orchestrating AI workloads, facilitating efficient training and fine-tuning of models.

  • Serverless Inference: Deploy AI models with automatic scaling and efficient GPU utilization, eliminating the need to manage underlying infrastructure.

  • Benchmarking Tools: Provides templates and guidelines for evaluating model performance, supporting scalability up to 2,048 GPUs.

  • Multi-Cloud Support: Available through major cloud providers like Oracle Cloud Infrastructure, Microsoft Azure, Google Cloud, and Amazon Web Services, offering flexibility and global scalability.

  • Expert Support: Direct collaboration with NVIDIA engineers to optimize model performance and deployment strategies.

  • Predictable Pricing: Transparent monthly pricing starting at $36,999 per instance, inclusive of hardware, software, storage, and 24/7 support.

Diagram illustrating NVIDIA DGX Cloud's unified AI platform, showcasing layers for AI development, deployment, and scalable GPU infrastructure across multi-cloud environments.

NVIDIA DGX Cloud: A Unified Cloud Platform for AI Development and Deployment

Overview

NVIDIA DGX Cloud is a fully managed cloud platform from NVIDIA that gives organizations access to advanced computational power for AI training and inference without the need for complex on-premises infrastructure.

Technical Specifications

  • Graphics Processing Units (GPUs):
    Each DGX Cloud instance includes 8 NVIDIA H100 or A100 80GB Tensor Core GPUs, providing a total of 640 GB of GPU memory.
  • Networking and Connectivity:
    Utilizes NVIDIA NVLink and NVSwitch technologies to deliver high-bandwidth inter-GPU communication, combined with RDMA networking to minimize latency and maximize data transfer speeds.
  • Storage:
    Equipped with high-performance NVMe storage optimized for demanding AI workloads.
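
As a back-of-the-envelope check, the per-instance totals above follow directly from the GPU count and per-GPU memory. The helper below is an illustrative sketch of that arithmetic, not an NVIDIA API:

```python
# Illustrative sizing arithmetic for DGX Cloud instances (not an NVIDIA API).

GPUS_PER_INSTANCE = 8
GPU_MEMORY_GB = 80  # H100 or A100 80GB Tensor Core GPU


def instance_gpu_memory_gb(gpus: int = GPUS_PER_INSTANCE,
                           per_gpu_gb: int = GPU_MEMORY_GB) -> int:
    """Total GPU memory available in a single instance."""
    return gpus * per_gpu_gb


def instances_for_gpus(total_gpus: int,
                       gpus_per_instance: int = GPUS_PER_INSTANCE) -> int:
    """Number of 8-GPU instances needed to reach a target GPU count."""
    return -(-total_gpus // gpus_per_instance)  # ceiling division


print(instance_gpu_memory_gb())   # 640 GB per instance
print(instances_for_gpus(2048))   # 256 instances for a 2,048-GPU run
```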

Infrastructure Components of DGX Cloud

  • Management Software:
    NVIDIA’s Base Command Platform is employed to orchestrate and monitor AI training workloads efficiently.
  • High-Performance Networking:
    DGX Cloud leverages low-latency, high-bandwidth networking to enable seamless scalability for distributed AI tasks.

Key Features and Services

  • DGX Cloud Create:
    A Kubernetes-based AI workload orchestration platform that facilitates training and fine-tuning of models. This fully managed service includes both a user-friendly graphical interface and command-line tools for interacting with GPU clusters.
  • Serverless Inference:
    Enables scalable and efficient deployment of AI models without the need to manage infrastructure, leveraging automatic GPU resource optimization.
  • Benchmarking Capabilities:
    Offers ready-to-use templates and guidelines to benchmark AI models such as Llama 3.1, Grok-1, and NeMo Megatron at scale, supporting configurations up to 2,048 GPUs.
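
When benchmarking at this scale, a common derived metric is scaling efficiency: measured cluster throughput divided by ideal linear scaling of single-GPU throughput. A minimal sketch of that calculation, using hypothetical throughput numbers rather than published benchmark results:

```python
def scaling_efficiency(throughput_n: float, throughput_1: float,
                       n_gpus: int) -> float:
    """Fraction of ideal linear scaling achieved on n_gpus."""
    return throughput_n / (throughput_1 * n_gpus)


# Hypothetical tokens/sec figures, for illustration only.
single_gpu = 1_000.0
cluster = 1_800_000.0  # measured across 2,048 GPUs

print(f"{scaling_efficiency(cluster, single_gpu, 2048):.1%}")  # 87.9%
```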

Collaborations with Cloud Service Providers

NVIDIA DGX Cloud is integrated with leading cloud service providers including Oracle Cloud Infrastructure (OCI), Microsoft Azure, Google Cloud, and Amazon Web Services (AWS), offering global-scale access to high-performance computing resources.

Use Cases and Clients

  • Amgen:
    Utilizes DGX Cloud in conjunction with the BioNeMo software suite to accelerate biologics drug discovery.
  • Cerence:
    Trains automotive-specific large language models to enhance in-vehicle user experiences.
  • ServiceNow:
    Develops intelligent virtual assistants and customer service agents using DGX Cloud infrastructure.
  • Deloitte:
    Employs DGX Cloud and NVIDIA BioNeMo to advance large language models in pharmaceutical research and drug development.

Key Advantages

  • Access to NVIDIA Expertise:
    Clients benefit from direct collaboration with NVIDIA engineers to optimize model performance and deployment.
  • Multi-Cloud Support:
    DGX Cloud is available across various cloud platforms, including Oracle Cloud Infrastructure, Microsoft Azure, and Google Cloud, offering flexibility and resilience.

Pricing

Each DGX Cloud instance, equipped with 8 GPUs and 640 GB of GPU memory, starts at $36,999 per month.
This pricing includes hardware, software, storage, and 24/7 expert support.
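
Because the rate is a flat monthly figure per instance, budgeting extends linearly with instance count and duration. A small sketch, assuming the $36,999 baseline applies per instance per month with no volume discounts:

```python
MONTHLY_RATE_USD = 36_999  # per 8-GPU instance, per the listed starting price


def dgx_cloud_cost(instances: int, months: int,
                   rate: int = MONTHLY_RATE_USD) -> int:
    """Total cost in USD for a fixed reservation; ignores any
    negotiated or volume discounts."""
    return instances * months * rate


# e.g. a 256-instance (2,048-GPU) cluster for one month:
print(f"${dgx_cloud_cost(256, 1):,}")  # $9,471,744
```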

Conclusion

NVIDIA DGX Cloud provides a robust end-to-end solution for organizations aiming to accelerate and scale their AI initiatives.

With state-of-the-art infrastructure, cutting-edge software platforms, and specialized support, DGX Cloud empowers enterprises to fast-track AI innovation across industries.

NVIDIA DGX Cloud at a Glance

  • GPU Nodes: 8× NVIDIA A100 80 GB or H100 80 GB Tensor Core GPUs (640 GB total)
  • Memory & Storage: 10 TB storage per instance; scalable egress/bandwidth (10 TB/month baseline)
  • Network Fabric: High-speed, low-latency interconnect for multi-node scaling
  • Software Platform: NVIDIA Base Command Platform, AI Enterprise, NIM APIs, NeMo Curator, serverless inference, benchmarking
  • Support & Services: 24/7 expert support, technical account & customer success managers, single-point contact
  • Pricing: Predictable monthly rate including hardware, software, storage, egress, support
  • Hybrid & Multi-Cloud: Deployable across public clouds and on-premises via unified Base Command interface


Platform Highlights

  • Fully Managed Multi-Node AI Platform

    High-performance GPU clusters (8× A100/H100, 640 GB total GPU memory) delivered as a service with turnkey deployment.

  • NVIDIA-Optimized Software Stack

    Powered by NVIDIA Base Command Platform and AI Enterprise software, including NIM microservices, NeMo Curator, serverless inference, and benchmarking workflows.

  • Serverless Inference with Autoscaling

    Scales down to zero during inactivity, reducing costs and enabling flexible deployment via API/CLI/UI.

  • Cloud-Agnostic Hybrid Integration

    Available on multiple cloud partners with unified management across cloud and on-premises environments.

  • Expert Support & Predictable Pricing

    Includes 24/7 support, a dedicated technical account manager, and transparent monthly pricing covering compute, storage, egress, software, and consulting.

  • DGX Cloud Lepton Marketplace Access

    Enables on-demand access to GPU capacity from a global cloud-provider network, with real-time health insights and region-based workload sovereignty.
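
The scale-to-zero behavior described above can be illustrated with a toy autoscaling policy. This is a conceptual sketch of the idea, not DGX Cloud's actual scheduler, and the capacity and cap values are arbitrary:

```python
def desired_replicas(queued_requests: int, per_replica_capacity: int = 4,
                     max_replicas: int = 8) -> int:
    """Toy scale-to-zero policy: zero replicas when idle, otherwise enough
    replicas to cover the request queue, capped at max_replicas."""
    if queued_requests == 0:
        return 0  # scale to zero during inactivity -> no GPU cost
    needed = -(-queued_requests // per_replica_capacity)  # ceiling division
    return min(needed, max_replicas)


for queue_depth in (0, 1, 9, 100):
    print(queue_depth, desired_replicas(queue_depth))
```

Real autoscalers add cooldown windows and smoothing so replicas are not thrashed up and down on every queue fluctuation, but the core mapping from demand to replica count looks like this.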
