NVIDIA GB200 NVL72

The NVIDIA GB200 NVL72, built on the new Blackwell architecture paired with the Arm-based Grace CPU, delivers up to 30X faster real-time large language model (LLM) inference, up to 25X lower TCO, and up to 25X lower energy consumption than previous-generation systems.
Pricing will be added once this product becomes publicly available.

With NVIDIA DGX B300, enterprises can equip their data scientists and developers with a universal AI supercomputer to accelerate their time to insight and fully realize the benefits of AI for their businesses.

2x Intel Xeon 6776P (64 cores, 2.3GHz), 8x NVIDIA B300 SXM, 2TB RAM, 30TB NVMe (Data), 8x 400Gb HDR InfiniBand, 4x 400Gb Ethernet, 3 years support

// solution overview

NVIDIA GB200 NVL72 System

The NVIDIA GB200 NVL72 is a perfect solution for large-scale AI applications, as it can handle even trillion-parameter models with ease. Based on the new Blackwell architecture paired with the Arm-based Grace CPU, the GB200 superchip delivers groundbreaking performance while keeping TCO and power consumption up to 25X lower than previous-generation systems.

Performance
Configuration flexibility
TCO
Premium support

Substantial performance gain with GB200 NVL72

// Key features

LLM Inference: 30X vs. H100 Tensor Core GPU*
LLM Training: 4X vs. H100*
Energy Efficiency: 25X vs. H100*
Data Processing: 18X vs. CPU*

*NVIDIA: LLM inference and energy efficiency: TTL = 50 milliseconds (ms) real time, FTL = 5s, 32,768 input/1,024 output, NVIDIA HGX™ H100 scaled over InfiniBand (IB) vs. GB200 NVL72, training 1.8T MOE 4096x HGX H100 scaled over IB vs. 456x GB200 NVL72 scaled over IB. Cluster size: 32,768
A database join and aggregation workload with Snappy / Deflate compression derived from TPC-H Q4 query. Custom query implementations for x86, H100 single GPU and single GPU from GB200 NVL72 vs. Intel Xeon 8480+
Projected performance subject to change.
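
The benchmark conditions in the footnote can be sanity-checked with a little arithmetic. The sketch below is illustrative only and rests on assumptions that are not stated on this page: that an HGX H100 node carries 8 GPUs, that FTL means the latency to the first output token, and that TTL means the latency between subsequent output tokens. All names in the code are hypothetical.

```python
# Assumption: each HGX H100 node in the comparison carries 8 GPUs.
GPUS_PER_HGX_H100 = 8
# From the product name and configuration: 72 GPUs per GB200 NVL72 rack.
GPUS_PER_NVL72 = 72

# Figures quoted in the footnote.
HGX_H100_NODES = 4096
NVL72_RACKS = 456
FTL_SECONDS = 5.0      # assumed meaning: first token latency
TTL_SECONDS = 0.050    # assumed meaning: token-to-token latency
OUTPUT_TOKENS = 1024


def cluster_gpus(units: int, gpus_per_unit: int) -> int:
    """Total GPU count for a cluster built from identical units."""
    return units * gpus_per_unit


def response_time_seconds(ftl: float, ttl: float, output_tokens: int) -> float:
    """First token arrives after ftl; each remaining token adds ttl."""
    return ftl + (output_tokens - 1) * ttl


h100_gpus = cluster_gpus(HGX_H100_NODES, GPUS_PER_HGX_H100)
nvl72_gpus = cluster_gpus(NVL72_RACKS, GPUS_PER_NVL72)
total_s = response_time_seconds(FTL_SECONDS, TTL_SECONDS, OUTPUT_TOKENS)

print(f"H100 cluster GPUs:  {h100_gpus:,}")
print(f"NVL72 cluster GPUs: {nvl72_gpus:,}")
print(f"Time to stream one full response: {total_s:.2f} s")
```

Under these assumptions the H100 side matches the quoted cluster size of 32,768 GPUs, and the 456 racks give a GPU count of roughly the same size, so the comparison is between clusters of similar scale.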

NVIDIA GB200 NVL72 Configurations

// Hardware

Specification             GB200 NVL72                            GB200 Superchip
Configuration             36x Grace CPU, 72x B200 GPU            1x Grace CPU, 2x B200 GPU
FP4 Tensor Core*          1,440 PFLOPS                           40 PFLOPS
FP8 / FP6 Tensor Core*    720 PFLOPS                             20 PFLOPS
INT8 Tensor Core*         720 POPS                               20 POPS
FP16 / BF16 Tensor Core*  360 PFLOPS                             10 PFLOPS
TF32 Tensor Core*         180 PFLOPS                             5 PFLOPS
FP64 Tensor Core          3,240 TFLOPS                           90 TFLOPS
GPU Memory                Up to 13.5 TB HBM3e, 576 TB/s          Up to 384 GB HBM3e, 16 TB/s
NVLink Bandwidth          130 TB/s                               3.6 TB/s
CPU Cores                 2,592 Arm Neoverse V2 Cores            72 Arm Neoverse V2 Cores
CPU Memory                Up to 17 TB LPDDR5X, up to 18.4 TB/s   Up to 480 GB LPDDR5X, up to 512 GB/s
Product info              Datasheet
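
The rack-level throughput and bandwidth figures follow from multiplying the per-superchip figures by the 36 superchips in one NVL72 rack. The short sketch below reproduces that arithmetic for the compute and GPU bandwidth rows; the dictionary keys and function name are hypothetical, and the values are copied from the superchip column above.

```python
# 36 GB200 Superchips per rack (36x Grace CPU, 72x B200 GPU).
SUPERCHIPS_PER_RACK = 36

# Per-superchip figures copied from the GB200 Superchip column.
per_superchip = {
    "FP4 Tensor Core (PFLOPS)": 40,
    "FP8 / FP6 Tensor Core (PFLOPS)": 20,
    "INT8 Tensor Core (POPS)": 20,
    "FP16 / BF16 Tensor Core (PFLOPS)": 10,
    "TF32 Tensor Core (PFLOPS)": 5,
    "FP64 Tensor Core (TFLOPS)": 90,
    "GPU memory bandwidth (TB/s)": 16,
    "NVLink bandwidth (TB/s)": 3.6,
}


def rack_totals(per_chip: dict, count: int = SUPERCHIPS_PER_RACK) -> dict:
    """Scale every per-superchip figure by the number of superchips."""
    return {name: value * count for name, value in per_chip.items()}


totals = rack_totals(per_superchip)
for name, value in totals.items():
    print(f"{name}: {value:,.1f}")
```

The NVLink product comes out at 129.6 TB/s, which the rack column rounds to 130 TB/s; the other rows scale exactly.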

// Enterprise scale solutions

Supercomputing for any business with ease

GB200 NVL72 supports advanced networking options at speeds up to 800 gigabits per second (Gb/s). For the highest AI performance, GB200 supports the latest NVIDIA Quantum-X800 InfiniBand and Spectrum™-X800 Ethernet platforms. GB200 NVL72 also includes NVIDIA BlueField-3 data processing units (DPUs) to enable cloud networking, composable storage, zero-trust security and GPU compute elasticity in hyperscale AI clouds.

Resources

Continue Exploring

 

Technical walkthrough on how to get started with NVIDIA Quantum-X800.

NVIDIA Silicon Photonics Resources