NVIDIA GB200 NVL72

The NVIDIA GB200 NVL72, built on the new Blackwell architecture and paired with the Arm-based Grace CPU, delivers up to 30X faster real-time large language model (LLM) inference, 25X lower TCO, and 25X lower energy consumption.
Pricing will be added once this product is available to the public.

With NVIDIA DGX B300, enterprises can equip their data scientists and developers with a universal AI supercomputer to accelerate their time to insight and fully realize the benefits of AI for their businesses.

2x Intel Xeon 6776P (64 cores, 2.3GHz), 8x NVIDIA B300 SXM, 2TB RAM, 30TB NVMe (Data), 8x 400Gb HDR InfiniBand, 4x 400Gb Ethernet, 3 years of support

// solution overview

NVIDIA GB200 NVL72 System

The NVIDIA GB200 NVL72 is built for large-scale AI applications and can handle trillion-parameter models. Based on the new Blackwell architecture paired with the Arm-based Grace CPU, the GB200 superchip delivers a large performance gain while keeping TCO and power consumption up to 25X lower than previous-generation systems.

Performance

Configuration flexibility

TCO

Premium support

Substantial performance gain with GB200 NVL72

// Key features

LLM Inference

30X

vs H100 Tensor Core GPU*

LLM Training

4X

vs H100*

Energy Efficiency

25X

vs H100*

Data Processing

18X

vs CPU*

*NVIDIA: LLM inference and energy efficiency: TTL = 50 milliseconds (ms) real time, FTL = 5s, 32,768 input/1,024 output, NVIDIA HGX™ H100 scaled over InfiniBand (IB) vs. GB200 NVL72, training 1.8T MOE 4096x HGX H100 scaled over IB vs. 456x GB200 NVL72 scaled over IB. Cluster size: 32,768
Data processing: a database join and aggregation workload with Snappy / Deflate compression, derived from the TPC-H Q4 query. Custom query implementations for x86, a single H100 GPU, and a single GPU from GB200 NVL72 vs. Intel Xeon 8480+.
Projected performance subject to change.
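The GPU counts behind the training comparison in the footnote can be checked with a few lines of arithmetic. This is a minimal sketch and not part of NVIDIA's methodology: it assumes each HGX H100 unit carries 8 GPUs (the footnote does not say so) and takes 72 GPUs per GB200 NVL72 from the system configuration. All variable names are illustrative.

```python
# Assumption: one HGX H100 unit holds 8 GPUs (not stated in the footnote).
GPUS_PER_HGX_H100 = 8
# 72 GPUs per GB200 NVL72, per the system configuration.
GPUS_PER_NVL72 = 72

# Unit counts and cluster size quoted in the footnote.
hgx_h100_units = 4096
gb200_nvl72_units = 456
quoted_cluster_size = 32_768

h100_gpus = hgx_h100_units * GPUS_PER_HGX_H100
gb200_gpus = gb200_nvl72_units * GPUS_PER_NVL72

print(f"H100 GPUs in the baseline cluster:   {h100_gpus}")
print(f"Blackwell GPUs in the GB200 cluster: {gb200_gpus}")
print(f"Difference from quoted cluster size: {gb200_gpus - quoted_cluster_size}")
```

Under that assumption the H100 side lands exactly on the quoted cluster size, and the GB200 side lands within one rack of it, which suggests the comparison is made at roughly equal GPU count.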

NVIDIA GB200 NVL72 Configurations

// Hardware

	GB200 NVL72	GB200 Superchip
Configuration	36x Grace CPU, 72x B200 GPU	1x Grace CPU, 2x B200 GPU
FP4 Tensor Core*	1,440 PFLOPS	40 PFLOPS
FP8 / FP6 Tensor Core*	720 PFLOPS	20 PFLOPS
INT8 Tensor Core*	720 POPS	20 POPS
FP16 / BF16 Tensor Core*	360 PFLOPS	10 PFLOPS
TF32 Tensor Core*	180 PFLOPS	5 PFLOPS
FP64 Tensor Core	3,240 TFLOPS	90 TFLOPS
GPU Memory / Bandwidth	Up to 13.5 TB HBM3e, 576 TB/s	Up to 384 GB HBM3e, 16 TB/s
NVLink Bandwidth	130 TB/s	3.6 TB/s
CPU Cores	2,592 Arm Neoverse V2 Cores	72 Arm Neoverse V2 Cores
CPU Memory / Bandwidth	Up to 17 TB LPDDR5X, up to 18.4 TB/s	Up to 480 GB LPDDR5X, up to 512 GB/s
Product info	Datasheet
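The rack-level column follows from the superchip column, because one GB200 NVL72 contains 36 GB200 Superchips (36 Grace CPUs and 72 GPUs). The sketch below scales a subset of the per-superchip figures by 36. It is an illustration only: the dictionary keys and the `rack_totals` helper are hypothetical names, and the published rack figures are rounded (for example, 3.6 TB/s x 36 is quoted as 130 TB/s).

```python
# Number of GB200 Superchips in one NVL72 rack (36 Grace CPUs, 72 GPUs).
SUPERCHIPS_PER_RACK = 36

# Per-superchip figures taken from the GB200 Superchip column.
# Key names are illustrative, not an NVIDIA naming scheme.
superchip = {
    "fp4_pflops": 40,
    "fp8_pflops": 20,
    "fp16_pflops": 10,
    "tf32_pflops": 5,
    "fp64_tflops": 90,
    "gpu_mem_bandwidth_tb_s": 16,
    "nvlink_tb_s": 3.6,
    "cpu_cores": 72,
    "cpu_mem_gb": 480,
}


def rack_totals(per_superchip, count=SUPERCHIPS_PER_RACK):
    """Scale every per-superchip figure by the number of superchips."""
    return {name: value * count for name, value in per_superchip.items()}


totals = rack_totals(superchip)
for name, value in totals.items():
    print(f"{name}: {value:g}")
```

Scaling this way is also a quick consistency check on the table: 72 cores x 36 gives 2,592 CPU cores per rack, and 480 GB x 36 gives 17,280 GB, which matches the "up to 17 TB" of LPDDR5X.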

// Enterprise scale solutions

Supercomputing for any business with ease

GB200 NVL72 supports advanced networking options at speeds up to 800 gigabits per second (Gb/s). For the highest AI performance, GB200 supports the latest NVIDIA Quantum-X800 InfiniBand and Spectrum™-X800 Ethernet platforms. GB200 NVL72 also includes NVIDIA BlueField-3 data processing units (DPUs) to enable cloud networking, composable storage, zero-trust security and GPU compute elasticity in hyperscale AI clouds.

Resources

Continue Exploring

Technical walkthrough on how to get started with NVIDIA Quantum-X800.

NVIDIA Silicon Photonics Resources
