NVIDIA L4

Ultra-efficient universal GPU for AI, video, graphics, and virtual workstations
Powered by NVIDIA Ada Lovelace architecture for next-gen performance
4th-gen Tensor Cores with FP8 precision for up to 4x faster AI inference
3rd-gen RT Cores for real-time ray tracing and neural graphics
Supports AV1 encoding/decoding with 4 video decoders and 2 encoders
Handles over 1,000 simultaneous 720p30 video streams per server
24GB GDDR6 memory with 300 GB/s bandwidth
Low-profile, single-slot design – fits standard PCIe Gen4 x16 slots
Consumes only 72W – perfect for dense and energy-conscious deployments
Passive cooling design – silent and efficient operation
Full vGPU support for multi-user virtualization (vPC, vWS, vCS)
Ideal for cloud gaming, AI inference, remote desktops, and edge computing
Certified for NVIDIA enterprise software and partner server platforms

NVIDIA L4 Tensor Core GPU: Universal Acceleration for AI, Video, and Graphics

The NVIDIA L4 Tensor Core GPU, built on the Ada Lovelace architecture, is a universal accelerator designed for enterprise workloads across the data center, cloud, and edge. Positioned as NVIDIA’s most efficient and versatile low-profile GPU, the L4 targets a broad spectrum of workloads, including AI inference, video processing, graphics rendering, and virtual workstations, all within a 72 W power envelope.

Compact Power for Any Deployment

With its single-slot, low-profile form factor, the NVIDIA L4 is engineered to integrate seamlessly into mainstream PCIe-based servers, making it the ideal solution for organizations looking to introduce or expand GPU acceleration in CPU-based environments. Whether deployed in hyperscale data centers, cloud platforms, or edge computing scenarios, the L4 ensures unmatched scalability, density, and power optimization.

Key Specifications at a Glance

FP32 Performance: 30.3 TFLOPS

TF32 Tensor Core: 120 TFLOPS (with sparsity)

FP16 Tensor Core: 242 TFLOPS (with sparsity)

BFLOAT16 Tensor Core: 242 TFLOPS (with sparsity)

FP8 Tensor Core: 485 TFLOPS (with sparsity)

INT8 Tensor Core: 485 TOPS (with sparsity)

GPU Memory: 24 GB GDDR6

Memory Bandwidth: 300 GB/s

NVENC/NVDEC: 2 NVENC, 4 NVDEC, 4 JPEG Decoders, AV1 Encode & Decode

Form Factor: Single-slot, Low-profile

System Interconnect: PCIe Gen4 x16 (64 GB/s bidirectional)

Power Consumption: 72 W

Cooling: Passive

Display Outputs: None (vGPU only)

Server Options: Compatible with NVIDIA-Certified systems (1–8 GPUs)
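
Every Tensor Core figure in the list above is quoted with sparsity. As a rough sketch, the snippet below derives dense peak figures and throughput per watt from those numbers. It assumes that structured sparsity doubles peak throughput, so the dense figure is half the quoted one; that factor and the helper names `dense_peak` and `per_watt` are illustrative assumptions, not values stated in this listing.

```python
# Peak Tensor Core figures copied from the specification list above.
# All of them are quoted "with sparsity" (TFLOPS, or TOPS for INT8).
SPARSE_PEAK = {
    "TF32": 120.0,
    "FP16": 242.0,
    "BFLOAT16": 242.0,
    "FP8": 485.0,
    "INT8": 485.0,
}

BOARD_POWER_W = 72.0    # "Power Consumption: 72 W" from the list above
SPARSITY_SPEEDUP = 2.0  # ASSUMPTION: sparsity doubles peak throughput


def dense_peak(sparse_value, speedup=SPARSITY_SPEEDUP):
    """Estimate the dense peak from a with-sparsity figure."""
    return sparse_value / speedup


def per_watt(value, power_w=BOARD_POWER_W):
    """Peak throughput per watt of board power."""
    return value / power_w


if __name__ == "__main__":
    for fmt, sparse in SPARSE_PEAK.items():
        dense = dense_peak(sparse)
        print(f"{fmt:9s} sparse={sparse:6.1f} dense~{dense:6.1f} "
              f"sparse/W={per_watt(sparse):.2f}")
```

These are peak numbers only; sustained throughput depends on the model, batch size, and memory bandwidth.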

Next-Generation Tensor Cores: AI Performance Redefined

At the heart of the L4 GPU are fourth-generation Tensor Cores, purpose-built for AI workloads. These cores offer support for newer data formats, including FP8, enabling up to 4x faster inference performance compared to the previous generation (such as the NVIDIA T4). This advancement is crucial for modern AI tasks including intelligent virtual assistants, generative models, recommendation systems, and real-time language processing.

FP8 stores each value in a single byte, half the footprint of FP16, and structured sparsity lets the Tensor Cores skip weights that have been pruned to zero. Together they reduce memory requirements while accelerating computational throughput. Developers and data scientists benefit from faster model training and inference cycles, lower latency, and improved performance across AI pipelines.
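
To make the memory effect concrete, here is a minimal sketch that estimates the weight footprint of a model at different precisions and checks it against the L4's 24 GB. The parameter counts are hypothetical examples, and the estimate covers weights only; activations, caches, and runtime overhead are deliberately ignored.

```python
# Bytes used to store one parameter at each precision.
BYTES_PER_PARAM = {"FP32": 4, "FP16": 2, "FP8": 1}

GPU_MEMORY_GB = 24  # L4 memory size from the specifications (decimal GB)


def weight_footprint_gb(num_params, fmt):
    """Weight-only memory footprint in decimal gigabytes."""
    return num_params * BYTES_PER_PARAM[fmt] / 1e9


def fits_on_gpu(num_params, fmt, memory_gb=GPU_MEMORY_GB):
    """True if the weights alone fit in GPU memory (no headroom counted)."""
    return weight_footprint_gb(num_params, fmt) <= memory_gb


if __name__ == "__main__":
    # HYPOTHETICAL model sizes, chosen only for illustration.
    for params in (7_000_000_000, 13_000_000_000):
        for fmt in ("FP32", "FP16", "FP8"):
            gb = weight_footprint_gb(params, fmt)
            print(f"{params / 1e9:.0f}B params @ {fmt}: {gb:.1f} GB, "
                  f"fits={fits_on_gpu(params, fmt)}")
```

In this sketch a 13-billion-parameter model exceeds 24 GB at FP16 but fits at FP8, which is the kind of headroom the paragraph above describes.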

Third-Generation RT Cores for Real-Time Rendering

NVIDIA pioneered real-time ray tracing with the introduction of RT Cores, and the L4 takes it further with third-generation RT Cores that double ray-triangle intersection performance over the previous generation. Combined with Shader Execution Reordering (SER), the L4 GPU enables high-performance neural graphics, immersive virtual environments, and realistic lighting simulations.

This makes the L4 an ideal choice for applications in cloud gaming, digital content creation, and engineering visualization where quality and responsiveness are paramount.

Optimized for Advanced Video and Vision AI Workloads

NVIDIA L4 is optimized for video-intensive applications. It includes four video decoders, two video encoders, and support for AV1 encoding and decoding, allowing more than 1,000 simultaneous 720p30 video streams per server. Compared with traditional CPU-based solutions, it delivers more than 120 times the video AI pipeline performance.

Additionally, the inclusion of four dedicated JPEG decoders accelerates vision AI applications, making L4 ideal for smart city surveillance, healthcare imaging, retail analytics, and media processing.
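
For a sense of scale, the sketch below converts the 1,000-stream figure into an aggregate pixel rate and a per-GPU share. It assumes 720p means 1280x720, and the eight-GPU server used in the example is an assumption taken from the 1–8 GPU range in the specifications; this page does not say which server configuration the stream figure refers to.

```python
MAX_GPUS_PER_SERVER = 8  # upper end of the "1-8 GPUs" server range


def pixel_rate(width, height, fps, streams):
    """Aggregate pixels per second across all streams."""
    return width * height * fps * streams


def streams_per_gpu(total_streams, gpus_per_server):
    """Even split of a per-server stream count across its GPUs."""
    if not 1 <= gpus_per_server <= MAX_GPUS_PER_SERVER:
        raise ValueError("gpus_per_server must be between 1 and 8")
    return total_streams / gpus_per_server


if __name__ == "__main__":
    # 720p30 is taken to mean 1280x720 at 30 frames per second.
    total = pixel_rate(1280, 720, 30, streams=1000)
    print(f"1,000 x 720p30 = {total / 1e9:.1f} Gpixel/s")
    # ASSUMPTION: a fully populated 8-GPU server.
    print(f"per GPU: {streams_per_gpu(1000, 8):.0f} streams")
```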

Energy Efficiency and Density That Scale

With a TDP of just 72 watts, the L4 delivers performance-per-watt that is unmatched in its class. This makes it an ideal solution for dense server configurations and energy-conscious deployments. Whether you’re deploying in high-density racks or at the network edge, the L4 delivers the performance and efficiency needed to scale your AI, graphics, and video workloads sustainably.

Its passive cooling solution and compact design further reduce infrastructure complexity and cost, while allowing up to eight GPUs per server to be deployed in supported systems.
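
The density claim reduces to simple power arithmetic. A minimal sketch, using the 72 W TDP and the eight-GPU limit quoted above; the budget values in the example are hypothetical, and only GPU power is counted (CPUs, memory, fans, and power-supply losses are left out).

```python
TDP_W = 72               # L4 TDP quoted above
MAX_GPUS_PER_SERVER = 8  # "up to eight GPUs per server"


def gpu_power_w(gpu_count):
    """Total GPU TDP for one server, in watts."""
    if not 1 <= gpu_count <= MAX_GPUS_PER_SERVER:
        raise ValueError("gpu_count must be between 1 and 8")
    return gpu_count * TDP_W


def gpus_within_budget(budget_w):
    """How many L4s fit in a GPU power budget, capped at eight."""
    return min(MAX_GPUS_PER_SERVER, int(budget_w // TDP_W))


if __name__ == "__main__":
    print(gpu_power_w(8))           # fully populated server
    # HYPOTHETICAL budgets, for illustration only.
    for budget in (150, 400, 1000):
        print(budget, "W ->", gpus_within_budget(budget), "GPUs")
```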

Enterprise-Ready Virtualization and vGPU Support

The NVIDIA L4 is fully compatible with the NVIDIA virtual GPU (vGPU) platform, offering support for:

NVIDIA Virtual PC (vPC)
NVIDIA RTX Virtual Workstation (vWS)
NVIDIA Virtual Compute Server (vCS)
NVIDIA GRID and Virtual Applications (vApps)

With vGPU profiles ranging from 1 GB to 24 GB, the L4 enables robust multi-user environments for remote design, engineering, rendering, and data science. Enterprise IT teams can deploy secure, high-performance virtual workstations that offer native-like user experiences from virtually anywhere.
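
As a rough sizing aid, the sketch below estimates how many virtual GPUs of one profile size fit on a single L4 by dividing the 24 GB of memory by the profile size. It assumes every vGPU on the card uses the same profile and that memory is the only limit; the profile sizes actually offered and any per-GPU instance caps come from NVIDIA's vGPU documentation, not from this calculation.

```python
GPU_MEMORY_GB = 24  # L4 memory; also the largest profile mentioned above


def max_vgpus_per_gpu(profile_gb):
    """Memory-bound count of same-size vGPU profiles on one L4."""
    if not 1 <= profile_gb <= GPU_MEMORY_GB:
        raise ValueError("profile size must be between 1 and 24 GB")
    return GPU_MEMORY_GB // profile_gb


if __name__ == "__main__":
    # Example profile sizes within the 1-24 GB range quoted above.
    for size in (1, 2, 4, 8, 24):
        print(f"{size:2d} GB profile -> {max_vgpus_per_gpu(size)} per GPU")
```

Multiplying the result by the number of GPUs in a server gives a first-pass user count per host.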

Expansive Use Cases

The L4 GPU serves as a universal platform across diverse industries and workloads. Some of its key application domains include:

AI Inference at Scale: Power recommendation engines, chatbots, NLP models, and vision systems.
Media and Broadcast: Handle real-time transcoding, multi-stream encoding/decoding, and broadcast automation.
Virtual Workstations: Enable professionals in architecture, design, and manufacturing to run complex 3D applications remotely.
Cloud Gaming and AR/VR: Deliver photorealistic experiences and low-latency interactivity for gamers and developers.
Contact Center Automation: Support virtual agents and speech-based customer service solutions powered by real-time AI.
Scientific Research: Accelerate biomolecular simulations and advanced data analytics in high-performance computing environments.

The Power of the NVIDIA AI Platform

NVIDIA’s AI platform is the most comprehensive in the industry, combining hardware, software, and ecosystem support to drive AI transformation. The L4 benefits from optimized frameworks, libraries, and tools, including TensorRT, DeepStream, CV-CUDA, and CUDA-X AI, enabling rapid development and deployment of advanced AI models.

It is also supported by NVIDIA’s extensive developer ecosystem, providing documentation, SDKs, sample projects, and cloud-native tools to speed time to deployment.

Built for Enterprise-Class Reliability and Security

The L4 is built to meet the reliability and security standards required for enterprise IT infrastructure. From measured boot with hardware root of trust to NVIDIA’s rigorous validation and certification process, the L4 ensures data center-level dependability.

It is also fully tested by NVIDIA and its partners for compatibility with major enterprise applications and platforms, helping IT teams deploy confidently across varied workloads and industries.

Conclusion

The NVIDIA L4 Tensor Core GPU redefines what’s possible with low-profile data center GPUs. With unparalleled versatility, efficiency, and performance across AI, video, graphics, and virtualization workloads, the L4 is positioned as the go-to accelerator for modern data-driven organizations.

Whether you’re powering immersive virtual experiences, accelerating AI pipelines, or deploying edge applications at scale, the L4 delivers the universal performance platform that enterprises need to stay ahead in the age of intelligent computing.

Continue Exploring

Optimized for AI Inference and Video Workloads

The NVIDIA L4 GPU is purpose-built to deliver breakthrough performance for AI inference, video processing, and graphics workloads. Designed for efficiency and versatility, it brings powerful acceleration to modern data centers and edge environments.

Energy-Efficient Performance with Low-Profile Design

Featuring a low-profile, single-slot form factor and just 72W of power consumption, the L4 is ideal for space-constrained servers and edge deployments where energy efficiency and high performance are critical.

Accelerated AI and Deep Learning Inference

With support for TensorRT and optimized libraries, the L4 delivers exceptional performance for AI inference tasks, including natural language processing, computer vision, recommendation systems, and more.

Exceptional Video Processing Capabilities

Equipped with powerful video decode and encode engines, including AV1 support, the L4 enables high-throughput, low-latency video analytics, streaming, and transcoding, making it ideal for media, surveillance, and broadcast applications.

Built on NVIDIA Ada Lovelace Architecture

Leveraging the Ada Lovelace architecture, the L4 provides advanced AI, graphics, and video capabilities with improved efficiency and scalability across enterprise workloads.

Multi-Workload Versatility

Whether used for AI inference, graphics rendering, remote desktops, or cloud gaming, the L4 delivers a flexible solution for a wide range of applications in both virtualized and bare-metal environments.

NVIDIA AI Enterprise Compatibility

Fully supported by NVIDIA AI Enterprise software, the L4 ensures seamless integration into AI pipelines with enterprise-grade tools and optimized frameworks.

Data Center Ready and Scalable

With its compact design and low power envelope, the L4 is deployable at scale in modern data centers, offering a cost-effective and scalable GPU solution for AI and video-centric workloads.
