NVIDIA L40S

  • Powered by the advanced Ada Lovelace architecture
  • Massive 48GB GDDR6 ECC memory for large AI and graphics workloads

  • Ultra-fast 864 GB/s memory bandwidth

  • Features 18,176 CUDA cores for extreme parallel processing

  • Includes 568 Tensor Cores optimized for AI training & inference

  • 142 RT Cores for real-time ray tracing and rendering

  • Incredible FP8 performance up to 1,466 TFLOPS (with sparsity)

  • Supports AV1 encode/decode with 3 NVENC + 3 NVDEC engines

  • vGPU software support for virtualized environments

  • Ideal for LLM, generative AI, 3D rendering, and video workflows

  • Passive cooling for optimal data center integration

  • Certified NEBS Level 3 for telecom and enterprise environments

  • Backed by Secure Boot with Root of Trust for enhanced security

  • Dual-slot form factor, 350W power-efficient design

  • Designed for AI, graphics, video, and simulation—all-in-one GPU


NVIDIA L40S GPU: Ultimate AI and Graphics Powerhouse for Modern Data Centers

In the evolving landscape of artificial intelligence and accelerated computing, the NVIDIA L40S GPU stands as a new pinnacle of performance, versatility, and innovation. Engineered on the cutting-edge Ada Lovelace architecture, the L40S is designed to address the growing demands of multi-modal generative AI, large language model (LLM) operations, and advanced visual computing in data centers. From high-performance inference and training to immersive graphics and ultra-efficient video processing, the L40S delivers universal acceleration across diverse workloads, making it one of the most versatile and capable GPUs on the market today.

Designed for the AI-Powered Future

The NVIDIA L40S isn’t just a graphics card; it’s a fully integrated AI and visual computing platform. As AI applications become more complex and resource-intensive, especially with the rapid adoption of generative AI across industries, the need for specialized hardware capable of handling diverse operations simultaneously is more critical than ever.

The L40S addresses this challenge by providing:

  • End-to-end acceleration for AI workflows: Whether it’s training massive language models, deploying real-time inferencing at scale, or rendering high-quality visuals, the L40S handles it all with remarkable efficiency.
  • Exceptional graphics and video performance: Supporting real-time ray tracing and AV1 video codec, this GPU delivers advanced capabilities for rendering, simulation, content creation, and XR (extended reality) environments.

Core Specifications and Architecture

At the heart of the L40S lies the Ada Lovelace architecture, NVIDIA’s latest and most advanced GPU architecture. Purpose-built for AI and graphics convergence, Ada Lovelace introduces architectural enhancements that significantly elevate both throughput and energy efficiency.

Key Specifications

  • GPU Memory: 48 GB of ultra-fast GDDR6 with ECC (Error-Correcting Code) ensures optimal performance and data integrity, even under the most demanding workloads. This large memory capacity is ideal for handling massive datasets and models without memory bottlenecks.
  • Memory Bandwidth: A blazing 864 GB/s, ensuring data flows swiftly between the GPU cores and memory to maximize real-time performance.
  • CUDA Cores: With 18,176 CUDA cores, the L40S provides massive parallel processing power, suitable for compute-intensive AI and simulation tasks.
  • Tensor Cores: Includes 568 fourth-generation Tensor Cores, purpose-built to accelerate matrix operations essential for AI workloads such as deep learning training and inference.
  • RT Cores: Features 142 third-generation RT Cores that dramatically enhance ray tracing performance, delivering realistic lighting, shadows, and reflections for professional visualization applications.
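
As a rough illustration of what 48 GB of GPU memory allows, the sketch below estimates the weight-only footprint of a model at different precisions. This is a simplified back-of-the-envelope calculation; real deployments also need headroom for activations, KV caches, and framework overhead.

```python
# Rough estimate of model weight footprint vs. the L40S's 48 GB of GDDR6.
# Simplified: ignores activations, KV cache, and framework overhead.

BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "fp8": 1, "int4": 0.5}
L40S_MEMORY_GB = 48

def weight_footprint_gb(num_params: float, dtype: str) -> float:
    """Return the weight-only memory footprint in GB (1 GB = 2**30 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30

def fits_on_l40s(num_params: float, dtype: str) -> bool:
    """True if the weights alone fit within the L40S's 48 GB."""
    return weight_footprint_gb(num_params, dtype) <= L40S_MEMORY_GB

# A 13B-parameter LLM in FP16 needs ~24 GB of weights, leaving room for
# activations; the same model in FP32 (~48.4 GB) would not fit at all.
print(round(weight_footprint_gb(13e9, "fp16"), 1))  # ~24.2
print(fits_on_l40s(13e9, "fp16"))                   # True
```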

Performance Metrics

The L40S GPU delivers breakthrough performance across all major floating point and integer precisions used in modern AI workloads:

  • FP32 (Single Precision): 91.6 TFLOPS
  • TF32 Tensor Core: 366 TFLOPS
  • BFLOAT16 / FP16 Tensor Core: Up to 733 TFLOPS (sparsity enabled)
  • FP8 Tensor Core: Up to 1,466 TFLOPS (sparsity enabled)
  • INT8 / INT4 Peak Tensor Performance: Up to 1,466 TOPS (sparsity enabled)

These figures underscore the L40S’s capability to accelerate both high-precision training and low-precision inferencing with unmatched efficiency.
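
The pattern behind these figures is simple: each step down in precision roughly doubles peak Tensor Core throughput, and structured sparsity doubles it again. The sketch below sanity-checks that relationship, assuming the NVIDIA datasheet's sparsity-enabled peaks (TF32 366 TFLOPS, FP16 733 TFLOPS, FP8 and INT8 1,466 TFLOPS/TOPS); these are theoretical peaks, not sustained throughput.

```python
# Peak Tensor Core throughput on the L40S (sparsity-enabled datasheet
# figures, in TFLOPS / TOPS). Real workloads sustain a fraction of peak.
PEAK_SPARSE = {"tf32": 366, "fp16": 733, "fp8": 1466, "int8": 1466}

def dense_peak(precision: str) -> float:
    """Structured sparsity doubles peak throughput, so dense peak is ~half."""
    return PEAK_SPARSE[precision] / 2

def speedup_vs_fp32(precision: str, fp32_tflops: float = 91.6) -> float:
    """Rough upper-bound speedup from dropping precision (peak ratio only)."""
    return PEAK_SPARSE[precision] / fp32_tflops

print(dense_peak("fp8"))              # 733.0 TFLOPS dense FP8
print(round(speedup_vs_fp32("fp8")))  # ~16x peak over plain FP32
```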

Advanced Video Encoding and Decoding

The NVIDIA L40S is equipped with 3 NVENC (NVIDIA Encoder) and 3 NVDEC (NVIDIA Decoder) engines. These support AV1 encoding and decoding, enabling superior video quality and reduced bandwidth for streaming, conferencing, cloud gaming, and AI video analytics.

  • AV1 Support: A newer, more efficient video codec than H.264/HEVC, AV1 enables better compression at the same image quality, which is vital for data centers focused on real-time streaming and cloud rendering.
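
In practice, the NVENC engines are usually driven through FFmpeg's `av1_nvenc` encoder. The sketch below builds such a command line; it assumes an FFmpeg build with NVENC support and a recent NVIDIA driver, and the bitrate value is an illustrative choice, not an L40S-specific setting.

```python
# Build an FFmpeg command line for hardware AV1 encoding via NVENC.
# Assumes FFmpeg compiled with NVENC support and a recent NVIDIA driver.
def av1_nvenc_cmd(src: str, dst: str, bitrate: str = "4M") -> list:
    return [
        "ffmpeg",
        "-y",                 # overwrite output without asking
        "-i", src,            # input file
        "-c:v", "av1_nvenc",  # AV1 encode on the GPU's NVENC engines
        "-b:v", bitrate,      # target video bitrate
        "-c:a", "copy",       # pass audio through untouched
        dst,
    ]

cmd = av1_nvenc_cmd("input.mp4", "output.mkv")
print(" ".join(cmd))
# Execute with: subprocess.run(cmd, check=True)
```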

High-Performance Form Factor and Efficiency

  • Form Factor: The L40S is housed in a dual-slot design measuring 4.4″ in height and 10.5″ in length, compact enough to be compatible with modern server enclosures and workstations.
  • Thermal Design: The card uses a passive cooling solution, relying on chassis airflow for heat dissipation, which makes it ideal for data center deployment where custom airflow management is in place.
  • Power Consumption: It consumes a maximum of 350 watts and utilizes a 16-pin power connector, delivering high performance without excessive power draw.
  • Secure Boot: Equipped with Secure Boot with Root of Trust, it ensures a trusted execution environment, safeguarding sensitive AI models and enterprise applications.
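
As a quick efficiency illustration, the 350 W board power can be related to peak throughput. The figure used below is the datasheet's FP8 sparse peak (~1,466 TFLOPS), so the result is a theoretical upper bound rather than sustained real-world efficiency.

```python
# Theoretical peak compute efficiency of the L40S at its 350 W limit.
# Uses the datasheet FP8 sparse peak; real workloads sustain only a
# fraction of this.
BOARD_POWER_W = 350
FP8_SPARSE_TFLOPS = 1466

def tflops_per_watt(peak_tflops: float, power_w: float = BOARD_POWER_W) -> float:
    """Peak TFLOPS delivered per watt of board power."""
    return peak_tflops / power_w

print(round(tflops_per_watt(FP8_SPARSE_TFLOPS), 1))  # ~4.2 TFLOPS/W peak
```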

Virtualization and Enterprise Features

The NVIDIA L40S fully supports NVIDIA Virtual GPU (vGPU) software, allowing multiple users or virtual machines (VMs) to share the powerful GPU resources efficiently. This is particularly beneficial for:

  • Cloud desktops for designers, engineers, and developers
  • Virtualized AI workloads
  • Hosted inference and training environments
  • Multi-user simulation platforms
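
As a rough capacity-planning sketch, vGPU deployments divide the card's 48 GB of frame buffer into fixed-size per-VM profiles. The profile sizes below are illustrative assumptions, not an exact list of NVIDIA's supported L40S vGPU profiles.

```python
# Rough vGPU capacity planning: how many VMs fit on one 48 GB L40S at a
# fixed frame-buffer profile size. Profile sizes here are illustrative.
L40S_FRAME_BUFFER_GB = 48

def max_vms(profile_gb: int) -> int:
    """Number of VMs a single L40S can host at a given vGPU profile size."""
    return L40S_FRAME_BUFFER_GB // profile_gb

for profile in (6, 12, 24):
    print(f"{profile} GB profile -> {max_vms(profile)} VMs per GPU")
# 6 GB -> 8 VMs, 12 GB -> 4 VMs, 24 GB -> 2 VMs
```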

Reliability and Compliance

  • NEBS Ready – Level 3: The L40S is qualified for NEBS Level 3, which means it meets strict environmental and reliability standards for telecom and mission-critical deployments.
  • MIG (Multi-Instance GPU) Support: While the L40S does not support MIG, its immense raw power often makes it suitable for dedicated workloads where full GPU access is preferable over segmentation.

Generative AI: Multi-Modal Powerhouse

As the demand for multi-modal generative AI applications increases, the L40S becomes an essential asset for developers and enterprises. Whether you’re building:

  • Audio and speech synthesis systems
  • Text-to-image and text-to-video generators
  • 3D model generators using LLMs and diffusion models
  • Real-time avatar and digital twin simulations

…the L40S can accelerate the entire workflow, from model training and fine-tuning to inference and rendering, on one unified platform.

Who Should Use the NVIDIA L40S?

The L40S is designed for:

  • AI Researchers and Developers building and deploying large language models and generative AI systems.
  • Data Center Architects looking for a universal GPU that can support diverse workloads without the need for multiple specialized cards.
  • Media & Entertainment Studios that require high-end real-time rendering and AI-enhanced visual effects.
  • Enterprises implementing AI-powered analytics, customer service, simulation, and modeling platforms.

Conclusion

The NVIDIA L40S GPU redefines what’s possible with a universal accelerator for data centers. With industry-leading performance across AI, graphics, and video workloads, it’s a game-changer for organizations aiming to unlock the full potential of artificial intelligence and real-time rendering in a single, powerful platform.

Whether you’re scaling a hyperscale AI infrastructure, developing multi-modal applications, or deploying secure virtualized environments, the NVIDIA L40S delivers the performance, reliability, and future-readiness to power your most ambitious projects.

NVIDIA L40S
  • GPU Architecture: Ada Lovelace

  • CUDA Cores: 18,176

  • Tensor Cores: Fourth Generation

  • RT Cores: Third Generation

  • GPU Memory: 48 GB GDDR6 with ECC

  • Memory Interface Width: 384-bit

  • Memory Bandwidth: Up to 864 GB/s

  • Interface: PCI Express Gen 4.0 x16

  • Maximum Power Consumption: 350 Watts

  • Form Factor: Dual-slot, full-height

  • Cooling: Passive (airflow from server)

  • Display Outputs: 4x DisplayPort 1.4a (disabled by default for data center deployment)

  • Virtualization: Supported (NVIDIA vGPU software, including RTX vWS; MIG is not supported)

  • Compute Precision Support: FP32, TF32, FP16, BF16, FP8, INT8, INT4

  • Software Support: NVIDIA CUDA®, cuDNN, TensorRT, RTX™ technology stack


Continue Exploring

 
  • Built for Generative AI and Graphics at Scale

    The NVIDIA L40S is engineered to meet the demanding needs of modern AI workloads and professional graphics. Powered by the Ada Lovelace architecture, it enables large-scale inference, training, and advanced graphics workflows from a single, versatile platform.

  • Massive 48 GB GDDR6 ECC Memory

    With 48 GB of error-correcting GDDR6 memory, the L40S is equipped to handle high-resolution graphics, large AI models, and multi-application workloads with smooth and reliable performance.

  • Fourth-Generation Tensor Cores & Third-Generation RT Cores

    Experience dramatic improvements in AI performance and ray tracing speed with the L40S’s cutting-edge core architecture. It accelerates deep learning, computer vision, and physically accurate rendering like never before.

  • High Performance for Inference and Training

    Delivering up to 1.45 petaflops of tensor performance, the L40S enables fast AI inference and small- to mid-scale training, making it ideal for edge AI, virtual workstations, and data center deployment.

  • PCI Express Gen 4.0 Interface

    The L40S supports PCIe Gen 4.0 for high-speed connectivity and data transfer, maximizing system responsiveness in heavy-load environments.

  • NVIDIA Virtualization Support

    With support for NVIDIA RTX™ Virtual Workstation (vWS), the L40S allows multiple users to access powerful GPU resources simultaneously in a secure and scalable manner.

  • Optimized for Enterprise Data Centers

    Designed for 24/7 operation with enterprise-grade reliability, advanced cooling, and power efficiency, the L40S is ideal for integration in demanding data center infrastructure.
