Take remote productivity to the next level with the NVIDIA® A16, a GPU purpose-built for high-density, graphics-rich Virtual Desktop Infrastructure (VDI). Built on the NVIDIA Ampere architecture, the A16 supports up to 64 concurrent users per board in a compact dual-slot design, doubling the user density of its predecessor.
When paired with NVIDIA Virtual PC (vPC) software, the A16 offers the power and responsiveness needed to handle demanding workloads from anywhere. Its architecture is tuned to meet the evolving demands of hybrid work, offering more than double the encoder throughput of the previous-generation NVIDIA M10 and support for diverse user profiles, all on a single board.
| Specification | Value |
| --- | --- |
| CUDA Cores | 5120 (4 × 1280) |
| Tensor Cores | 160 (4 × 40) |
| RT Cores | 40 (4 × 10) |
| GPU Memory | 64 GB GDDR6 with ECC (16 GB per GPU) |
| Memory Bandwidth | 4 × 232 GB/s |
| Max Power Consumption | 250 W |
| Interconnect | PCI Express Gen 4.0 x16 |
| Cooling | Passive |
| vGPU Software Support | NVIDIA Virtual PC (vPC), NVIDIA Virtual Applications (vApps), NVIDIA RTX Virtual Workstation (vWS), NVIDIA Virtual Compute Server (vCS) |
| vGPU Profiles | Refer to the Virtual GPU Licensing Guide |
| NVENC / NVDEC | 4 encoders / 8 decoders (AV1 decode supported) |
| Secure Boot | Yes (hardware root of trust) |
| NEBS Compliance | Level 3 |
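The board-level totals in the specifications above follow from the per-GPU figures multiplied by the four GPUs on the board. A minimal Python sketch of that arithmetic is shown here; the dictionary keys and the `board_totals` helper are names chosen only for illustration.

```python
# Per-GPU figures taken from the specifications above.
GPUS_PER_BOARD = 4
PER_GPU = {
    "cuda_cores": 1280,
    "tensor_cores": 40,
    "rt_cores": 10,
    "memory_gb": 16,
}


def board_totals(per_gpu: dict, gpus: int = GPUS_PER_BOARD) -> dict:
    """Multiply each per-GPU figure by the number of GPUs on the board."""
    return {name: value * gpus for name, value in per_gpu.items()}


totals = board_totals(PER_GPU)
for name, value in totals.items():
    print(f"{name}: {value}")
```

Note that each GPU addresses only its own 16 GB of memory, so the 64 GB total is an aggregate across four GPUs rather than a single shared pool.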
The NVIDIA A16 is designed with a unique quad-GPU board architecture, allowing it to maximize user density within a single server chassis. By combining this hardware design with NVIDIA Virtual PC (vPC) software, organizations can deliver full-featured virtual desktops complete with GPU acceleration to dozens of users simultaneously. This not only optimizes server space and energy use but also reduces operational costs while maintaining high performance.
Unlike CPU-only VDI solutions, the A16 leverages dedicated GPU resources to offer a responsive, high-frame-rate experience that feels virtually indistinguishable from a physical PC. Users benefit from smooth 2D/3D application performance, faster load times, and reduced input latency. This is especially critical for professionals who use graphically demanding tools such as CAD software, content creation platforms, and video conferencing applications.
The A16 doubles the user density compared to previous-generation GPUs like the NVIDIA M10. With support for up to 64 concurrent users per board, it enables IT teams to serve more virtual desktops per server, improving scalability and reducing total cost of ownership (TCO). This makes it an ideal solution for large enterprises, educational institutions, and remote workforces that require efficient and cost-effective VDI deployments.
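The 64-user figure lines up with the memory layout: 64 GB across four GPUs leaves 1 GB of framebuffer per user at full density. The sketch below works through that arithmetic. It assumes each user gets a fixed, equal slice of one GPU's memory and the slice sizes are illustrative; actual vGPU profile sizes and per-GPU limits are defined in NVIDIA's vGPU documentation.

```python
BOARD_MEMORY_GB = 64   # total GDDR6 on the board
GPUS_PER_BOARD = 4     # quad-GPU design
MEMORY_PER_GPU_GB = BOARD_MEMORY_GB // GPUS_PER_BOARD  # 16 GB per GPU


def users_per_board(framebuffer_gb_per_user: int) -> int:
    """Users that fit if each gets a fixed, equal framebuffer slice.

    Assumption: a user's slice lives entirely on one GPU, so we count
    whole slices per GPU and multiply by the number of GPUs.
    """
    if framebuffer_gb_per_user <= 0:
        raise ValueError("framebuffer size must be positive")
    slices_per_gpu = MEMORY_PER_GPU_GB // framebuffer_gb_per_user
    return slices_per_gpu * GPUS_PER_BOARD


for size_gb in (1, 2, 4, 8, 16):
    print(f"{size_gb} GB per user -> {users_per_board(size_gb)} users per board")
```

Under these assumptions, larger per-user slices trade density for headroom: moving from 1 GB to 2 GB per user halves the number of desktops a board can host.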
Modern professionals often rely on multi-monitor setups for productivity. The A16 supports up to two 4K displays or one 5K monitor per user, providing crystal-clear visuals and ample screen real estate. This makes it perfect for use cases like financial trading, engineering, video editing, and medical imaging where resolution and clarity are critical for decision-making and workflow efficiency.
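To give a sense of scale for these display configurations, the following sketch compares their total pixel counts. It assumes "4K" means 3840×2160 (UHD) and "5K" means 5120×2880; the source does not state exact resolutions, and the helper name is hypothetical.

```python
# Assumed resolutions: "4K" as 3840x2160 (UHD) and "5K" as 5120x2880.
RESOLUTIONS = {"4K": (3840, 2160), "5K": (5120, 2880)}


def total_pixels(layout: dict) -> int:
    """Total pixels for a layout given as {resolution_name: display_count}."""
    return sum(
        RESOLUTIONS[name][0] * RESOLUTIONS[name][1] * count
        for name, count in layout.items()
    )


dual_4k = total_pixels({"4K": 2})
single_5k = total_pixels({"5K": 1})
print(f"two 4K displays: {dual_4k:,} pixels")
print(f"one 5K display:  {single_5k:,} pixels")
```

With these assumed resolutions the two layouts are of similar size, which is consistent with them being offered as alternatives for a single user.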
With more than double the encoding throughput of its predecessor (the M10), the A16 delivers exceptional multi-user performance for streaming video, conferencing, and multimedia applications. Whether users are sharing screens, watching videos, or attending virtual meetings, the A16 ensures minimal buffering, reduced compression artifacts, and lower CPU utilization, freeing up system resources for other tasks.
The A16 supports modern video codecs, including H.265 (HEVC) for efficient high-quality streaming, VP9 for web-based video, and AV1 decode, which is seeing growing adoption for 4K and 8K streaming. These codecs help video playback and conferencing run efficiently even on low-bandwidth connections without compromising image quality.
The A16 uses a PCI Express Gen 4.0 x16 interface, which doubles the bandwidth of PCIe Gen 3.0. This high-speed interconnect speeds up communication between the GPU and CPU/memory, making it well suited to data-heavy VDI workloads such as 3D rendering, AI-assisted tools, and real-time simulation.
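The doubling can be checked from the per-lane signalling rates: PCIe Gen 3 runs at 8 GT/s and Gen 4 at 16 GT/s, both with 128b/130b encoding. The sketch below computes the theoretical one-direction bandwidth of an x16 link. It ignores protocol overhead, so achievable throughput is lower in practice, and the function name is illustrative.

```python
# Per-lane signalling rate (GT/s) and line-encoding efficiency per PCIe generation.
PCIE_GENERATIONS = {
    3: {"gt_per_s": 8.0, "encoding": 128 / 130},
    4: {"gt_per_s": 16.0, "encoding": 128 / 130},
}


def link_bandwidth_gb_s(generation: int, lanes: int = 16) -> float:
    """Theoretical one-direction bandwidth in GB/s, before protocol overhead."""
    gen = PCIE_GENERATIONS[generation]
    bits_per_s_per_lane = gen["gt_per_s"] * 1e9 * gen["encoding"]
    return bits_per_s_per_lane * lanes / 8 / 1e9


gen3 = link_bandwidth_gb_s(3)
gen4 = link_bandwidth_gb_s(4)
print(f"PCIe Gen 3 x16: {gen3:.2f} GB/s per direction")
print(f"PCIe Gen 4 x16: {gen4:.2f} GB/s per direction")
print(f"ratio: {gen4 / gen3:.1f}x")
```

This works out to roughly 15.75 GB/s for Gen 3 and 31.5 GB/s for Gen 4 in each direction. The link is shared by all four GPUs on the board.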
As part of the Ampere family, the A16 integrates 3rd-generation Tensor Cores, 2nd-generation RT Cores, and highly efficient CUDA Cores, enabling the flexibility to run both graphics-intensive applications and compute workloads. Whether you’re deploying NVIDIA RTX Virtual Workstation (vWS) for design professionals or Virtual Compute Server (vCS) for data processing, the A16 adapts dynamically to your infrastructure needs.
Built to meet the highest standards of uptime, stability, and performance in mission-critical environments
The NVIDIA A16 has been engineered from the ground up to meet the rigorous demands of enterprise data centers and cloud infrastructure, where continuous 24/7 operation is non-negotiable. Its reliability goes beyond basic hardware stability: it is a full-stack solution designed to deliver long-term performance, compatibility, and maintainability.
The A16 is built using data center-class components that are tested for durability, thermal stability, and power efficiency. Unlike consumer GPUs, the A16 is intended to operate under sustained workloads in dense server environments, maintaining optimal performance over extended periods without thermal throttling or performance degradation.
With a maximum power draw of 250 W across four GPUs, the A16 delivers strong performance per watt. This makes it well suited to organizations looking to optimize energy usage and reduce operational costs in large-scale VDI deployments. Passive cooling also allows for dense configurations that rely on server chassis airflow rather than on-board fans, removing a moving part from the card and improving overall reliability.
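As a rough illustration of the power budget, the sketch below splits the 250 W maximum evenly across the four GPUs and across the 64-user maximum. It covers GPU board power only and excludes the host server's CPU, memory, and cooling; the helper name is hypothetical.

```python
MAX_BOARD_POWER_W = 250
GPUS_PER_BOARD = 4
MAX_USERS_PER_BOARD = 64


def power_budget(board_power_w: float, gpus: int, users: int) -> dict:
    """Worst-case board power split evenly per GPU and per user."""
    return {
        "watts_per_gpu": board_power_w / gpus,
        "watts_per_user": board_power_w / users,
    }


budget = power_budget(MAX_BOARD_POWER_W, GPUS_PER_BOARD, MAX_USERS_PER_BOARD)
print(f"per GPU:  {budget['watts_per_gpu']:.1f} W")
print(f"per user: {budget['watts_per_user']:.2f} W")
```

At full density this comes to 62.5 W per GPU and just under 4 W of GPU power per user.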
Each A16 unit is rigorously tested and validated by NVIDIA and its OEM partners (such as Dell, HPE, Lenovo, and Supermicro) across a variety of server configurations and real-world use cases. This ensures compatibility with leading virtualization platforms such as VMware vSphere, Citrix XenServer, and Microsoft Hyper-V, reducing the risk of integration issues and support downtime.
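The A16 supports a wide range of virtual GPU (vGPU) software stacks, including NVIDIA Virtual PC (vPC), NVIDIA Virtual Applications (vApps), NVIDIA RTX Virtual Workstation (vWS), and NVIDIA Virtual Compute Server (vCS).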
This flexibility allows IT administrators to dynamically allocate GPU resources based on user needs, all within a stable and scalable infrastructure.
NVIDIA maintains close partnerships with ISVs (Independent Software Vendors) so that the A16 is certified and optimized for leading professional applications from vendors such as Autodesk, Adobe, Siemens, and Dassault Systèmes. It also supports the latest APIs.
This helps ensure a consistent, reliable experience, even in complex virtualized environments.
The A16 supports secure boot features at the hardware level, which ensures that only authenticated firmware and software are executed on the device. This is especially critical in enterprise and government environments, where data integrity and system security are top priorities.
In summary, the NVIDIA A16 is not just a GPU; it is a scalable, secure, and future-ready platform built for enterprise IT infrastructures that demand maximum uptime, long lifecycle support, and complete confidence in performance and compatibility.
Purpose‑Built for High‑Density VDI
Quad-GPU board design based on the Ampere architecture, optimized for graphics-rich virtual desktop infrastructure and supporting up to 64 users per dual-slot card.
Accelerated Virtual PC & Workstation Experience
When paired with NVIDIA vPC or RTX vWS software, delivers responsive, native-like performance for productivity apps, CAD, and video editing.
Superior Media Encoding & Decoding
Features 4 NVENC encoders and 8 NVDEC decoders (including AV1 decode support), delivering more than double the encoder throughput of the previous-generation M10. Ideal for streaming, conferencing, and multimedia workflows.
High-Resolution Multi-Monitor Support
Drives multiple high-resolution displays, up to two 4K or a single 5K per user, perfect for professional visualization and productivity.
Performance & Efficiency with Ampere Cores
Built on the Ampere architecture with CUDA Cores, 2nd-generation RT Cores, and 3rd-generation Tensor Cores, enabling graphics workloads and compute acceleration alongside VDI tasks.
PCIe Gen 4 Connectivity
Utilizes PCI Express Gen 4 x16 interface for high-bandwidth, low-latency data transfer between CPU memory and GPUs.
Enterprise-Grade, Passive Dual-Slot Design
Features a dual-slot, full-height/full-length, passively cooled form factor, draws up to 250 W, and is NEBS Level 3 ready for reliable 24/7 data center deployment.
Secure and Scalable Virtualization
Supports hardware root-of-trust secure boot and a broad NVIDIA virtualization software stack — including vPC, vApps, vWS, vCS, and AI Enterprise — enabling scalable, secure multi-tenant environments.
Outstanding Density and Cost Efficiency
Doubles user density over prior-gen cards, offering exceptional performance-per-dollar and lower TCO for enterprise virtual desktop deployments.
All rights reserved for Q9 technologies.