NVIDIA Datasheets

Table of Contents

GPUs

NVIDIA H100 Tensor Core GPU

The NVIDIA H100 Tensor Core GPU stands out for its high performance in AI and HPC applications, offering remarkable scalability and security. It features cutting-edge fourth-generation Tensor Cores and the Transformer Engine, enhancing training and inference speeds. Integral to NVIDIA's data center platform, it accelerates a wide range of applications. Key specifications include impressive teraFLOPS, extensive memory bandwidth, and efficient thermal design.

  • Scalability and Security: Exceptional for diverse workloads.
  • Advanced Acceleration: With fourth-generation Tensor Cores and Transformer Engine.
  • Wide Application: Accelerates over 3000 applications.
  • Technical Specs: High teraFLOPS, memory bandwidth, and optimized thermal design.

NVIDIA GH200 Grace Hopper Superchip

The NVIDIA GH200 Grace Hopper Superchip, a fusion of the NVIDIA Hopper GPU and Grace CPU, epitomizes next-generation AI and HPC technology. This superchip excels in performance and energy efficiency, thanks to its 72-core Grace CPU and robust NVIDIA H100 Tensor Core GPU. Its advanced architecture supports up to 480GB of LPDDR5X memory with ECC and 624GB of fast-access memory, providing substantial power for data-intensive tasks. Featuring 900GB/s NVLink-C2C coherence, it ensures optimal data movement and application performance, making it ideal for large-scale AI and HPC applications.

  • High Performance CPU: 72-core NVIDIA Grace CPU for efficient computing.
  • Advanced GPU: NVIDIA H100 Tensor Core GPU for AI and HPC.
  • Extensive Memory Support: Up to 480GB LPDDR5X memory with ECC and 624GB fast-access memory.
  • Superior Data Coherence: 900GB/s NVLink-C2C for enhanced performance.
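
The two memory figures above can be related with simple arithmetic. The sketch below assumes, which the summary does not state, that the 624GB of fast-access memory is the sum of the CPU-attached LPDDR5X and the GPU-attached high-bandwidth memory. The variable names are illustrative only.

```python
# Figures quoted in the summary above, in GB.
LPDDR5X_GB = 480        # CPU-attached LPDDR5X with ECC
FAST_ACCESS_GB = 624    # total fast-access memory

# Assumption: fast-access memory = LPDDR5X + GPU memory, so the
# remainder is the implied GPU memory capacity.
implied_gpu_memory_gb = FAST_ACCESS_GB - LPDDR5X_GB

print(f"Implied GPU memory capacity: {implied_gpu_memory_gb} GB")
```

Under that assumption the remainder is 144GB, which would be the GPU's share of the fast-access pool in the largest configuration.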

NVIDIA L40S GPU

The NVIDIA L40S GPU, leveraging the innovative NVIDIA Ada Lovelace Architecture and advanced Tensor Cores, sets a new standard in AI and graphics processing for data centers. It excels in generative AI, model training, inference, 3D graphics, and video applications, offering exceptional versatility and efficiency for various data center workloads. With features like third-generation RT Cores and a Transformer Engine, it's engineered for continuous enterprise-level operations, driving next-level performance in AI and graphics applications.

  • Powerful Performance: Driven by the NVIDIA Ada Lovelace Architecture and fourth-generation Tensor Cores.
  • Diverse Applications: Suitable for generative AI, large language model training, NVIDIA Omniverse™ Enterprise, rendering, and streaming.
  • Advanced Specifications: Equipped with third-generation RT Cores, Transformer Engine, and designed for 24/7 enterprise operations.

CPU

NVIDIA Grace CPU Superchip

The NVIDIA Grace CPU Superchip redefines data center performance, offering unmatched efficiency and throughput. It's tailored for cloud, HPC, and supercomputing with double the performance per watt of conventional platforms.

  • High-Performance Cores: 144 Arm Neoverse V2 Cores with advanced capabilities.
  • Memory Excellence: Supports up to 960GB of LPDDR5X memory, offering up to 1TB/s bandwidth.
  • NVLink-C2C Coherence: Ensures 900 GB/s memory and I/O coherence.
  • Energy Efficiency: 500W TDP, balancing power, bandwidth, and capacity.

Networking

NVIDIA ConnectX-7 400G Adapters

The NVIDIA ConnectX-7 400G Adapters are designed for high-performance computing, providing unparalleled speeds and connectivity. They integrate RDMA, InfiniBand, and Ethernet technologies, ideal for enterprise, AI, and cloud data centers.

  • High-Speed Networking: Supports up to 400Gb/s for maximum data throughput.
  • Versatile Connectivity: Offers RDMA, InfiniBand, and Ethernet capabilities.
  • Ultra-Low Latency: Ensures rapid data transmission and processing.
  • Enhanced Security: Includes advanced hardware-based security features.
  • ASAP² Technology: Accelerates networking for data-driven applications.

NVIDIA Quantum-X800 InfiniBand Platform

The NVIDIA Quantum-X800 InfiniBand platform offers a high-performance solution for the next generation of AI infrastructure, particularly optimized for trillion-parameter GPU computing. It represents a significant advancement in networking technology with the capability to handle intensive AI workloads and high-performance computing (HPC) demands efficiently.

Key Highlights

  • Advanced In-Network Computing: Features cutting-edge technologies such as NVIDIA SHARPv4, MPI tag matching, and programmable cores that significantly boost in-network computing capabilities.
  • Ultra-High Throughput and Low Latency: The Quantum-X800 platform provides 800 Gb/s of end-to-end connectivity with ultra-low latency, essential for training and deploying trillion-parameter-scale AI models.
  • Enhanced Network Resilience and Efficiency: Includes adaptive routing and telemetry-based congestion control to maximize bandwidth and ensure network resilience. The self-healing interconnect enhances the reliability of the InfiniBand network.
  • Industry-Leading Scalability: The platform includes the NVIDIA Quantum-X800 Q3400 switch, which enables 2X faster speeds and 5X higher scalability for AI compute fabrics. A fabric built on Q3400 switches can connect up to 10,368 network interface cards (NICs) with minimal latency.
  • Comprehensive Management and Security Features: Equipped with NVIDIA's Unified Fabric Manager (UFM) platform, which includes a comprehensive suite of management tools to optimize, monitor, and secure the InfiniBand data center fabric.
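
The 10,368-NIC figure can be sanity-checked with standard fat-tree arithmetic. The sketch below assumes a two-level, non-blocking fat tree built from 144-port switches. Neither the topology nor the port count is stated in the summary above, and the function name is hypothetical.

```python
def two_level_fat_tree_endpoints(radix: int) -> int:
    """Maximum endpoints in a non-blocking two-level fat tree.

    Each leaf switch splits its ports evenly: radix // 2 go down to
    endpoints and radix // 2 go up to spine switches. Each spine has
    `radix` ports, so the fabric can hold up to `radix` leaf switches.
    """
    if radix <= 0 or radix % 2:
        raise ValueError("radix must be a positive even number")
    leaf_switches = radix
    endpoints_per_leaf = radix // 2
    return leaf_switches * endpoints_per_leaf


# Assumption: 144 ports per switch (not given in the text above).
print(two_level_fat_tree_endpoints(144))
```

With a radix of 144 the formula gives 144 × 72 = 10,368 endpoints, which matches the quoted figure under these assumptions.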

NVIDIA Solutions

NVIDIA DGX H100 System

The NVIDIA DGX H100 system is engineered to accelerate the pace of AI and HPC innovation, offering unparalleled computational power and efficiency. It integrates the groundbreaking capabilities of the NVIDIA H100 Tensor Core GPU, making it ideal for the most demanding AI research and big data analytics workloads. This advanced system is designed to handle complex AI models and massive datasets with ease, providing researchers and data scientists with the tools they need to push the boundaries of what's possible in AI and HPC applications.

  • High-Performance Computing: Leveraging the NVIDIA H100 Tensor Core GPU, the DGX H100 system delivers exceptional performance for AI training and inference tasks, enabling faster time to insights.
  • Advanced AI Capabilities: With support for the latest AI frameworks and libraries, this system accelerates the development and deployment of AI models, from natural language processing to computer vision.
  • Scalable Architecture: The DGX H100 system's architecture is built for scalability, allowing seamless integration into data center environments for expanded computational capacity.
  • Energy Efficiency: Designed with energy efficiency in mind, the system ensures optimal performance per watt, reducing the total cost of ownership for AI and HPC infrastructure.

NVIDIA HGX H100 System

The NVIDIA HGX H100 is described as the most powerful AI supercomputing platform, designed for AI, simulation, and data analytics. The platform integrates NVIDIA GPUs, NVLink, networking, and an optimized software stack from the NGC catalog to maximize application performance. It supports the growing demands of AI, complex simulations, and large datasets by enabling efficient, accelerated computing across multiple GPUs, helping service providers, researchers, and scientists deliver AI, simulation, and data analytics results quickly.

  • End-to-End Accelerated Computing: Combines H100 Tensor Core GPUs with high-speed interconnects to deliver up to 32 petaFLOPS of computing power, which NVIDIA describes as the world's most powerful accelerated server platform for AI and HPC.
  • Advanced Networking Options: Features speeds up to 400 Gb/s with NVIDIA Quantum-2 InfiniBand and Spectrum-X Ethernet, alongside NVIDIA BlueField-3 DPUs for enhanced networking and security.
  • Exceptional Training Performance and Scalability: Incorporates the Transformer Engine with FP8 precision for up to 4X faster training over the previous generation, alongside infrastructure that supports scalable, efficient GPU clusters.
  • Key Technological Innovations: Includes the Transformer Engine, fourth-generation NVLink, Confidential Computing, Multi-Instance GPU (MIG), and DPX instructions, highlighting its cutting-edge capabilities for demanding AI and HPC tasks.
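
The 32 petaFLOPS headline can be reproduced as a per-GPU peak multiplied by the GPU count. The sketch below assumes an eight-GPU configuration and a per-GPU FP8 peak of 3,958 TFLOPS. Both values are assumptions, since the summary above gives only the aggregate figure.

```python
GPUS_PER_PLATFORM = 8        # assumption: eight-GPU HGX configuration
FP8_TFLOPS_PER_GPU = 3958    # assumption: per-GPU FP8 peak

# Aggregate peak throughput, converted from teraFLOPS to petaFLOPS.
aggregate_pflops = GPUS_PER_PLATFORM * FP8_TFLOPS_PER_GPU / 1000

print(f"Aggregate FP8 peak: {aggregate_pflops:.1f} petaFLOPS")
```

Under these assumptions the product is about 31.7 petaFLOPS, which rounds to the quoted 32. Peak figures of this kind are theoretical maxima, not sustained application throughput.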

NVIDIA H200 Tensor Core GPU Overview

Key Specifications

  • GPU Architecture: NVIDIA Hopper™
  • GPU Memory: 141GB HBM3e, delivering 4.8TB/s bandwidth
  • Performance: 4 petaFLOPS of FP8, showcasing advanced capabilities in generative AI and scientific computing

Performance Enhancements

  • Inference Performance: Delivers up to 2X the inference throughput of the H100 on large language models such as Llama 2 70B
  • HPC Capability: Achieves up to 110X faster time to results in high-performance computing tasks compared with CPU-based systems

Memory and Efficiency

  • Memory Bandwidth: Essential for memory-intensive applications, supporting efficient data manipulation and faster results
  • Energy and Cost Efficiency: Maintains competitive performance within the same power profile as the H100, reducing total cost of ownership

Technical Specifications

  • Floating-Point Performance: Ranges from 34 TFLOPS at FP64 to 3,958 TFLOPS at FP8
  • Form Factor: SXM, with support for up to 7 Multi-Instance GPU (MIG) instances
  • Maximum Power: Configurable up to 700W
  • Interconnect: Features NVIDIA NVLink® at 900GB/s and PCIe Gen5 at 128GB/s
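
For memory-bound workloads, the capacity and bandwidth figures above give a rough lower bound on the time needed to read all of GPU memory once. The sketch below uses decimal units (1TB = 1,000GB) and ignores real-world overheads. It is a back-of-the-envelope estimate, not a benchmark, and the variable names are illustrative.

```python
# Figures quoted in the specifications above.
HBM_CAPACITY_GB = 141
HBM_BANDWIDTH_GB_PER_S = 4800   # 4.8TB/s, decimal units assumed
NVLINK_GB_PER_S = 900
PCIE_GEN5_GB_PER_S = 128

# Time to stream the full GPU memory once at peak bandwidth.
full_sweep_ms = HBM_CAPACITY_GB / HBM_BANDWIDTH_GB_PER_S * 1000

# Ratio of the quoted NVLink rate to the quoted PCIe Gen5 rate.
nvlink_vs_pcie = NVLINK_GB_PER_S / PCIE_GEN5_GB_PER_S

print(f"Full memory sweep: {full_sweep_ms:.1f} ms")
print(f"NVLink vs PCIe Gen5: {nvlink_vs_pcie:.1f}x")
```

At the quoted rates, one full pass over memory takes roughly 29 ms, and NVLink offers about seven times the bandwidth of PCIe Gen5.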

NVIDIA DGX BasePOD for Healthcare and Life Sciences

The NVIDIA DGX BasePOD for Healthcare and Life Sciences focuses on streamlining AI development and deployment in the healthcare sector, emphasizing the importance of a robust infrastructure foundation for AI applications. It highlights three key features:

  1. Prescriptive Architecture: Simplifies design and eliminates complexity, facilitating faster deployment.
  2. Predictable Performance at Scale: Ensures consistent, reliable performance even as demands increase.
  3. Proven Software Stack: Optimized specifically for health and life science applications, ensuring compatibility and efficiency in development workflows.

NVIDIA DGX SuperPOD

The NVIDIA DGX SuperPOD is a comprehensive, turnkey data center solution designed for enterprise AI applications. It integrates leadership-class infrastructure with scalable performance for demanding AI and high-performance computing (HPC) workloads, delivering a full-service experience that accelerates deployment from months to weeks. Key features include:

  1. Optimized Multi-Node Performance: Addresses the challenge of scaling inter-GPU communications for multi-node AI infrastructure, ensuring high levels of performance.
  2. NVIDIA Base Command Integration: Empowers organizations with NVIDIA's software innovations for streamlined AI development and deployment.
  3. Comprehensive Lifecycle Services: Offers full lifecycle professional services from installation to management, optimizing the IT environment for AI workloads.

NVIDIA DGX BasePOD for Energy

The NVIDIA DGX BasePOD for Energy solution overview focuses on enhancing AI development and deployment within the energy sector to support a more sustainable future through accelerated computing. This solution helps energy companies innovate in areas like oil, gas, power, and utilities by leveraging AI and high-performance computing for tasks such as reservoir simulation, seismic processing, and optimizing well planning. Key features include:

  1. Integrated Software and Tools: Utilizes the NVIDIA AI Enterprise software suite for developing and deploying AI applications tailored to energy workloads.
  2. Proven Performance and Scalability: Combines NVIDIA's proven DGX BasePOD architecture with industry-specific optimizations.
  3. Support for Advanced Energy Workloads: Supports complex energy tasks like grid simulation and large language models for enhanced asset management and operational efficiency.

How Can Generative AI Transform the Public Sector?

This document outlines how generative AI can transform the public sector by improving citizen services and mission outcomes through streamlined deployment of foundation models and accelerated infrastructure. Key points include:

  1. Generative AI Adoption: Boosts cost savings and efficiency across public sector use cases such as modernizing humanitarian assistance, improving public health, and powering multilingual chatbots.
  2. Foundation Models and Customization: Allows public sector agencies to start with a foundation model, customize it with proprietary data, and use retrieval-augmented generation (RAG) for secure, informed responses.
  3. Flexible Deployment and Scalability: Supports deployment in private clouds and on-premises infrastructure to protect sensitive data, with NVIDIA AI providing a complete platform for adopting and accelerating generative AI.

The NVIDIA Quantum-X800 InfiniBand Platform

This platform is optimized to support the deployment and training of AI models at the trillion-parameter scale, providing 800 gigabits per second of connectivity with ultra-low latency. Its key features include the integration of innovative In-Network Computing technologies, adaptive routing for maximum bandwidth utilization, and enhancements in network resilience and efficiency. Here are the main highlights:

  • Advanced In-Network Computing: Utilizes NVIDIA SHARP v4 and programmable cores for enhanced computing within the network, accelerating AI and HPC workloads.
  • Ultra-High Connectivity and Scalability: Features the NVIDIA Quantum-X800 Q3400 InfiniBand switch, supporting 2X faster speeds and 5X higher scalability for AI compute fabrics.
  • NVIDIA ConnectX-8 SuperNIC: Delivers 800G connectivity with advanced offload and quality-of-service enhancements, designed for efficient management of generative AI clouds.

NVIDIA DGX B200

The NVIDIA DGX B200 datasheet provides a comprehensive overview of NVIDIA's unified AI platform designed for training, fine-tuning, and inference, catering to the ever-growing needs of AI workloads in businesses. The DGX B200 is powered by eight NVIDIA Blackwell GPUs, boasting 1.4TB of GPU memory and 64TB/s of memory bandwidth, setting a new standard for generative AI performance in enterprise AI workloads. It's designed as a universal AI supercomputer, aimed at accelerating the development-to-deployment pipeline for AI applications across various industries.

Key highlights from the datasheet include:

  • The DGX B200 features eight NVIDIA Blackwell GPUs, offering 1.4TB of GPU memory, which is ideal for handling extensive enterprise AI workloads.
  • It delivers 72 petaFLOPS of training performance and 144 petaFLOPS of inference performance, showcasing its capability to tackle the most demanding AI tasks.
  • The platform includes NVIDIA networking, dual 5th generation Intel® Xeon® Scalable Processors, and is the foundation of NVIDIA DGX BasePOD and NVIDIA DGX SuperPOD, enhancing its versatility and power in AI infrastructure.
  • Comes with NVIDIA AI Enterprise and NVIDIA Base Command™ software, providing a fully optimized hardware and software platform for AI development and deployment.
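
The system-level figures above can be broken down per GPU. The sketch below assumes the totals are split evenly across the eight GPUs and uses decimal units (1TB = 1,000GB). Because the headline numbers are rounded, the per-GPU results are approximate.

```python
# System-level figures quoted in the highlights above.
NUM_GPUS = 8
TOTAL_MEMORY_TB = 1.4
TOTAL_BANDWIDTH_TB_PER_S = 64
TRAINING_PFLOPS = 72
INFERENCE_PFLOPS = 144

# Assumption: resources are divided evenly across all GPUs.
per_gpu = {
    "memory_gb": TOTAL_MEMORY_TB * 1000 / NUM_GPUS,
    "bandwidth_tb_per_s": TOTAL_BANDWIDTH_TB_PER_S / NUM_GPUS,
    "training_pflops": TRAINING_PFLOPS / NUM_GPUS,
    "inference_pflops": INFERENCE_PFLOPS / NUM_GPUS,
}

for name, value in per_gpu.items():
    print(f"{name}: {value:g}")
```

This works out to roughly 175GB of memory, 8TB/s of bandwidth, 9 petaFLOPS of training performance, and 18 petaFLOPS of inference performance per GPU.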

NVIDIA DGX SuperPOD with DGX B200 Systems

The NVIDIA DGX SuperPOD equipped with DGX B200 systems offers an integrated solution for enterprise-scale AI challenges. Designed as a powerful configuration of NVIDIA's cutting-edge technology, the DGX SuperPOD harnesses the strength of DGX B200 systems, featuring eight NVIDIA Blackwell GPUs each. This setup delivers an extraordinary 1.4TB of GPU memory per system and 64TB/s of memory bandwidth, ensuring unparalleled generative AI performance across extensive enterprise AI workloads.

Key Highlights

  • Extensive GPU Memory and Bandwidth: Each DGX B200 system within the SuperPOD offers 1.4TB of GPU memory and 64TB/s memory bandwidth, ideal for the most demanding AI applications.
  • High-Performance Computing: Each DGX B200 system in the SuperPOD delivers 72 petaFLOPS of training performance and 144 petaFLOPS of inference performance, giving the cluster the capacity to tackle complex AI tasks.
  • Integrated NVIDIA Technology: Includes advanced NVIDIA networking and is powered by dual 5th generation Intel® Xeon® Scalable Processors. The SuperPOD configuration serves as a foundational component for large-scale AI infrastructure.
  • Optimized Software Platform: Comes equipped with NVIDIA AI Enterprise and NVIDIA Base Command™ software, providing a fully optimized hardware and software platform that enhances AI development and deployment at scale.

The NVIDIA DGX SuperPOD with DGX B200 systems is designed to streamline and accelerate the AI pipeline from development to deployment, making it an essential asset for industries looking to leverage high-performance AI computing to drive innovation and efficiency.

NVIDIA DGX Platform

Unprecedented Performance, Predictable Cost

The DGX platform stands out for its energy-efficient accelerated computing that adapts to your business model. Since its inception in 2016, the DGX has set the standard for AI performance, achieving numerous records in supercomputer performance and energy efficiency. The platform offers a seamless experience across clouds and data centers, blending integrated development and workflow management software into a consolidated interface. This ensures a predictable cost model for AI development infrastructure, regardless of the deployment model.

Comprehensive AI Platform

  • NVIDIA DGX Cloud: Designed for enterprise developers, this serverless AI platform offers a full-stack solution that maximizes efficiency and capacity for generative AI, supported by NVIDIA's latest technologies and direct access to AI experts.
  • NVIDIA Base Command: This software powers the DGX platform, enhancing enterprise AI with robust orchestration, cluster management, and optimized libraries for compute, storage, and networking.
  • DGX Software Stack: A fully supported suite that includes DGX OS Extensions for Linux, cluster management tools, and job scheduling & orchestration capabilities to streamline AI development and ensure high system reliability.

AI Hardware and Infrastructure

  • NVIDIA DGX SuperPOD: A leader in AI data center infrastructure, delivering scalable performance for complex AI workloads with proven results.
  • NVIDIA DGX BasePOD: Provides the foundation for business transformation and the development of AI applications across industries.
  • NVIDIA DGX B200: A unified platform for training, fine-tuning, and inference, offering significant performance improvements over previous generations.
  • NVIDIA DGX H100: Empowers enterprises to push the boundaries of business innovation with the groundbreaking NVIDIA H100 Tensor Core GPU.

NVIDIA Quantum-X800 InfiniBand Platform

The document provides a detailed overview of the NVIDIA Quantum-X800 InfiniBand Platform, designed for high-performance AI workloads and data center innovations. This next-generation platform features cutting-edge networking capabilities and hardware advancements to support large-scale AI model training and deployment. Here are some key highlights:

  • Ultra-high Speeds: Offers 800 Gb/s end-to-end connectivity and ultra-low latency, essential for training and deploying AI models at the trillion-parameter scale.
  • Advanced In-Network Computing: Incorporates NVIDIA SHARP™ v4, adaptive routing, and telemetry-based congestion control to maximize performance and efficiency.
  • Robust Network Components: Includes the NVIDIA ConnectX-8 SuperNIC and Q3400 InfiniBand switch, which provide enhanced offload capabilities and support up to 10,368 network interface cards.
  • Enhanced Data Center Management: Features the Unified Fabric Manager (UFM) platform for effective network setup, management, and proactive diagnostics in InfiniBand data centers.