Hyperscale GPU-Accelerated AI Cloud Solution

Key Attributes of HGX-1 Architecture

  • Mainstream compute disaggregated from accelerated compute to provide flexibility of choice
  • Compute requirements: CPUs, GPGPUs, ASICs
  • Bandwidth optimized (GPU-to-Host and GPU-to-GPU peer-to-peer)
  • High Host-to-GPU bandwidth (oversubscription ratio can be 1:1)
  • Designed for scale-up, scale-out based on a Universal Building Block (UBB)
  • Optimized for highly parallel workloads (scale-up within one chassis and across up to four chassis; scale-out in clusters of racks totaling ~1000 GPGPUs)
  • Reduced network bandwidth bottlenecks via a private PCIe Switching Fabric (GPUDirect, RDMA, GPU-to-IB per Switch)
  • A distributed PCIe Switch Fabric: four six-port Switches per Chassis (sixteen 96-Lane PCIe Switches in total in a four-Chassis configuration)
  • CPU Root Complex as an extension of the PCIe Switch Fabric (not traversing the inter-CPU link)
  • No bandwidth oversubscription for peer-to-peer (reduced blocking, reduced hop count for inter- and intra-chassis traffic)
  • Multi-tenant support via Virtualization and Physicalization (rightsizing hardware elements for the needs of Virtual Machines)
  • Hardware Isolation through Switch Fabric
  • Networks: front-end (Ethernet datacenter network connecting to public domains to bring in data and deliver results); back-end (performance domain using InfiniBand, PCIe Switch Fabric, and NVLink)
  • Project Olympus Server as the baseline to benefit from datacenter-readiness: Node Manager, Rack Manager, balanced 3-phase dual-feed Power
  • General-purpose slots: InfiniBand, SmartNIC, M.2 Farm, HBAs, etc.
  • Multi-Chassis interconnect: low-latency PCIe Switch Fabric to scale up to 32 GPGPUs
  • Mezzanine-to-PCIe Connector to allow other use cases: different Mezzanine, Host, GPGPU
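
The switch and GPGPU counts quoted in the list follow from simple arithmetic, which the minimal Python sketch below reproduces. The class and function names (FabricConfig, oversubscription) are hypothetical and not part of any HGX-1 tooling. The even split of lanes across ports, the even spread of GPGPUs across chassis, and the two-by-two port allocation in the last line are illustrative assumptions rather than documented values.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class FabricConfig:
    """Figures quoted in the attribute list above."""
    chassis: int = 4               # scale-up spans up to four chassis
    switches_per_chassis: int = 4  # four switches per chassis
    ports_per_switch: int = 6      # six-port switches
    lanes_per_switch: int = 96     # 96-lane switches
    max_gpus: int = 32             # multi-chassis scale-up limit

    @property
    def total_switches(self) -> int:
        return self.chassis * self.switches_per_chassis

    @property
    def lanes_per_port(self) -> int:
        # Assumption: lanes are split evenly across ports.
        return self.lanes_per_switch // self.ports_per_switch

    @property
    def gpus_per_chassis(self) -> int:
        # Assumption: GPGPUs are spread evenly across chassis.
        return self.max_gpus // self.chassis


def oversubscription(gpu_ports: int, host_ports: int, lanes_per_port: int) -> Fraction:
    """Ratio of GPU-facing lanes to host-facing lanes on one switch."""
    return Fraction(gpu_ports * lanes_per_port, host_ports * lanes_per_port)


cfg = FabricConfig()
print(cfg.total_switches)    # 16
print(cfg.lanes_per_port)    # 16
print(cfg.gpus_per_chassis)  # 8
# Hypothetical port split: two GPU-facing and two host-facing ports.
print(oversubscription(2, 2, cfg.lanes_per_port))  # 1
```

A ratio of 1 corresponds to the 1:1 Host-to-GPU case mentioned in the list; any allocation with more GPU-facing than host-facing lanes yields a ratio above 1.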


  • Cloud
  • Artificial Intelligence
  • Deep Learning
  • Data Analytics
  • Satellite Imaging
  • Oil and Gas
  • Climate Modeling
  • Computing Physics