ClusterMax™ SuperG

GPU Accelerated Supercomputing / AI / Deep Learning Cluster Solution

AMAX’s ClusterMax™ SuperG GPU computing clusters are powered by the NVIDIA® Tesla™ GPU computing platform and built around the NVIDIA Tesla V100 GPU computing accelerator, the world’s fastest, most advanced, and most efficient data center GPU ever built.

Our cluster solutions are designed to boost throughput and save money for HPC and hyperscale data centers: a single Tesla V100 GPU delivers the performance of up to 100 CPUs, enabling data scientists, researchers, and engineers to tackle challenges that were once thought impossible.

Key features of Tesla V100:

  • Volta Architecture - By pairing CUDA Cores and Tensor Cores within a unified architecture, a single server with Tesla V100 GPUs can replace hundreds of commodity CPU servers for traditional HPC and Deep Learning.
  • Tensor Cores - Equipped with 640 Tensor Cores, Tesla V100 delivers 120 TeraFLOPS of deep learning performance: 12X the Tensor FLOPS for DL training and 6X the Tensor FLOPS for DL inference compared to NVIDIA Pascal™ GPUs (see the WMMA programming sketch after this list).
  • Next Generation NVLink - NVIDIA NVLink in Tesla V100 delivers 2X higher throughput compared to the previous generation. Up to eight Tesla V100 accelerators can be interconnected at up to 300 GB/s to unleash the highest application performance possible on a single server.
  • Maximum Efficiency Mode - The new maximum efficiency mode allows data centers to achieve up to 40% higher compute capacity per rack within the existing power budget. In this mode, Tesla V100 runs at peak processing efficiency, providing up to 80% of the performance at half the power consumption.
  • HBM2 - With a combination of improved raw bandwidth of 900 GB/s and higher DRAM utilization efficiency at 95%, Tesla V100 delivers 1.5X higher memory bandwidth over Pascal GPUs as measured on STREAM.
  • Programmability - Tesla V100 is architected from the ground up to simplify programmability. Its new independent thread scheduling enables finer-grain synchronization and improves GPU utilization by sharing resources among small jobs.
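
For developers, the Tensor Cores described above are exposed through the CUDA 9.x WMMA (warp matrix multiply-accumulate) API. The following is a minimal sketch, not part of the ClusterMax software stack: one warp multiplies a single 16x16x16 FP16 tile with FP32 accumulation; the kernel name and fixed tile size are illustrative.

```cuda
#include <mma.h>
#include <cuda_fp16.h>

using namespace nvcuda;

// One warp computes D = A * B for a single 16x16x16 FP16 tile,
// accumulating in FP32 on the Tensor Cores.
__global__ void wmma_tile_gemm(const half *a, const half *b, float *d) {
    wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> a_frag;
    wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::col_major> b_frag;
    wmma::fragment<wmma::accumulator, 16, 16, 16, float> acc_frag;

    wmma::fill_fragment(acc_frag, 0.0f);                 // start from C = 0
    wmma::load_matrix_sync(a_frag, a, 16);               // leading dimension 16
    wmma::load_matrix_sync(b_frag, b, 16);
    wmma::mma_sync(acc_frag, a_frag, b_frag, acc_frag);  // Tensor Core MAC
    wmma::store_matrix_sync(d, acc_frag, 16, wmma::mem_row_major);
}
```

Compiled with nvcc -arch=sm_70 and launched with a single warp (e.g. wmma_tile_gemm<<<1, 32>>>(d_a, d_b, d_d)), each mma_sync call maps onto Volta Tensor Core instructions.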

Specifications

  • Delivers up to 92,160 Tensor Cores, 737,280 CUDA Cores, 1,008+ Teraflops DP, 2,016+ Teraflops SP, and 16,128+ Teraflops Tensor performance per 42U cluster
  • Up to 4,608GB dedicated HBM2 GPU memory
  • Supports dual-socket Intel® Xeon® Processor Scalable Family CPUs on host systems
  • Supports FDR/EDR InfiniBand fabric and real-time InfiniBand diagnostics
  • Faster communication over InfiniBand using NVIDIA® GPUDirect™ RDMA technology
  • Cluster management and GPU monitoring software, including GPU temperature, fan speed, and power monitoring, providing exclusive access to GPUs in a cluster (see the NVML monitoring sketch after this list)
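
As a sketch of the interfaces such monitoring builds on: NVIDIA’s NVML library, shipped with the driver, exposes per-GPU temperature, fan speed, and power. The program below is illustrative rather than AMAX’s actual management software; link it against -lnvidia-ml.

```c
#include <stdio.h>
#include <nvml.h>

int main(void) {
    unsigned int count, temp, fan, mw;

    if (nvmlInit() != NVML_SUCCESS) return 1;
    nvmlDeviceGetCount(&count);

    for (unsigned int i = 0; i < count; ++i) {
        nvmlDevice_t dev;
        nvmlDeviceGetHandleByIndex(i, &dev);
        nvmlDeviceGetTemperature(dev, NVML_TEMPERATURE_GPU, &temp);
        nvmlDeviceGetPowerUsage(dev, &mw);   /* reported in milliwatts */
        /* Passively cooled SXM2 boards report no fan; treat that as 0. */
        if (nvmlDeviceGetFanSpeed(dev, &fan) != NVML_SUCCESS) fan = 0;
        printf("GPU %u: %u C, fan %u%%, %.1f W\n", i, temp, fan, mw / 1000.0);
    }

    nvmlShutdown();
    return 0;
}
```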

Complete Cluster Assembly and Set Up Services:

  • Fully integrated and pre-packaged turnkey HPC solution, including HPC professional services and support, expert installation and setup of rack-optimized cluster nodes, cabling, rails, and other peripherals
  • Configuration of cluster nodes and the network
  • Installation of applications and client computers to offer a comprehensive solution for your IT needs
  • Rapid deployment
  • Server management options include standards-based IPMI or AMAX remote server management
  • Seamless standard and custom application integration and cluster installation
  • Cluster management options include a choice of commercial and open source software solutions
  • Supports a variety of UPS and PDU configurations and interconnect options, including InfiniBand (FDR, EDR), Fibre Channel, and Ethernet (Gigabit, 10GbE, 25GbE, 40GbE, 100GbE)
  • Energy-efficient cluster cabinets, high-performance UPS units, and power distribution units, installed and set up alongside the rack-optimized nodes, cabling, rails, and other peripherals

Rack Level Verification

  • Performance and Benchmark Testing (HPL)
  • ATA rack level stress test
  • Rack Level Serviceability
  • Ease of Deployment Review
  • MPI jobs over IB for HPC (see the CUDA-aware MPI sketch after this list)
  • GPU stress test using CUDA
  • Cluster management
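
As referenced above, a minimal CUDA-aware MPI program is a convenient way to exercise the IB fabric end to end. The sketch below assumes an MPI library built with CUDA support (for example, Open MPI configured with CUDA); the message size and two-rank layout are illustrative.

```cpp
#include <mpi.h>
#include <cuda_runtime.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int n = 1 << 20;                // 1M floats per message
    float *buf;
    cudaMalloc(&buf, n * sizeof(float));  // buffer lives in GPU memory

    // With a CUDA-aware MPI, device pointers go straight to MPI; over the
    // IB fabric this path can use GPUDirect RDMA, skipping host staging.
    if (rank == 0)
        MPI_Send(buf, n, MPI_FLOAT, 1, 0, MPI_COMM_WORLD);
    else if (rank == 1)
        MPI_Recv(buf, n, MPI_FLOAT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    cudaFree(buf);
    MPI_Finalize();
    return 0;
}
```

Run across two nodes (e.g. mpirun -np 2 -H node1,node2 ./a.out), the transfer traverses the InfiniBand fabric and, with GPUDirect RDMA enabled, bypasses host-memory staging.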

Large Scale Rack Deployment Review

  • Scalability Process
  • Rack to Rack Connectivity
  • Multi-Cluster Testing
  • Software/Application Load

Optional Cluster System Software Installed:

  • Microsoft Windows Server 2016
  • Bright Computing Cluster Manager
  • SUSE Linux Enterprise Server / Red Hat Enterprise Linux
  • C-based software development tools, the CUDA 9.x Toolkit and SDK, and various libraries for CPU/GPU clusters (see the cuBLAS sketch below)
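
In practice, most applications reach the Tensor Cores through these libraries rather than hand-written kernels. A minimal sketch, assuming the CUDA 9.x toolkit’s cuBLAS: enabling CUBLAS_TENSOR_OP_MATH lets cublasGemmEx select Tensor Core kernels for FP16 inputs with FP32 accumulation. The matrix size and uninitialized buffers are illustrative only.

```cpp
#include <cublas_v2.h>
#include <cuda_fp16.h>
#include <cuda_runtime.h>

int main(void) {
    const int n = 1024;                   // square GEMM; size is illustrative
    half *A, *B;
    float *C;
    cudaMalloc(&A, n * n * sizeof(half));
    cudaMalloc(&B, n * n * sizeof(half));
    cudaMalloc(&C, n * n * sizeof(float));

    cublasHandle_t h;
    cublasCreate(&h);
    cublasSetMathMode(h, CUBLAS_TENSOR_OP_MATH);  // opt in to Tensor Core kernels

    // FP16 inputs, FP32 accumulation: C = 1.0 * A * B + 0.0 * C
    const float alpha = 1.0f, beta = 0.0f;
    cublasGemmEx(h, CUBLAS_OP_N, CUBLAS_OP_N, n, n, n,
                 &alpha, A, CUDA_R_16F, n,
                         B, CUDA_R_16F, n,
                 &beta,  C, CUDA_R_32F, n,
                 CUDA_R_32F, CUBLAS_GEMM_DEFAULT_TENSOR_OP);

    cublasDestroy(h);
    cudaFree(A); cudaFree(B); cudaFree(C);
    return 0;
}
```

Build with nvcc -arch=sm_70 -lcublas. Using the library keeps application code portable across GPU generations while still profiting from Tensor Cores.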

GPU Software Development Tools

  • Optional C-based software development tools and various libraries for GPUs
  • CUDA compatible clustering software
  • Deep learning software

Model Configurations

All four configurations share the same node building blocks: each GPU compute node and the 1U master node support 2x Intel® Xeon® Processor Scalable Family CPUs and up to 512GB DDR4 2666/2400/2133 MHz ECC registered memory; each GPU node carries 4x Tesla V100 (32GB) GPUs and up to 4x hot-swap 2.5" SATA/SSD drives; the master node provides 2x 2.5" hot-swap plus 2x 2.5" internal drives; all nodes interconnect over InfiniBand.

| | ClusterMax™ SuperG-V100.14U-4 | ClusterMax™ SuperG-V100.24U-8 | ClusterMax™ SuperG-V100.42U-16 | ClusterMax™ SuperG-V100.42U-36 |
| --- | --- | --- | --- | --- |
| GPU Compute Nodes | 4x 1U | 8x 1U | 16x 1U | 36x 1U |
| GPUs per Cluster | 16 | 32 | 64 | 144 |
| GPU Memory (32GB per GPU) | 512GB | 1,024GB | 2,048GB | 4,608GB |
| Tensor Cores | 10,240 | 20,480 | 40,960 | 92,160 |
| CUDA Cores | 81,920 | 163,840 | 327,680 | 737,280 |
| Double Precision Performance | 112+ Teraflops | 224+ Teraflops | 448+ Teraflops | 1,008+ Teraflops |
| Single Precision Performance | 224+ Teraflops | 448+ Teraflops | 896+ Teraflops | 2,016+ Teraflops |
| Tensor Performance | 1,792+ Teraflops | 3,584+ Teraflops | 7,168+ Teraflops | 16,128+ Teraflops |
| Rack Cabinet | 14U | 24U | 42U | 42U |
| Network Switches | 1x 24-port Gigabit Ethernet, 1x 18-port FDR/EDR InfiniBand | 1x 24-port Gigabit Ethernet, 1x 18-port FDR/EDR InfiniBand | 1x 24-port Gigabit Ethernet, 1x 18-port FDR/EDR InfiniBand | 1x 48-port Layer 2 Gigabit Ethernet, 1x 36-port FDR/EDR InfiniBand |

Reference #: Q706943 + Q706940 + Q706944, Q706945 + Q706940 + Q706946, Q706947 + Q706940 + Q706948

