AceleMax DGS-428AS

4U Dual AMD EPYC Processor 8x NVIDIA A100 SXM4 GPU Server

  • High-density 4U system with eight NVIDIA HGX A100 SXM4 GPUs
  • High-bandwidth GPU peer-to-peer communication via NVIDIA NVLink
  • Lower latency with direct-attached GPUs and PCIe 4.0 support
  • Eight NICs for GPU Direct Attach RDMA and four NVMe drives for GPU Direct Storage
  • Supports two AMD EPYC™ 7002 or 7003 series processors

Reference # 4B28330

The AceleMax™ DGS-428AS is a high-density 4U system with eight NVIDIA® HGX™ A100 40GB or 80GB SXM4 GPUs and dual AMD EPYC 7002 or 7003 series processors. It supports high-bandwidth GPU peer-to-peer communication via NVIDIA NVLink, direct-attached GPUs over PCIe 4.0, eight NICs for GPU Direct Attach RDMA, and four NVMe drives for GPU Direct Storage, giving customers the parallel computing power to accelerate their digital transformation.

NVIDIA A100 SXM for HGX

The NVIDIA A100 Tensor Core GPU delivers unprecedented acceleration and flexibility to power the world’s highest-performing elastic data centers for AI, data analytics, and HPC applications. As the engine of the NVIDIA data center platform, the A100 GPU provides up to 20X higher performance and 2.5X higher AI performance than V100 GPUs. It can efficiently scale up to thousands of GPUs, or be partitioned into seven isolated GPU instances with the Multi-Instance GPU (MIG) capability, to accelerate workloads of all sizes.

 

The NVIDIA A100 GPU features third-generation Tensor Core technology that supports a broad range of math precisions, providing a unified workload accelerator for data analytics, AI training, AI inference, and HPC. It also introduces three new features: Multi-Instance GPU (MIG), which improves utilization through right-sized GPU partitions with up to seven simultaneous instances per GPU; sparsity acceleration, which exploits sparsity in AI models for up to 2x AI performance; and third-generation NVLink and NVSwitch, which provide 2X more bandwidth than the V100 GPU and scale efficiently enough for multiple GPUs to work as a single large GPU. Accelerating both scale-up and scale-out workloads on one platform enables elastic data centers that can dynamically adjust to shifting application workload demands. This simultaneously boosts throughput and drives down the cost of data centers.

 

Combined with the NVIDIA software stack, the A100 GPU accelerates all major deep learning and data analytics frameworks and over 700 HPC applications. NVIDIA NGC, a hub for GPU-optimized software containers for AI and HPC, simplifies application deployments so researchers and developers can focus on building their solutions.

 

Applications:

AI, HPC, VDI, machine intelligence, deep learning, machine learning, neural networks, advanced rendering, and compute.

System

4U Rackmount

Processors

Dual AMD EPYC™ 7002 or 7003 Processors

GPU

NVIDIA HGX A100 8-GPU, 40GB

Memory

32 DIMM slots, up to 8TB of DDR4-3200 memory

Drives

  • 6x U.2 NVMe (4x via PCIe switch, 2x via CPU)
  • 2x M.2 NVMe

I/O

  • 8x PCIe 4.0 x16 from PCIe switch
  • 2x PCIe 4.0 x16 LP from CPUs
  • AIOM support

Cooling Fans

4x removable heavy-duty fans

Power Supplies

3000W Redundant Power Supplies, Titanium Level

Processor

Dual AMD EPYC™ 7002 or 7003 series processors, 7nm, Socket SP3, with up to 64 cores, 128 threads, and 256MB L3 cache per processor; TDP up to 280W

Memory

  • 32x DDR4 DIMM slots
  • 8-Channel memory architecture
  • Up to 8TB RDIMM/LRDIMM DDR4-3200 memory
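
The memory figures above imply a few derived numbers that are useful when sizing a configuration. The following is a minimal sketch of that arithmetic, not a vendor-supplied calculation. It assumes the 8-channel figure is per processor and that each DDR4 channel has a standard 64-bit (8-byte) data bus; the bandwidth values are theoretical peaks, and all names in the code are illustrative.

```python
# Figures taken from the memory specification above.
DIMM_SLOTS = 32
MAX_MEMORY_TB = 8
SOCKETS = 2
CHANNELS_PER_SOCKET = 8      # assumption: "8-Channel" is per processor
TRANSFER_RATE_MT_S = 3200    # DDR4-3200

# Assumption: standard 64-bit (8-byte) DDR4 data bus per channel.
BYTES_PER_TRANSFER = 8


def memory_summary():
    """Derive per-DIMM capacity and theoretical peak bandwidth."""
    total_channels = SOCKETS * CHANNELS_PER_SOCKET
    per_channel_gb_s = TRANSFER_RATE_MT_S * BYTES_PER_TRANSFER / 1000
    return {
        "dimms_per_channel": DIMM_SLOTS // total_channels,
        "max_dimm_size_gb": MAX_MEMORY_TB * 1024 // DIMM_SLOTS,
        "peak_gb_s_per_channel": per_channel_gb_s,
        "peak_gb_s_per_socket": per_channel_gb_s * CHANNELS_PER_SOCKET,
        "peak_gb_s_system": per_channel_gb_s * total_channels,
    }


if __name__ == "__main__":
    for name, value in memory_summary().items():
        print(f"{name}: {value}")
```

Under these assumptions, reaching the 8TB maximum requires a 256GB module in every slot, with two DIMMs on each memory channel.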

Graphics Processing Unit (GPU):

  • Supports 8 NVIDIA A100 40GB or 80GB SXM4 GPUs
  • Supports PCIe Gen4: 64 GB/sec
  • Third-generation NVIDIA® NVLink®: 600 GB/sec interconnect interface
  • Up to 7 Multi-Instance GPU (MIG) instances
  • Delivers 100% performance for Top applications
  • With eight A100 SXM4 GPUs in a 4U chassis, up to: 55,296 FP32 CUDA cores; 27,648 FP64 CUDA cores; 3,456 Tensor Cores; 77.6 TF peak FP64 double-precision performance; 156 TF peak FP64 Tensor Core double-precision performance; 156 TF peak FP32 single-precision performance; 2,496 TF peak bfloat16 performance; 2,496 TF peak FP16 Tensor Core half-precision performance; 9,984 TOPS peak INT8 Tensor Core inference performance; and 320GB of GPU memory
  • On-board Aspeed AST2600 graphics controller
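
The aggregate GPU figures above are totals across all eight GPUs. As a rough sketch of how they break down, the following divides each total by the GPU count to recover a per-GPU value, and multiplies the per-GPU MIG limit by the GPU count to get the system-wide maximum. The per-GPU values are derived here by simple division rather than quoted from the specification, and the function and variable names are illustrative.

```python
GPU_COUNT = 8
MIG_INSTANCES_PER_GPU = 7

# System-wide totals as listed in the GPU specification above.
AGGREGATE = {
    "FP32 CUDA cores": 55296,
    "FP64 CUDA cores": 27648,
    "Tensor Cores": 3456,
    "FP64 TFLOPS": 77.6,
    "FP64 Tensor Core TFLOPS": 156,
    "FP32 TFLOPS": 156,
    "bfloat16 TFLOPS": 2496,
    "FP16 Tensor Core TFLOPS": 2496,
    "INT8 Tensor Core TOPS": 9984,
    "GPU memory (GB)": 320,
}


def per_gpu(aggregate, gpu_count=GPU_COUNT):
    """Divide each system-wide total evenly across the GPUs."""
    return {name: value / gpu_count for name, value in aggregate.items()}


def max_mig_instances(gpu_count=GPU_COUNT, per_gpu_limit=MIG_INSTANCES_PER_GPU):
    """Upper bound on MIG instances when every GPU is fully partitioned."""
    return gpu_count * per_gpu_limit


if __name__ == "__main__":
    for name, value in per_gpu(AGGREGATE).items():
        print(f"{name}: {value:g} per GPU")
    print(f"Maximum MIG instances in the system: {max_mig_instances()}")
```

The 320GB memory total corresponds to the 40GB GPU option; with 80GB GPUs the same arithmetic would give a different total.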

Expansion Slots

  • 8x PCIe 4.0 x16 from PCIe switch
  • 2x PCIe 4.0 x16 LP from CPUs

Server Management

  • Support for Intelligent Platform Management Interface (IPMI) v2.0
  • IPMI 2.0 with virtual media over LAN and KVM-over-LAN support

Storage

  • 6x U.2 NVMe bays (4 via PCIe switch, 2 via CPU)
  • 2x M.2 NVMe bays

Network Controller

  • Dual RJ45 10GbE-aggregate host LAN
  • 1x GbE management LAN

Power Supply

3000W Titanium level Redundant Power Supplies with PMBus

System Dimension

7.0″ x 17.2″ x 29″ / 178mm x 437mm x 737mm (H x W x D)

Optimized for Turnkey Solutions

Enable powerful design, training, and visualization with built-in software tools including TensorFlow, Caffe, Torch, Theano, BIDMach, cuDNN, NVIDIA CUDA Toolkit, and NVIDIA DIGITS.
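
Before running any of these frameworks, it can help to confirm that all eight GPUs are visible to the operating system. The following is a hedged sketch, not part of the bundled software: it assumes the NVIDIA driver is installed so that the `nvidia-smi` utility is on the PATH, and the helper names `parse_gpu_list` and `list_gpus` are hypothetical.

```python
import shutil
import subprocess

# nvidia-smi query that prints one "name, memory.total" CSV line per GPU.
QUERY = ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"]


def parse_gpu_list(csv_text):
    """Parse 'name, memory' CSV lines into (name, memory) tuples."""
    gpus = []
    for line in csv_text.splitlines():
        line = line.strip()
        if not line:
            continue
        name, _, memory = line.partition(",")
        gpus.append((name.strip(), memory.strip()))
    return gpus


def list_gpus():
    """Return visible GPUs, or an empty list if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(QUERY, capture_output=True, text=True, check=False)
    if result.returncode != 0:
        return []
    return parse_gpu_list(result.stdout)


if __name__ == "__main__":
    gpus = list_gpus()
    print(f"{len(gpus)} GPU(s) visible")
    for name, memory in gpus:
        print(f"  {name}: {memory}")
```

On a fully populated system the script should report eight GPUs; a lower count suggests a driver or hardware issue worth checking before starting a training job.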