The NVIDIA® H100 NVL Tensor Core GPU is the most optimized platform for LLM inference, with its high compute density, high memory bandwidth, high energy efficiency, and unique NVLink architecture. It also delivers unprecedented acceleration to power the world's highest-performing elastic data centers for AI, data analytics, and high-performance computing (HPC) applications. NVIDIA H100 NVL Tensor Core technology supports a broad range of math precisions, providing a single accelerator for every compute workload: double precision (FP64), single precision (FP32), half precision (FP16), 8-bit floating point (FP8), and 8-bit integer (INT8).
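To make the precision trade-off concrete, the following is a minimal Python sketch (standard library only) showing why lower precisions save memory and bandwidth but lose accuracy. It uses the `struct` module's IEEE 754 half-precision format code `"e"` to round-trip a value through FP16; FP8 has no standard-library representation, so only FP16 is illustrated here. The function names are hypothetical, chosen for this example.

```python
import struct

def to_fp16_bytes(x: float) -> bytes:
    # Pack a Python float (FP64 internally) into IEEE 754 half precision,
    # little-endian. FP16 occupies 2 bytes vs. 4 for FP32 and 8 for FP64,
    # which is why lower precisions cut memory footprint and bandwidth.
    return struct.pack("<e", x)

def from_fp16_bytes(b: bytes) -> float:
    # Unpack the 2-byte half-precision value back to a Python float.
    return struct.unpack("<e", b)[0]

# FP16 payload is half the size of FP32.
assert len(to_fp16_bytes(1.0)) == 2

# The cost of lower precision: 0.1 is not exactly representable in FP16,
# so a round trip lands on the nearest representable value.
print(from_fp16_bytes(to_fp16_bytes(0.1)))  # 0.0999755859375
```

The same trade-off motivates running inference in FP16, FP8, or INT8 on hardware that supports them: smaller values mean more throughput per unit of memory bandwidth, at some cost in numeric fidelity.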
Note: this GPU must be installed in a compatible server. Popular H100 NVL servers include:
8-GPU PCIe H100 NVL Server (EPYC)