Sale!

Lenovo 7C57A02892 NVIDIA Tesla P4 GPU 8GB GDDR5 Pascal CUDA PCIe x16 for Inference Acceleration Deep Learning & Artificial Intelligence

$3,423.00

(Add to cart to Buy / Request Quote)

Out of stock

Request Formal Quote, Volume Pricing, Stock or Product Information

  • Competitor Match/Beat on Custom Servers and Select Products (send competitor quote)
  • Leasing Options Available (requires 5 years business operations)
  • Purchase Orders Accepted / Net Terms subject to approval
  • Custom Servers - Configure Below, Add to Cart and Request Quote for formal pricing

The NVIDIA Tesla P4 is powered by the revolutionary NVIDIA Pascal architecture and purpose-built to boost efficiency for scale-out servers running deep learning workloads, enabling smart, responsive AI-based services. It slashes inference latency by 15X in any hyperscale infrastructure and provides an incredible 60X better energy efficiency than CPUs. This unlocks a new wave of AI services previously impossible due to latency limitations.

As a Lenovo Partner and NVIDIA Preferred Solution Provider, we are authorized by the manufacturer and proudly deliver only original factory packaged products.

Key Features

  • Designed specifically for Lenovo servers, sold under a Lenovo part number and supported by Lenovo
  • Small form-factor, 50/75-Watt design
  • Passively cooled board
  • 8 GB GDDR5 memory
  • INT8 operations slash latency by 15X.
  • Delivers 22 TOPS (tera-operations per second) of inference performance
  • Hardware-decode engine capable of transcoding and inferencing 35 HD video streams in real time.
  • Manufacturer’s Part Number: 7C57A02892

The NVIDIA® Tesla® P4 is a single-slot, low-profile, 6.6 inch PCI Express Gen3 GPU Accelerator with an NVIDIA® Pascal™ graphics processing unit (GPU). The Tesla P4 has 8 GB GDDR5 memory and a 75 W maximum power limit. It is offered as a 75 W or 50 W passively cooled board that requires system air flow to keep the card within thermal limits. The Tesla P4 features optimized INT8 instructions aimed at deep learning inference computations. As a result, it delivers 22 TOPS (tera-operations per second) of inference performance, enabling smart, responsive artificial intelligence (AI)-based services. For performance optimization, this board uses NVIDIA GPU Boost™, which dynamically adjusts the GPU clock to maximize performance within thermal limits.

Responsive Experience with Real-Time Inference

Responsiveness is key to user engagement for services such as interactive speech, visual search, Internet of Things (IoT) and video recommendations. As models increase in accuracy and complexity, CPUs are no longer capable of delivering a responsive user experience. The Tesla P4 delivers 22 TOPS of inference performance with INT8 operations, slashing latency by 15X.

50x Higher Throughput to Keep Up with Expanding Workloads

The volume of data generated every day in the form of sensor logs, images, videos, and records is economically impractical to process on CPUs. Volta-powered Tesla V100 GPUs give data centers a dramatic boost in throughput for deep learning workloads to extract intelligence from this tsunami of data. A server with a single Tesla V100 can replace up to 50 CPU-only servers for deep learning inference workloads, so you get dramatically higher throughput with lower acquisition cost.

A Dedicated Decode Engine for New AI-based Video Services

The Tesla P4 GPU can analyze up to 39 HD video streams in real time, powered by a dedicated hardware-accelerated decode engine that works in parallel with the NVIDIA® CUDA® cores performing inference. By integrating deep learning into the video pipeline, customers can offer new levels of smart, innovative video services that facilitate video search and other video-related services.

Unprecedented Efficiency for Low-Power Scale-out Servers

The ultra-efficient Tesla P4 GPU accelerates density-optimized scale-out servers with a small form factor and 50/75 W power footprint design. It delivers an incredible 52X better energy efficiency than CPUs for deep learning inference workloads so that hyperscale customers can scale within their existing infrastructure and service the exponential growth in demand for AI-based applications.

Faster Deployment With NVIDIA TensorRT™ and DeepStream SDK

NVIDIA TensorRT is a high-performance neural network inference engine for production deployment of deep learning applications. It includes libraries to streamline deep learning models for production deployment, taking trained neural nets—usually in 32-bit or 16-bit data—and optimizing them for reduced-precision INT8 operations on Tesla P4, or FP16 on Tesla V100. NVIDIA DeepStream SDK taps into the power of Tesla GPUs to simultaneously decode and analyze video streams.
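To make the idea of reduced-precision INT8 inference more concrete, here is a minimal sketch of symmetric INT8 quantization applied to a single dot product. It is an illustration only and does not use the TensorRT API: the function names (quantize_int8, int8_dot) and the per-tensor max-absolute-value scaling are assumptions chosen for simplicity, and production calibration is typically more elaborate.

```python
def quantize_int8(values, scale):
    """Map floats to signed 8-bit integers that share one scale factor."""
    # Clamp to [-127, 127] so every quantized value fits in a signed byte.
    return [max(-127, min(127, round(v / scale))) for v in values]


def int8_dot(weights, activations):
    """Approximate a float dot product using 8-bit integer operands."""
    # Hypothetical scaling rule: the largest magnitude maps to 127.
    # "or 1.0" avoids a zero scale when a tensor is all zeros.
    w_scale = (max(abs(w) for w in weights) / 127) or 1.0
    a_scale = (max(abs(a) for a in activations) / 127) or 1.0

    q_weights = quantize_int8(weights, w_scale)
    q_activations = quantize_int8(activations, a_scale)

    # Multiply 8-bit values and accumulate in a wider integer,
    # then rescale the integer result back to floating point.
    accumulator = sum(w * a for w, a in zip(q_weights, q_activations))
    return accumulator * w_scale * a_scale


weights = [0.5, -1.0, 0.25, 0.75]
activations = [2.0, 1.0, -4.0, 0.5]

exact = sum(w * a for w, a in zip(weights, activations))
approx = int8_dot(weights, activations)
print(f"FP32 result: {exact:.4f}, INT8 approximation: {approx:.4f}")
```

The INT8 result differs slightly from the full-precision one; the trade-off is a small loss of accuracy in exchange for cheaper arithmetic on hardware with INT8 instructions.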

CUDA Ready

CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphics processing units (GPUs). With CUDA, developers are able to dramatically speed up computing applications by harnessing the power of GPUs.

In GPU-accelerated applications, the sequential part of the workload runs on the CPU, which is optimized for single-threaded performance, while the compute-intensive portion of the application runs on thousands of GPU cores in parallel. When using CUDA, developers program in popular languages such as C, C++, Fortran, Python and MATLAB and express parallelism through extensions in the form of a few basic keywords.

The CUDA Toolkit from NVIDIA provides everything you need to develop GPU-accelerated applications. The CUDA Toolkit includes GPU-accelerated libraries, a compiler, development tools and the CUDA runtime.
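As a rough illustration of the programming model described above, the sketch below emulates on the CPU how a CUDA-style kernel maps one thread to one data element. It is plain Python, not CUDA code, and runs the "threads" one after another rather than in parallel; the names saxpy_kernel and launch are hypothetical helpers invented for this example.

```python
def saxpy_kernel(block_idx, block_dim, thread_idx, n, a, x, y, out):
    """Per-thread work: each thread computes a single output element."""
    # Global index, computed the way a CUDA kernel typically derives it
    # from its block index, block size and thread index.
    i = block_idx * block_dim + thread_idx
    if i < n:  # Guard: the last block may have more threads than elements.
        out[i] = a * x[i] + y[i]


def launch(kernel, grid_dim, block_dim, *args):
    """Sequential stand-in for a kernel launch over a 1-D grid of blocks."""
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            kernel(block_idx, block_dim, thread_idx, *args)


n = 10
x = [float(i) for i in range(n)]
y = [1.0] * n
out = [0.0] * n

block_dim = 4
grid_dim = (n + block_dim - 1) // block_dim  # Round up to cover all n elements.

launch(saxpy_kernel, grid_dim, block_dim, n, 2.0, x, y, out)
print(out)
```

On a GPU, the two loops in launch would be replaced by hardware scheduling, with every (block, thread) pair executing the kernel body concurrently.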

Performance Specifications for NVIDIA Tesla P4, P40 and V100 Accelerators

| Specification | Tesla V100: The Universal Datacenter GPU | Tesla P4 for Ultra-Efficient Scale-Out Servers | Tesla P40 for Inference Throughput Servers |
|---|---|---|---|
| Single-Precision Performance (FP32) | 14 teraflops (PCIe); 15.7 teraflops (SXM2) | 5.5 teraflops | 12 teraflops |
| Half-Precision Performance (FP16) | 112 teraflops (PCIe); 125 teraflops (SXM2) | - | - |
| Integer Operations (INT8) | - | 22 TOPS* | 47 TOPS* |
| GPU Memory | 16 GB HBM2 | 8 GB | 24 GB |
| Memory Bandwidth | 900 GB/s | 192 GB/s | 346 GB/s |
| System Interface/Form Factor | Dual-Slot, Full-Height PCI Express Form Factor; SXM2 / NVLink | Low-Profile PCI Express Form Factor | Dual-Slot, Full-Height PCI Express Form Factor |
| Power | 250 W (PCIe); 300 W (SXM2) | 50 W/75 W | 250 W |
| Hardware-Accelerated Video Engine | - | 1x Decode Engine, 2x Encode Engines | 1x Decode Engine, 2x Encode Engines |

*Tera-Operations per Second with Boost Clock Enabled
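The table's INT8 and power figures can be combined into a rough peak-throughput-per-watt comparison, sketched below. This is back-of-the-envelope arithmetic on the listed specifications, not a measured result, and it assumes the Tesla P4's 22 TOPS figure corresponds to the 75 W configuration, which the table does not state.

```python
# Peak INT8 throughput and board power as listed in the table above.
# Assumption: the Tesla P4 figure is paired with its 75 W configuration.
specs = {
    "Tesla P4": {"int8_tops": 22, "max_power_w": 75},
    "Tesla P40": {"int8_tops": 47, "max_power_w": 250},
}

# Peak tera-operations per second delivered per watt of board power.
tops_per_watt = {
    name: s["int8_tops"] / s["max_power_w"] for name, s in specs.items()
}

for name, value in tops_per_watt.items():
    print(f"{name}: {value:.3f} INT8 TOPS per watt (peak)")
```

By this simple measure the Tesla P4 delivers more peak INT8 operations per watt, while the Tesla P40 delivers more total throughput per card.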

Supported Lenovo Servers

Weight 8 lbs
Dimensions 10.7 × 4.4 × 1 in
