NVIDIA Tesla V100 IaaS & PaaS: GPU Public Cloud or On-Premises for Deep Learning, Artificial Intelligence & Internet of Things (IoT)?

In a recent blog post, IBM’s John Considine, GM, Cloud Infrastructure Services, IBM Watson and Cloud Platform, announced that NVIDIA Tesla V100 GPU based computing is now available on IBM Public Cloud infrastructure. This is great news for customers who want to get started fast and leverage the benefits of Infrastructure as a Service (IaaS), with instant GPU computing in the cloud, and Platform as a Service (PaaS), with programming tools available instantly without having to set them up. As per NVIDIA’s announcement at Supercomputing 2017, other cloud providers such as Microsoft, Google and Amazon offer or will offer similar GPU cloud services. There is a race to win the Artificial Intelligence market, and it is not surprising that cloud providers are quickly adopting this big shift in computing: the importance of the GPU is growing very rapidly due to its high core count and computational capabilities. As per NVIDIA, the Tesla V100 offers the performance of 100 CPUs in a single GPU, enabling data scientists, researchers, and engineers to tackle challenges that were once impossible.

Benefits of GPU based Public Cloud IaaS & PaaS

There are several benefits of GPU based IaaS & PaaS public cloud offerings:

  • Rapid access and provisioning of GPU resources
  • High & elastic scalability – scale up/down as needed
  • Availability of programming tools integrated with Public Cloud API and management tools
  • Subscription/Pay as You Go model
[Image: IBM’s NVIDIA Tesla V100 GPU based cloud server instances. Courtesy: IBM]

Above is a snapshot of IBM’s NVIDIA Tesla V100 GPU based cloud server instances that can be ordered and consumed instantly after signing up on IBM’s public cloud.

GPU Based Public Cloud Concerns

Just as with CPU based cloud computing infrastructure, we can anticipate a huge debate on public vs private cloud for GPU infrastructure as well. Below are some of the concerns that I see with GPU public cloud adoption:

  • Performance – Every application is different, and those that do not need real-time GPU processing may be more suitable for public cloud infrastructure. For example, data from Amazon Alexa, Google Assistant, Microsoft Cortana, Apple Siri, etc. is well suited to historical analytics and algorithms that run in the cloud. However, some applications cannot tolerate even a little extra latency; for these applications, on-premises GPU infrastructure would make sense.
  • Data ownership & privacy – GPUs have been used to analyze very sensitive data about health, defense, cybersecurity, etc., and there will be concerns about whether or not to use public cloud computing for such data. Anonymized data is more public cloud friendly, but the conclusions drawn from it may not be as useful.
  • ROI over time – Similar to CPU clouds, this is yet to be seen and validated. While it is easy to get started on a public cloud, you need two to three years of metrics to evaluate whether you are paying more than you would for setting up and operating a private GPU cloud.
  • Security – Just like CPU computing, it is arguable whether public cloud is less or more secure than private cloud. The concerns will be more about the type and sensitivity of the data that could be breached, so IT organizations need to make the right decisions.
  • Control – Organizational policies, along with compliance requirements, may dictate where GPU computation needs to happen.
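
To make the ROI question above more concrete, here is a minimal sketch in Python of a break-even estimate between pay-as-you-go cloud GPU usage and an on-premises server. The function name and every dollar figure are hypothetical placeholders chosen for illustration, not actual IBM, NVIDIA or Dihuni pricing, and the model ignores factors such as depreciation, staffing and hardware refresh cycles.

```python
import math


def breakeven_months(cloud_hourly_rate, hours_per_month,
                     onprem_capex, onprem_monthly_opex):
    """Return the first month in which cumulative cloud spend reaches or
    exceeds cumulative on-premises spend, or None if it never does.

    All inputs are illustrative assumptions, not vendor pricing.
    """
    cloud_monthly = cloud_hourly_rate * hours_per_month
    # How much on-premises saves each month once the hardware is bought.
    monthly_saving = cloud_monthly - onprem_monthly_opex
    if monthly_saving <= 0:
        # Cloud is cheaper every month, so the upfront cost is never recovered.
        return None
    return math.ceil(onprem_capex / monthly_saving)


# Hypothetical numbers: $3.00 per GPU-hour in the cloud, versus a
# $60,000 server that costs $500 per month to power and operate.
heavy_use = breakeven_months(3.00, 720, 60_000, 500)   # GPU busy around the clock
light_use = breakeven_months(3.00, 100, 60_000, 500)   # GPU busy ~100 hours a month
print(heavy_use, light_use)
```

With these made-up inputs, round-the-clock usage recovers the hardware cost in a little over three years, while light usage never does. That is why utilization metrics gathered over two to three years matter so much for this decision.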

On-Premises GPU Infrastructure Options

SYS-4028GR-TXRT: Optimized for Deep Learning and Big Data Analytics with up to 8 NVIDIA Tesla P100 GPUs

GPUs are very suitable for autonomous cars, molecular dynamics, computational biology, fluid simulation, CAD, etc., and even for advanced Virtual Desktop Infrastructure (VDI) applications. At Dihuni, we believe that Internet of Things (IoT), Artificial Intelligence (AI), Deep Learning, Machine Learning (ML) and other Big Data and Digital Transformation workloads will need to be processed both on-premises and in the cloud, depending on the application, and we are well positioned to help customers with their GPU server processing requirements. For IoT, we believe GPU processing will be a huge performance driver, with wider adoption in 2018.

For on-premises customers, Dihuni offers NVIDIA Tesla V100, P100, P40, P4 and K80 GPU cards that can be purchased directly at our online store and used with compatible Intel or AMD EPYC servers. Alternatively, customers can procure complete GPU optimized systems such as the Supermicro SYS-4028GR-TVRT with up to 8 NVIDIA Tesla V100 GPUs or the SYS-4028GR-TXRT with up to 8 NVIDIA Tesla P100 GPUs. These servers support NVIDIA NVLink, a high-bandwidth, energy-efficient interconnect that enables ultra-fast communication between GPUs and, on platforms whose CPUs support it, between the CPU and GPU. Our whole line of GPU optimized servers, including those from TYAN, can be found here.

Customers should decide what is best for them based on the type of application they want to use GPU power for. In some cases public cloud will make sense, and in others a private or hybrid (public + private) cloud model will work. We always welcome your thoughts; email us at


Pranay Prakash,

Chief Executive Officer, Dihuni

Dihuni offers Digital Transformation Consulting services and products and we are always eager to learn. We would love to hear from you about your GPU applications. Please e-mail us at or call us. This blog is part of our series related to Digital Transformation/IoT/AI, etc. that we publish on to benefit our visitors. To contribute original articles that can help advance Digital Transformation, please contact us.