![](https://static.wixstatic.com/media/424cb5_5f0671c63bf64ebbaa880682815cecf6~mv2.png/v1/fill/w_980,h_551,al_c,q_90,usm_0.66_1.00_0.01,enc_auto/424cb5_5f0671c63bf64ebbaa880682815cecf6~mv2.png)
Jensen Huang, in a display of stamina reminiscent of Taylor Swift’s dynamic performances, held the attention of 18,000 people in the stadium during his solo two-hour keynote. Just as Swift captivates her audience with glittering outfits and dazzling choreography, Huang takes the stage in his iconic black leather jacket to unveil the latest and greatest innovations. Title credit goes to @Jim Fan
Sharing a few GTC 2024 takeaways. Part I: NVIDIA is moving up the tech infrastructure stack.
The company’s goal isn’t to edge out Independent Software Vendors (ISVs), AI model developers, or cloud service providers. Instead, NVIDIA aims to foster a foundry-like ecosystem, one that thrives on open source, APIs, and SDKs to turbocharge the implementation of Generative AI (GenAI) across a myriad of industries, sectors, and use cases, a space multiplying at an N² rate.
During his keynote, Jensen Huang unveiled NVIDIA Inference Microservices (NIM), a collection of optimized cloud-native microservices that simplify the deployment of GenAI models across clouds, data centers, and GPU-accelerated workstations. NIM is a turnkey solution: a set of pre-built, containerized inference microservices that contains everything an organization needs to get started with model deployment.
Why does this matter? As more organizations shift their focus from proofs of concept to production deployments of AI models, managing training, inference, latency, throughput, monitoring, security, and more, all at scale, is no small feat. NIM abstracts away the complexities of AI model development and packaging for production behind industry-standard APIs, accelerating developer adoption at scale and, as a result, spurring demand for more GPUs.
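To make the “industry-standard APIs” point concrete, here is a minimal sketch of what querying a self-hosted NIM container might look like, assuming it exposes an OpenAI-style chat-completions endpoint on localhost. The port, route, and model identifier below are illustrative assumptions, not documented values; check NVIDIA’s docs for the specifics of any given NIM.

```python
# Minimal sketch: querying a locally deployed NIM microservice through an
# OpenAI-compatible REST API. Host, port, route, and model name are assumed
# for illustration only.
import requests

NIM_ENDPOINT = "http://localhost:8000/v1/chat/completions"  # assumed local deployment

payload = {
    "model": "meta/llama-2-70b-chat",  # hypothetical model identifier
    "messages": [
        {"role": "user", "content": "Summarize the benefits of containerized inference."}
    ],
    "max_tokens": 256,
    "temperature": 0.2,
}

response = requests.post(NIM_ENDPOINT, json=payload, timeout=60)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```

The appeal of this design is that nothing here is NVIDIA-specific from the developer’s point of view: any client that already speaks the OpenAI-style API can swap in a NIM endpoint with a one-line URL change.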
The goal is to make all of these APIs, whether created by NVIDIA or its ecosystem, free, while they run exclusively on NVIDIA GPUs. There you have it: there is no such thing as a free lunch. Still, companies that put developer experience first by stripping away complexity and enabling fast time-to-market have the right ingredients for success in this rapidly evolving landscape.
NIM is available as part of the NVIDIA AI Enterprise license. To get started, explore the free resources at ai.nvidia.com.
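As a hedged illustration of those free resources: the hosted endpoints in NVIDIA’s API catalog can be called with the standard OpenAI Python client, since they expose the same API shape. The base URL, key format, and model id below are assumptions to verify against ai.nvidia.com before use.

```python
# Sketch: trying a hosted NIM endpoint from NVIDIA's API catalog with the
# OpenAI Python client. Base URL, key prefix, and model id are assumptions
# based on the catalog at ai.nvidia.com; confirm current values on the site.
from openai import OpenAI

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",  # assumed catalog endpoint
    api_key="nvapi-...",  # personal key generated on ai.nvidia.com
)

completion = client.chat.completions.create(
    model="meta/llama-2-70b-chat",  # hypothetical catalog model id
    messages=[{"role": "user", "content": "What is a NIM microservice?"}],
    max_tokens=128,
)
print(completion.choices[0].message.content)
```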
Another announcement is DGX Cloud, a hosting service that gives enterprises access to the software, hardware, and services needed to train advanced GenAI models. In today’s as-a-service world, this is great for organizations looking for an easier way to get started with AI implementations.
![](https://static.wixstatic.com/media/424cb5_0b41867b242e4a8d9d675535eea05682~mv2.png/v1/fill/w_732,h_412,al_c,q_85,enc_auto/424cb5_0b41867b242e4a8d9d675535eea05682~mv2.png)
For example, researchers at Amgen, a biotechnology company, can focus on biologic therapeutic discovery instead of having to manage AI infrastructure and machine learning engineering. Additionally, customers get dedicated clusters of GPUs, a critical asset as the thirst for GPU resources shows no signs of waning.
ServiceNow is using DGX Cloud with an on-premises setup for flexible, scalable hybrid-cloud AI supercomputing that helps power its AI research on large language models, code generation, and business analysis.
I couldn’t help but wonder about the reaction from existing infrastructure-as-a-service (IaaS) providers. According to the NVIDIA team, the company is actively working with cloud service providers to host DGX Cloud infrastructure. Oracle Cloud Infrastructure (OCI) has taken the lead, while Microsoft Azure is gearing up to begin hosting DGX Cloud in the second half of 2024.