My travel comes in waves centered around the spring and fall conference seasons. A couple of weeks ago, in spite of there being no signs of a true springtime here in St. Louis, things shifted into high gear with me attending the Scaled ML conference at Stanford and Nvidia GTC over the course of a few days. Following me on Twitter is the best way to stay on top of the action as it happens, but for those who missed my live-tweeting, I thought I’d reflect a bit on Nvidia and GTC. (You’ll need to check out my #scaledmlconf tweets for my fleeting thoughts on that one.)
In many ways, Nvidia is the beneficiary of having been in the right place at the right time with regard to AI. It just so happened that (a) a confluence of advances in computing, data, and algorithms led to explosive progress and interest in deep neural networks, and (b) our current approach to training these networks depends pretty heavily on mathematical operations that Nvidia’s graphics cards happened to be really efficient at.
That’s not to say that Nvidia hasn’t executed extremely well once the opportunity presented itself. To their credit, they recognized the trend early and invested heavily in it, before it really made sense for them to do so, besting the “innovator’s dilemma” that’s caused many a great (or formerly great) company to miss out.
Nvidia has really excelled in developing software and ecosystems that take advantage of their hardware and are deeply tailored to the different domains in which it’s being used. This was evidenced in full at GTC 2018, with the company rolling out a number of interesting new hardware, software, application, and ecosystem announcements for its deep learning customers.
A few of the announcements I found most interesting were:
New DGX-2 deep learning supercomputer
After announcing the doubling of the V100 GPU memory to 32GB, Nvidia unveiled the DGX-2, a deep-learning optimized server containing 16 V100s and a new high-performance interconnect called NVSwitch. The DGX-2 delivers 2 petaFLOPS of compute power and offers significant cost and energy savings relative to traditional server architectures. For a challenging representative task like training a FAIRSeq neural machine translation (NMT) model, the DGX-2 completed the task in a day and a half, versus the previous generation DGX-1’s 15 days.
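To put those figures in perspective, here's a quick back-of-the-envelope sketch in Python that uses only the numbers quoted above. The variable names are mine, and the per-GPU figure assumes the 2 petaFLOPS headline number is split evenly across the 16 GPUs, which is my assumption rather than something Nvidia spelled out.

```python
# Figures quoted in the DGX-2 announcement above.
NUM_GPUS = 16
MEMORY_PER_GPU_GB = 32
TOTAL_PETAFLOPS = 2.0
DGX1_TRAINING_DAYS = 15.0
DGX2_TRAINING_DAYS = 1.5

# Aggregate GPU memory across the whole box.
total_gpu_memory_gb = NUM_GPUS * MEMORY_PER_GPU_GB

# Assumption: the headline compute number divides evenly across the GPUs.
teraflops_per_gpu = TOTAL_PETAFLOPS * 1000 / NUM_GPUS

# Speedup on the FAIRSeq training task, DGX-1 versus DGX-2.
training_speedup = DGX1_TRAINING_DAYS / DGX2_TRAINING_DAYS

print(f"Total GPU memory: {total_gpu_memory_gb} GB")
print(f"Compute per GPU:  {teraflops_per_gpu:.0f} teraFLOPS")
print(f"Training speedup: {training_speedup:.0f}x")
```

That works out to 512 GB of GPU memory, 125 teraFLOPS per GPU, and a 10x speedup on the translation task.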
Deep learning inference and TensorRT 4
Inference (using DL models, versus training them) was a big focus area for Nvidia CEO Jensen Huang. During his keynote, Jensen spoke to the rapid increase in complexity of AI models and offered a mnemonic for thinking about the needs of inference systems both in the datacenter and at the edge: PLASTER, for Programmability, Latency, Accuracy, Size, Throughput, Energy Efficiency, and Rate of Learning. To meet these needs, he announced the release of TensorRT 4, the latest version of Nvidia’s software for optimizing inference performance on its GPUs.
The new version of TensorRT has been integrated with TensorFlow and also includes support for the ONNX deep learning interoperability framework, allowing it to be used with models developed with the PyTorch, Caffe2, MXNet, CNTK, and Chainer frameworks. The new version’s performance was highlighted, including an 8x increase in TensorFlow performance when used with TensorRT 4 vs. TensorFlow alone and 45x higher throughput vs. CPUs for certain network architectures.
New Kubernetes support
Kubernetes (K8s) is an open source platform for orchestrating workloads on public and private clouds. It came out of Google and is growing very rapidly. While the majority of Kubernetes deployments are focused on web application workloads, the software has been gaining popularity among deep learning users. (Check out my interviews with Matroid’s Reza Zadeh and OpenAI’s Jonas Schneider for more.)
To date, working with GPUs in Kubernetes has been pretty frustrating. According to the official K8s docs, “support for NVIDIA GPUs was added in v1.6 and has gone through multiple backwards incompatible iterations.” Yikes! Nvidia hopes its new GPU Device Plugin (confusingly referred to as “Kubernetes on GPUs” in Jensen’s keynote) will allow workloads to more easily target GPUs in a Kubernetes cluster.
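For a sense of what targeting GPUs looks like in practice, here's a minimal sketch in Python that builds a Pod manifest requesting GPUs. With the device plugin installed, GPUs are exposed to the scheduler as the `nvidia.com/gpu` resource. The pod name, the image URL, and the `gpu_pod_manifest` helper are hypothetical placeholders of mine, not anything shown at GTC.

```python
import json


def gpu_pod_manifest(name, image, gpus):
    """Build a Pod manifest whose single container requests `gpus` Nvidia GPUs.

    `name` and `image` are placeholders supplied by the caller.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "OnFailure",
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # GPUs are requested as a resource limit exposed by the plugin.
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


# Hypothetical pod name and image, for illustration only.
manifest = gpu_pod_manifest(
    "dl-training", "example.com/my-training-image:latest", gpus=2
)
print(json.dumps(manifest, indent=2))
```

Since kubectl accepts JSON as well as YAML, the printed output could be saved to a file and applied with `kubectl apply -f`, assuming the cluster has GPU nodes with the plugin running.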
New applications: Project Clara and DRIVE Sim
Combining its strengths in both graphics and deep learning, Nvidia shared a couple of interesting new applications it has developed. Project Clara is able to create rich cinematic renderings of medical imagery, allowing doctors to more easily diagnose medical conditions. Amazingly, it does this in the cloud using deep neural networks to enhance traditional images, without requiring updates to the three million imaging instruments currently installed at medical facilities.
DRIVE Sim is a simulation platform for self-driving cars. There have been many efforts to train deep learning models for self-driving cars using simulation, including using commercial games like Grand Theft Auto. (In fact, the GTA publisher has shut several of these efforts down for copyright reasons.) Training a learning algorithm on synthetic roads and cityscapes hasn’t been the big problem though. Rather, the challenge has been that models trained on synthetic roads haven’t generalized well to the real world.
I spoke to Nvidia chief scientist Bill Dally about this and he says they’ve seen good generalization by incorporating a couple of techniques proven out in their research, namely by combining real and simulated data in the training set and by using domain adaptation techniques, including this one from NIPS 2017 based on coupled GANs. (See also the discussion around a related Apple paper presented at the very first TWIML Online meetup.)
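To make the first of those techniques concrete, here's a rough sketch in Python of drawing training batches that mix real and simulated examples at a fixed ratio. This is purely my illustration, not Nvidia's pipeline: the `mixed_batches` helper, the `real_fraction` knob, and the toy data are all hypothetical, and the sketch doesn't touch the domain adaptation side at all.

```python
import random


def mixed_batches(real, simulated, batch_size, real_fraction, seed=0):
    """Yield batches combining real and simulated examples at a fixed ratio.

    `real_fraction` is a hypothetical knob: the share of each batch that is
    drawn from the real-world examples.
    """
    rng = random.Random(seed)
    n_real = round(batch_size * real_fraction)
    n_sim = batch_size - n_real
    while True:
        # Sample without replacement within a batch, then shuffle so the
        # two sources are interleaved.
        batch = rng.sample(real, n_real) + rng.sample(simulated, n_sim)
        rng.shuffle(batch)
        yield batch


# Toy stand-ins for labeled driving frames, tagged by where they came from.
real_frames = [("real", i) for i in range(100)]
sim_frames = [("sim", i) for i in range(1000)]

batch = next(mixed_batches(real_frames, sim_frames, batch_size=8, real_fraction=0.25))
print(batch)
```

Each batch here contains two real frames and six simulated ones. In practice the right ratio would presumably be something to tune.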
Impressively, for as much as Nvidia announced for the deep learning user, the conference and keynote also had a ton to offer their graphics, robotics and self-driving car users, as well as users from industries like healthcare, financial services, oil and gas, and others.
Nvidia is not without challengers in the deep learning hardware space, as I’ve previously written, but the company seems to be doing all the right things. I’m already looking forward to next year’s GTC and seeing what the company is able to pull off in the next twelve months.