On today’s show I chat with Song Han, assistant professor in MIT’s EECS department, about his research on Deep Gradient Compression.
Subscribe: iTunes / SoundCloud / Google Play / Stitcher / RSS
In our conversation, we explore the challenge of distributed training for deep neural networks and the idea of compressing the gradient exchange so it can be done more efficiently. Song details the evolution of distributed training systems based on this idea and gives a few examples of centralized and decentralized distributed training architectures, such as Uber’s Horovod, as well as the approaches native to PyTorch and TensorFlow. Song also addresses potential issues that arise with distributed training, such as loss of accuracy and generalizability, and much more.
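To give a flavor of the idea, here’s a minimal sketch of gradient sparsification, the core trick behind compressing the gradient exchange: each worker sends only the largest-magnitude entries of its gradient and accumulates the rest locally as a residual for later steps. This is a simplified illustration, not Song’s actual Deep Gradient Compression implementation (which adds momentum correction, clipping, and warm-up); the function name and threshold ratio are made up for the example.

```python
import numpy as np

def sparsify_gradient(grad, ratio=0.01):
    """Keep only the largest-magnitude `ratio` fraction of gradient
    entries. Returns (indices, values) to exchange over the network,
    plus the residual a worker would accumulate locally."""
    flat = grad.ravel()
    k = max(1, int(flat.size * ratio))
    # Indices of the k largest-magnitude entries
    idx = np.argpartition(np.abs(flat), -k)[-k:]
    values = flat[idx]
    # Everything not sent stays behind as a residual
    residual = flat.copy()
    residual[idx] = 0.0
    return idx, values, residual.reshape(grad.shape)

# Example: a 1000-element gradient, exchange only the top 1%
grad = np.random.randn(1000)
idx, values, residual = sparsify_gradient(grad, ratio=0.01)
print(len(idx))  # 10 values exchanged instead of 1000
```

With a 1% ratio, each worker ships roughly a hundredth of the gradient traffic per step, which is what makes training across many nodes (or slow networks) practical.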
Happy Birthday 2018!!
A few weeks ago, at the TWiML AI Summit, I spoke with a listener who shared some interesting ways that his business, a world-leading energy company, has directly benefited from what he’s learned on the podcast. It’s been incredibly exciting to hear stories like this from listeners. To celebrate our second anniversary, we’d really like to hear from you: how the podcast has helped you at work, in school, or in transitioning between the two; how it’s helped you find or connect with resources you’ve found valuable; or how it’s educated you about something new.
You can submit written comments at twimlai.com/2av, or call us at (636) 735-3658 and leave a voicemail with your story (include your name/email; we’ll edit it out!).
Interested in the fast.ai Deep Learning course?
Starting this Friday, I’ll be working through the fast.ai Practical Deep Learning for Coders course, and in turn I’m organizing a study and support group via the Meetup. This is a great course, and fast.ai co-founder Jeremy Howard encouraged our group on Twitter, noting that groups that take the course together have a higher success rate, so let’s do this!
Three simple steps to join:
1. Sign up for the Meetup, noting fast.ai in the “What you hope to learn” box
2. Use the email invitation you’ll receive to join our Slack group, and
3. Once you’re there, join the #fast_ai channel.
Mentioned in the Interview
- Deep Gradient Compression
- Scaling Machine Learning at Uber with Mike Del Balso – Talk #115
- Stochastic Gradient Descent
- Data Parallelism
- Exploding Gradients Explained
- Join us in celebrating our 2nd Birthday!
- TWiML Presents: Series page
- TWiML Events Page
- TWiML Meetup
- TWiML Newsletter
“More On That Later” by Lee Rosevere licensed under CC By 4.0