Today we’re joined by Tijmen Blankevoort, a staff engineer at Qualcomm, who leads their compression and quantization research teams.
Tijmen was also the CTO at ML startup Scyfer, which he co-founded with Qualcomm colleague Max Welling, who we spoke with back on episode 267. In our conversation with Tijmen, we discuss the ins and outs of compression and quantization of ML models, including how much models can actually be compressed and the best way to achieve that compression. We also look at the recent “Lottery Ticket Hypothesis” paper and how it factors into this research, as well as best practices for training efficient networks. Finally, Tijmen recommends a few algorithms for those interested, including tensor factorization and channel pruning.
Thanks to our Sponsor
A hearty thanks to our friends at Qualcomm for once again sponsoring today’s show! As you’ll hear in the conversation with Tijmen, Qualcomm has been actively involved in AI research for well over a decade, leading to advances in power-efficient on-device AI through efficient neural network quantization and compression, and more. Of course, Qualcomm AI powers some of the latest and greatest Android devices with their Snapdragon chipset family. From this strong foundation in the mobile chipset space, Qualcomm now has the goal of scaling AI across devices and making AI ubiquitous.
To learn more about what Qualcomm is up to, including their AI research, platforms and developer tools, visit twimlai.com/qualcomm.
From the Interview
- Paper: Relaxed Quantization for Discretized Neural Networks
- Paper: Data-Free Quantization through Weight Equalization and Bias Correction
- Paper: EigenDamage: Structured Pruning in the Kronecker-Factored Eigenbasis
- Paper: MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
Many of you are aware that we’ve been hosting a couple of paper-reading meetups in conjunction with the podcast. I’m excited to share that Matt Kenney, Duke staff researcher and long-time listener and friend of the show, has stepped up to help take this group to the next level. The paper reading meetup will now be meeting every other Sunday at 1 PM Eastern Time to dissect the latest and greatest academic research papers in ML and AI. If you want to take your understanding of the field to the next level, check twimlai.com/meetup for more upcoming community events.
We’ve also got a few study groups currently running: one working through the fast.ai Deep Learning from the Foundations course, another on fast.ai Natural Language Processing, and a third working through the Stanford cs224n Deep Learning for Natural Language Processing course. These study groups will be working on these courses through October and November, so it’s not too late to join. Sign up on the meetup page at twimlai.com/meetup.
“More On That Later” by Lee Rosevere, licensed under CC BY 4.0