Neural Network Quantization and Compression | TWIML - The Voice of Machine Learning & AI

About this Episode

Today we're joined by Tijmen Blankevoort, a staff engineer at Qualcomm who leads the company's compression and quantization research teams. Tijmen was also the CTO of ML startup Scyfer, which he co-founded with Qualcomm colleague Max Welling, whom we spoke with back on episode 267. In our conversation, we discuss the ins and outs of compressing and quantizing ML models, including how much models can actually be compressed and the best ways to achieve it. We also look at the recent "Lottery Ticket Hypothesis" paper and how it factors into this research, along with best practices for training efficient networks. Finally, Tijmen recommends a few algorithms for those interested, including tensor factorization and channel pruning.

About the Guest

Tijmen Blankevoort

Qualcomm
