Snorkel: A System for Fast Training Data Creation with Alex Ratner

This Week in Machine Learning & AI

Today we’re joined by Alex Ratner, Ph.D. student at Stanford, to discuss his work on Snorkel, a framework for creating training data with weakly supervised learning techniques.

With Snorkel, Alex and his team hope to tackle the ever-present shortage of large labeled training data sets by having users instead write a set of labeling functions, scripts that programmatically label data. In our conversation, we discuss the original inspiration for Snorkel and some of the projects they’ve undertaken since its inception. We also discuss some of the papers presented at various conferences that used Snorkel for training data, including Kunle Olukotun’s “Software 2.0” presentation that we broke down in our 2018 NeurIPS series.
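To make the idea of a labeling function concrete, here is a minimal sketch in plain Python. It does not use Snorkel's actual API: the function names, the keyword heuristics, and the sentiment task are all hypothetical, and the simple majority vote that combines the noisy votes is a simplification chosen for illustration, not the method discussed in the interview.

```python
from collections import Counter

# Label values (an assumption for this sketch): each labeling function
# may also abstain when its heuristic does not apply.
ABSTAIN, NEGATIVE, POSITIVE = -1, 0, 1


def lf_contains_great(text):
    """Hypothetical heuristic: the word 'great' suggests a positive example."""
    return POSITIVE if "great" in text.lower() else ABSTAIN


def lf_contains_terrible(text):
    """Hypothetical heuristic: the word 'terrible' suggests a negative example."""
    return NEGATIVE if "terrible" in text.lower() else ABSTAIN


def lf_ends_with_exclamation(text):
    """Hypothetical heuristic: a trailing '!' weakly suggests a positive example."""
    return POSITIVE if text.strip().endswith("!") else ABSTAIN


LABELING_FUNCTIONS = [lf_contains_great, lf_contains_terrible, lf_ends_with_exclamation]


def majority_vote(text, lfs=LABELING_FUNCTIONS):
    """Combine the non-abstaining votes; abstain on ties or when nothing fires."""
    votes = [vote for vote in (lf(text) for lf in lfs) if vote != ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN  # conflicting heuristics with equal support
    return ranked[0][0]


if __name__ == "__main__":
    for example in ["This show was great!", "A terrible episode.", "It was fine."]:
        print(example, "->", majority_vote(example))
```

Each function encodes one cheap, imperfect rule, and examples that no rule covers are simply left unlabeled instead of being guessed at.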

Celebrate with us!

In case you missed last week’s show, we just celebrated our third birthday and our 5 millionth download! To keep the party going, we want to hear YOUR TWIML story! In our conversations with listeners over time, we’ve heard quite a few stories detailing what they’ve learned from the podcast, and how they’ve applied it to their own work, research, philosophy, and life, and now we want to hear from you! Leave a comment here or leave a voicemail at 1-636-735-3658, letting us know your favorite tidbit that you’ve been able to take from the podcast, and how you’ve applied that to what you do. Everyone who joins in will be sent a limited edition 3rd birthday TWIML sticker, and the best submissions have a chance to be featured in an upcoming episode of the show.

Thanks to our Sponsor!


Before we dive in, I’d like to send a giant thanks to our friends over at SigOpt. They’ve been huge supporters of my work in this area, and I’m excited to have them as a sponsor of this series of shows on ML and AI Platforms. If you don’t know SigOpt, I spoke with their CEO Scott Clark back on show #50. Their software is used by enterprise teams to standardize and scale machine learning experimentation and optimization across any combination of modeling frameworks, libraries, computing infrastructure and environment. Teams like Two Sigma, who we’ll hear from later in this series, rely on SigOpt’s software to realize better modeling results much faster than previously possible. Of course, to fully grasp its potential it is best to try it yourself. This is why SigOpt is offering you, the TWIML community, an exclusive opportunity to try their product on some of your toughest modeling problems for free. To take advantage of this offer, visit twimlai.com/sigopt!

About Alex

Mentioned in the Interview

“More On That Later” by Lee Rosevere, licensed under CC BY 4.0
