Generating Labeled Training Data for Your ML/AI Models with Angie Hugeback

EPISODE 6

SEPTEMBER 29, 2016

About this Episode

My guest this time is Angie Hugeback, who is principal data scientist at Spare5. In this show, Angie and I discuss the real-world practicalities of generating training datasets.

This week's podcast is sponsored by Spare5 (now Mighty AI). Spare5 helps customers generate the high-quality labeled training datasets that are so crucial to accurate machine learning models.

Angie and I talk through the challenges faced by folks who need to label training data, and how to develop a cohesive system for performing the various labeling tasks you're likely to encounter. We discuss some of the ways that bias can creep into your training data and how to avoid that. We explore some of the popular third-party options that companies look at for scaling training data production, and how they differ. Finally, Angie gives us her top 3 tips for folks tasked with generating training data for AI.