Today we’re joined by Subarna Sinha, Machine Learning Engineering Leader at 23andMe.
23andMe handles a massive amount of genomic data every year from its core ancestry business, and it also uses that data for disease prediction, which is the main use case we discuss in our conversation.
Subarna talks us through an initial use case of creating and evaluating polygenic scores, and how that led her team to build an ML pipeline and platform. We talk through the tools and tech stack used to operationalize the platform, the use of synthetic data, the internal pushback that came with the changes being made, and what’s next for her team and the platform.
Thanks to our Sponsor!
Thanks to our friends at Pachyderm for sponsoring the show!
At the end of the day, real-world machine learning is all about the data. You already know this, but manually cleaning and transforming data can be exhausting, inconsistent, and error-prone, and is not the path towards getting your models into production, especially when your data, models, and code are constantly changing.
This is where Pachyderm can help. Pachyderm is an easy-to-use data science platform that lets you productionize your machine learning tasks into fully automated, end-to-end workflows, regardless of language or framework. Pachyderm provides Git-like data versioning and lineage that lets you automatically track every data change and output.
Right now, TWIML listeners can enjoy 20% off of Pachyderm Hub and build production-grade data science workflows in minutes, without ever having to configure a single piece of infrastructure.
Imagine being able to automate your entire data science workflow and reproduce any result from any point in time — in seconds, and with complete confidence.
Head over to pachyderm.com/TWIML to learn more and take advantage of this limited-time offer.
Connect with Subarna!
- Paper: A Generalized Method for the Creation and Evaluation of Polygenic Scores
- Metaflow, a Human-Centric Framework for Data Science with Ville Tuulos
“More On That Later” by Lee Rosevere, licensed under CC BY 4.0