Scaling Model Training with Kubernetes at Stripe with Kelley Rivoire

EPISODE 272

About this Episode

Today we're joined by Kelley Rivoire, an engineering manager working on machine learning infrastructure at Stripe.

Kelley and I caught up at a recent Strata Data conference, where she presented the talk "Scaling model training: From flexible training APIs to resource management with Kubernetes." In our conversation, we discuss Stripe's machine learning infrastructure journey, including how the team started with a focus on production systems rather than on answering internal business questions. Kelley also details a few of Stripe's internal tools, including Railyard, an API built to manage model training at scale. Finally, we discuss how end users dealt with the shift to event-based, streaming models.
