Reinforcement Learning Deep Dive with Pieter Abbeel

The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Our second guest in the Industrial AI series is Pieter Abbeel, Assistant Professor at UC Berkeley, Research Scientist at OpenAI and Cofounder of Gradescope. 

Pieter has an extensive background in AI research, going way back to his days as Andrew Ng’s first PhD student at Stanford. His research today is focused on deep learning for robotics. During this conversation, Pieter and I really dig into reinforcement learning, which is a technique for allowing robots (or AIs) to learn through their own trial and error.

Nerd Alert!!

This conversation explores cutting-edge research with one of the leading researchers in the field and, as a result, it gets pretty technical at times. I try to uplevel it when I can keep up myself, so hang in there. I promise that you’ll learn a ton if you stick with it.

About the Industrial AI Series

This show is part of our Industrial AI series. I’ve mentioned my interest in industrial applications of machine learning & AI a few times on the podcast. I’ve been doing some research in the area, and I’m very close to publishing a special report on the topic. If you’re interested in learning more about this project, the report, or the podcast series as a whole, check out our Industrial AI page.

Thanks to Our Sponsor

A big thank you to Bonsai, who is supporting my work in this area. I’ve been following them since their launch just over a year ago, and I’ve been impressed with their team and technology. If you’re building AI-powered applications to optimize and control enterprise systems, check them out. Please let them know you appreciate their support of the podcast.

O’Reilly AI Meetup

I’m planning to attend the O’Reilly AI conference in my hometown, New York City, at the end of June. Please let me know if you’ll be there as well. I’m looking forward to connecting with TWIML listeners at the conference! We’re currently planning a community meetup during the event, and I’ll share details as soon as they’re ironed out. We hope you can join us at the conference. All of our listeners get 20% off of registration fees when purchasing passes for the conference using the code PCTWIML. For registration information, visit the event page!

About Pieter

Mentioned in the Interview

  • Jay Salmonson

    An excellent, informative interview! It would be great to have links to references (papers or otherwise) for a couple of the topics discussed therein. Two in particular: an example of the use of TensorFlow to model a larger system comprising a neural net plus a plan including knowledge of the physics equations. Also, a link to a recommended paper on policy gradients would be helpful.

    Thanks again for the great interview!

  • Anthony Milbourne

    I have been listening to the podcast since about episode 5 and I really enjoy it – Thanks Sam. Although the technical details and maths often go over my head, I would like to hear more of the deep(ish) dive discussions (like this episode). The discussion of networks with enforced ignorance was very interesting; I can’t quite get my brain around how you would train one head of a network to distinguish something while optimising the network as a whole to make that thing indistinguishable! More reading needed I guess… Thanks again.

  • Lydia Schulman

    This is a great interview that lucidly treats cutting-edge research while along the way explaining fundamentals of reinforcement learning and related fields (with no math). The format is a plus in that the interviewer’s occasional questions and restatements of difficult concepts give us time to digest the material and “get it” before moving on to the next intriguing idea.

  • Massimo

    What a great interview. Loved it.
    My favourite quote was your summary, Sam: “The technology is ready but there’s still a shortage of expertise to help folks build out these applications…”
    That is what I’m thinking all the time. In some areas we are so advanced with ML. And on the other side you see so much manual PC work being done in smaller businesses. Literally speaking, if you show them cut & paste they will be twice as productive.
