Adversarial Attacks Against Reinforcement Learning Agents with Ian Goodfellow & Sandy Huang

The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

In this episode, I’m joined by Ian Goodfellow, Staff Research Scientist at Google Brain, and Sandy Huang, PhD student in the EECS department at UC Berkeley, to discuss their work on the paper Adversarial Attacks on Neural Network Policies.

If you’re a regular listener here, you’ve probably heard of adversarial attacks, and have seen examples of deep-learning-based object detectors that can be fooled into thinking that, for example, a giraffe is actually a school bus, by injecting some imperceptible noise into the image. Well, Sandy and Ian’s paper sits at the intersection of adversarial attacks and reinforcement learning, another area we’ve discussed quite a bit on the podcast. In their paper, they describe how adversarial attacks can also be effective at targeting neural network policies in reinforcement learning. Sandy gives us an overview of the paper, including how changing a single pixel value can throw off the performance of a model trained to play Atari games. We also cover a lot of interesting topics relating to adversarial attacks and RL individually, and some related areas such as hierarchical reward functions and transfer learning. This was a great conversation that I’m really excited to bring to you!
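The imperceptible-noise attacks described above are typically generated with gradient-based methods such as the fast gradient sign method (FGSM), which Ian introduced in earlier work. Here’s a minimal sketch of the idea, using a hypothetical toy linear softmax “policy” standing in for a real Atari network — all names, shapes, and the epsilon value below are illustrative assumptions, not the paper’s actual code:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a policy network: a linear softmax "policy" mapping a
# flattened 16-pixel observation to a distribution over 4 actions.
# (Hypothetical toy model for illustration, not the paper's Atari policy.)
n_pixels, n_actions = 16, 4
W = rng.normal(size=(n_actions, n_pixels))

def softmax(z):
    z = z - z.max()                     # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def cross_entropy(x, a):
    """Negative log-probability the policy assigns to action a given obs x."""
    return -np.log(softmax(W @ x)[a])

def fgsm(x, a, eps):
    """Fast gradient sign method: an L-infinity-bounded perturbation in the
    direction that increases the policy's loss on its preferred action a."""
    p = softmax(W @ x)
    onehot = np.eye(n_actions)[a]
    grad_x = W.T @ (p - onehot)         # d(cross-entropy)/dx for a linear model
    return np.clip(x + eps * np.sign(grad_x), 0.0, 1.0)

x = rng.uniform(size=n_pixels)          # a "clean" observation in [0, 1]
a = int(np.argmax(softmax(W @ x)))      # the action the policy prefers
x_adv = fgsm(x, a, eps=0.05)

print(np.max(np.abs(x_adv - x)))        # perturbation never exceeds eps
print(cross_entropy(x, a) <= cross_entropy(x_adv, a))
# True: the small, bounded perturbation raises the policy's loss on its
# own preferred action, nudging it toward choosing a worse one.
```

The key point the sketch illustrates is that the perturbation is bounded (here, at most 0.05 per pixel, so visually imperceptible) yet chosen adversarially via the gradient, which is what makes these attacks so much more effective than random noise of the same magnitude.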

TWIML Online Meetup Update

I’d like to send a huge shout out to everyone who participated in the TWIML Online Meetup earlier this week. In our community segment we had a very fun and wide-ranging discussion about freezing your brain (and if you missed that startup’s announcement you probably have no idea what I’m talking about), ML and AI in the healthcare space, and more. Community member Nicholas Teague, who goes by @_NicT_ on Twitter, also briefly spoke about his essay “A Toddler Learns to Speak”, where he explores connections between different modalities in machine learning. Finally, a hearty thank you to Sean Devlin, who presented a deep dive on Deep Reinforcement Learning and Google DeepMind’s seminal paper in the space. Be on the lookout for the video recording and details on next month’s meetup at

Conference Update

You all know I travel to a ton of events each year, and event season is just getting underway for me. One of the events I’m most excited about is my very own AI Summit, the successor to the awesome Future of Data Summit event I produced last year. This year’s event takes place April 30th and May 1st, and is once again being held in Las Vegas, in conjunction with the Interop ITX conference.

This year’s event is much more AI focused, and is targeting enterprise line-of-business and IT managers and leaders who want to get smart on AI very quickly. Think of it as a two-day, no-fluff, Technical MBA in machine learning & AI. I’ll be presenting an ML & AI bootcamp, and I’ll have experts coming in to present mini workshops on computer vision, natural language processing and conversational applications, ML and AI for IoT and industrial applications, data management for AI, building an AI-first culture in your organization, and operationalizing ML and AI. For more information on the program visit

About Ian

About Sandy

Mentioned in the Interview

“More On That Later” by Lee Rosevere licensed under CC By 4.0

  • V.

It is incredible how this show decided to publicize the derivative, potentially plagiarized work of a UC Berkeley paper rather than the original ( ). As a long-time listener of this podcast, I am deeply disappointed in TWIMLAI. It is disheartening actions like these that turn so many away from doing real work in academia.

    • sam

      Hi Vahid,

      Thanks for writing in and for being a listener of the show! I really appreciate both!

I’m not an academic and don’t claim to know much about the frustrations of that world, but I can certainly understand yours, given that you published your paper in the same area a few weeks prior to the one we discussed in this show. The authors of this paper do acknowledge your paper in theirs, but I imagine that to be little consolation.

I think there’s room, and really a need, for a multiplicity of voices in this field. That’s really a core premise of the podcast. Perhaps we can chat sometime about your continued work in this area.


      • Vahid Behzadan


Almost all of your episodes are about academic papers, hence having a good understanding of academic practices will certainly come in handy in your work.

        I hold the highest level of respect for all the coauthors of this particular paper: Pieter Abbeel is one of the pioneers of deep RL, and if it weren’t for Ian Goodfellow and Nicolas Papernot, research on adversarial examples would be far behind where it is now. I’m certain that Sandy will also turn into a good researcher by the end of her studies.

But what you have done in this episode is give praise to a work that clearly does not deserve it. Huang et al. had seen our work when they published theirs. What they offer is one finding of the four we reported in our work. The only novelty in their work is confirming one of our findings in more test cases. The ethical practice in such cases is to state outright that their work is not novel and only further confirms previous findings – not to knowingly make the false claim that “our work is the first to study the ability of an adversary to interfere with the operation of an RL agent by presenting adversarial examples at test time” right after they misleadingly cite the first study on the exact same phenomenon and even more.

Now, why am I so outraged about this regular academic malpractice? The media coverage that their work received is going to translate into winning more research grants and academic credit for an institution that clearly does not need it as much as those that are lesser known. UC Berkeley and all five coauthors of this paper already have access to immense computational and technical resources. Back when I was working on this problem alone (note the difference in the number of authors on the two papers), I had access to neither and had to pay out of my own tight pocket for an Amazon EC2 GPU instance. A year later, my advisor and I are fighting for the smallest grants out there to establish the AI Safety Research Center ( ) at a lesser-known university, while having to endure better-known groups confiscating the credit for our work to enhance their profile and secure more grants. I hope you can see how this podcast contributed to a fundamental failure in the “multiplicity of voices” in this field.

I believe that this wasn’t, and won’t be, the last time that TWIMLAI makes this unintended mistake. Validating the authenticity and originality of papers is something that honest academics do on a daily basis. If you’d like, I am more than willing to spare a couple of hours each week to help your promising program with this, and to make sure that it does more good than harm in AI research.


