Single Headed Attention RNN: Stop Thinking With Your Head with Stephen Merity

The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Today we’re joined by Stephen Merity, a startup founder and independent researcher with a focus on NLP and deep learning.

Late last month, Stephen released his latest paper, Single Headed Attention RNN: Stop Thinking With Your Head, which we break down extensively in this conversation. Stephen details his primary motivations for writing the paper: NLP research has recently been dominated by transformer models, which are neither accessible nor practical to train for much of the community. We discuss the architecture of transformer models, how he arrived at the single headed attention RNN (SHA-RNN) for his research, how he built and trained the model, his approach to benchmarking, and finally his goals for the work in the broader research community.
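As a rough illustration of the idea at the heart of the paper, here is a minimal, hypothetical NumPy sketch of a single attention head, where the RNN's current hidden state queries a memory of its past hidden states. This is illustrative only and not Stephen's implementation; the actual model lives in the GitHub repo linked below.

    # Illustrative sketch only -- NOT the paper's actual code (see the linked repo).
    # Core idea of single headed attention: the RNN's current hidden state queries
    # a memory of previous hidden states via one scaled dot-product attention head.
    import numpy as np

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    def single_head_attention(query, memory):
        """query: (d,) current hidden state; memory: (t, d) past hidden states."""
        scores = memory @ query / np.sqrt(query.shape[0])  # (t,) scaled dot products
        weights = softmax(scores)                          # attention distribution
        return weights @ memory                            # (d,) context vector

    # Toy usage: attend over 5 past states of dimension 4.
    rng = np.random.default_rng(0)
    memory = rng.standard_normal((5, 4))
    context = single_head_attention(memory[-1], memory)
    print(context.shape)  # (4,)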

Mentioned in the Interview

  • Paper: Single Headed Attention RNN: Stop Thinking With Your Head
  • Single Headed Attention RNN – “Stop thinking with your head” GitHub repo
  • Common Crawl
  • OpenAI Unsupervised Sentiment Neuron
  • Dissecting the Controversy around OpenAI’s New Language Model
  • Environmental Impact of Large-Scale NLP Model Training with Emma Strubell
  • LSTMs, plus a Deep Learning History Lesson with Jürgen Schmidhuber
  • Hutter Prize for Compressing Human Knowledge

    “More On That Later” by Lee Rosevere licensed under CC By 4.0
