Neuroevolution: Evolving Novel Neural Network Architectures with Kenneth Stanley

    This Week in Machine Learning & AI

    Today, I’m joined by Kenneth Stanley, Professor in the Department of Computer Science at the University of Central Florida and senior research scientist at Uber AI Labs.

    Kenneth studied under TWiML Talk #47 guest Risto Miikkulainen at UT Austin, and joined Uber AI Labs after Geometric Intelligence, the company he co-founded with Gary Marcus and others, was acquired in late 2016. Kenneth’s research focus is what he calls Neuroevolution, which applies the idea of genetic algorithms to the challenge of evolving neural network architectures. In this conversation, we discuss the Neuroevolution of Augmenting Topologies (or NEAT) paper that Kenneth authored along with Risto, which won the 2017 International Society for Artificial Life’s Award for Outstanding Paper of the Decade 2002 – 2012. We also cover some of the extensions to that approach he’s created since, including HyperNEAT, which can efficiently evolve very large networks with connectivity patterns that look more like those of the human brain and that are generally much larger than what prior approaches to neural learning could produce, and novelty search, an approach which, unlike most evolutionary algorithms, has no defined objective, but rather simply searches for novel behaviors. We also cover concepts like “Complexification” and “Deception”, biology vs computation including differences and similarities, and some of his other work including his book, and NERO, a video game complete with Real-time Neuroevolution. This is a meaty “Nerd Alert” interview that I think you’ll really enjoy.
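
    For readers who want a concrete picture of novelty search, here is a minimal sketch of how a novelty score is commonly computed: the mean distance from an individual's behavior to its k nearest neighbors among previously seen behaviors. This is an illustration only, not code from the interview or from Kenneth's papers; the function name `novelty_score`, the 2-D behavior descriptors, and the value of `k` are all assumptions made for the example.

```python
import math


def novelty_score(behavior, others, k=3):
    """Mean Euclidean distance from `behavior` to its k nearest neighbors in `others`."""
    if not others:
        # Nothing to compare against: treat as maximally novel (an assumption).
        return float("inf")
    distances = sorted(math.dist(behavior, other) for other in others)
    nearest = distances[:k]
    return sum(nearest) / len(nearest)


# Hypothetical behavior descriptors, e.g. the final (x, y) position of a maze robot.
archive = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]

crowded = novelty_score((0.5, 0.5), archive, k=2)   # sits among known behaviors
isolated = novelty_score((5.0, 5.0), archive, k=2)  # far from anything seen so far

# Selection would favor the individual with the higher score, whatever its
# performance on a task objective.
print(crowded < isolated)
```

    In a full algorithm this score would stand in for the fitness function, and sufficiently novel behaviors would be added to the archive over time.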

    Giveaway Update!

    Thanks to everyone who took the time to enter our #TWiML1MIL listener giveaway! We sent out an email to entrants a few days ago, so please be on the lookout for that. If you haven’t heard from us yet, please reach out to us so that we can get you your swag!

    TWiML Online Meetup

    The details for our January Meetup are set! On Tuesday, January 16, we will be joined by former TWiML guest and Microsoft Researcher Timnit Gebru. Timnit joined us a few weeks ago to discuss her recently released and much acclaimed paper, “Using deep learning and Google Street View to estimate the demographic makeup of neighborhoods across the United States”, and I’m excited that she’ll be joining us to discuss the paper, and the pipeline she used to identify 22 million cars in 50 million Google Street View images, in more detail. I’m also anticipating a lively discussion segment, in which we’ll be exploring your AI resolutions & predictions for 2018. For links to the paper, or to register for the meetup, or to check out previous meetups, visit

    About Kenneth

    Mentioned in the Interview

    • Joe Perez

      This is one of the best interviews delivered by TWiML & AI ever. It ranks close to Francisco Webber and Matt Taylor. It once more makes it clear how foolish it is to try to navigate towards strong AI without using the map which nature (biology and the neurosciences) provides us with. It may be true that many different paths lead to Rome, but why do the majority insist on getting there blindfolded, when nature delivers us a beautiful map with all the turns, mountains and valleys on the way there? Please do not misunderstand my critical remarks. I admire and appreciate all the mathematical and statistical geniuses out there who are so wise that they can even make a flawed neurological model work better than most natural ones at certain narrow AI tasks. But firing the “Linguist” is not the path toward AGI. We need more neuroscientists, biologists, linguists and domain specialists to hold the torch for the “mainstream AI” specialists.

      I would also love to listen to an interview with Jeff Hawkins (Numenta), the author of “On Intelligence”. His insights have been quoted and referred to in other interviews, but have not yet been given enough attention. This applies especially to how the changes to the neuroscientific paradigm shed light on many of the shortcomings of other, currently more popular, narrow approaches, which would benefit from taking this newer paradigm into account (i.e. the three states of a neuron, including the predictive state, instead of only two states in present models; the cortical column as the unit of representation, instead of individual neurons; the additive, distal and temporal effects on activation, instead of the sigmoid activation function of single synapses; and the concept of sparse distributed representation, SDR, as a foundation for classification and inference efficiency, instead of the dense knowledge representation models still in use). Is it not arrogant of us to ignore millions of years of evolution with the pretense that we can outwit nature? Even if we could, we would still owe this to the very nature we are denying our attention to. Thank you for this great interview with Dr. Kenneth Stanley. And thank you to him, for sharing his wonderful insights with the world.

      • sam

        Glad to hear you enjoyed it, Joe!

        I agree with your comments regarding the need for multiple disciplines to contribute to making progress in AI. Yael Niv spoke to this point as well in our recent interview from NIPS. I find the work that Numenta is doing to create a more biologically “true” neurological model both intriguing and important, but at the same time I’ve asked a bunch of times and haven’t gotten a very clear answer to the question “what is the killer app?” for these models; in other words, where does it outperform all other approaches, for some arbitrary definition of outperform that can include both hard metrics like accuracy as well as softer ones like ease-of-use. That said, getting Jeff on the show is a great idea and one we’ll work on!

        Thanks for your comment!

    • Joe

      Dear Sam, Thank you for your response and remarks. It is such an honor to be able to correspond with you. I work in the IT area at Volkswagen in Germany, at the headquarters in Wolfsburg. I listen to your podcast on most of my commutes and feel highly enriched in my knowledge about the field of AI, especially at the leading edge, which you so uniquely cover. And all this while enjoying my driving, which is still not completely autonomous, but best assisted by your content. 😄 I also highly recommend your podcasts to my colleagues. I have posted the links and some comments in my internal blog. Some of them are now listening also.

      I also want to mention that I highly appreciate that you include all disciplines and approaches towards this very young field of research. I am, though, a little critical of those who tend to ignore the interdisciplinary opportunities that lie at our fingertips if we are open-minded and not afraid of learning from disciplines other than our own. In general, all the researchers in this field, especially in your interviews, are very open to new insights and certainly not afraid of the new. It is also not bad for some to maximize their exploitation of certain techniques, even though they could profit from other hybrid approaches. By doing so, they explore the realm of what is achievable in each particular approach. I would just like to remind them, once in a while, that our brain is pulling off wonderful performance with only around 10 watts of energy, if I am not mistaken. So there must still be a bit to learn from nature and 6 million years of evolution.

      Regarding Numenta and Jeff Hawkins's achievements, I understand your concern with pinpointing the areas of advantage. As a member of their open source community, mostly following their research and internal discussion forums and reading everything they publish, I can assure you that my instinct is highly biased toward their basic premises and paradigm. But the reasons are less tangible in terms of Key Performance Indicators or the like. It is precisely the holistic embrace of a multitude of factors that gives me this certainty. For one, everything that Francisco Webber is successfully exploiting is based on the principle of SDRs (Sparse Distributed Representations), which he calls Semantic Footprints. The classification efficiency which Francisco claims is far superior to other approaches, especially in terms of energy usage, is one very direct result derived from Jeff Hawkins's paradigm. The HTM concept (Hierarchical Temporal Memory) at the center of Numenta also describes the same bottom-up and top-down bi-directional predictive influence in the layers described by others in your recent podcasts (I think also from NIPS). This concept was published by Jeff back in 2004, if I am not mistaken. It is the self-organizing nature, universal applicability and high plasticity of this biological model that gives it its special value, especially if your goal is AGI in the long term.

      But it must be said that Numenta strives to verify the accuracy and plausibility of the neuro-biological models as their primary goal. Their secondary goal is to be a catalyst for the insights, discoveries and breakthroughs they achieve along the way. Their commercial business model is very noble from this point of view. They bring fundamental research in computational neuroscience (a field they coined) to the table of all AI, IT and other disciplines for best use as open source. Due to the complexity of the task of modelling the neo-cortex, they are focussing strongly on motor-sensory paradigms grounded on research coming from the wet-labs with real brain-matter. This is an invaluable contribution to the world. I also recommend looking into them to all my colleagues, but this is a more difficult argument for those of us driven by shorter-term goals.

      Thank you for your work, and I wish you good luck with an interview with Jeff Hawkins. Your series will always have a very important gap until you get him on the podcast. Your loyal follower, Joe Perez

      • sam

        Thanks, Joe! I appreciate your reply. I hope I didn’t word my “performance concerns” too strongly. I am actually a bit torn on this point. On the one hand, of course, it would be great to see tangible results to know we’re on the right path and to truly judge the promise of their approach. To the extent that Francisco and Cortical are able to achieve that, it is certainly a strong proof point. (I need to dig further into their latest results.) At the same time, sometimes technologies and approaches take time to mature and we need to judge their promise more intuitively. Put another way, our current approach to deep learning (for example) could be (likely is) a local optimum in machine learning. I’d hope we’re willing to invest in alternatives that will allow us to improve upon it at some future time. Also, this reminds me of a comment I saw in a recent Reddit AMA with Hinton’s team on Capsule Networks. Someone asked (paraphrasing) why they benchmarked performance on MNIST and why they published without SOTA results on this task, and Nicholas Frosst responded that Hinton has so many bad ideas that benchmarking and achieving decent results on MNIST is a filter that lets him know if something is even plausible/worth pursuing. Seen from this perspective, HTM has certainly achieved this level of plausibility and will hopefully lead us someplace very interesting.

