There are few things I love more than cuddling up with an exciting new book. There are always more things I want to learn than time I have in the day, and I think books are such a fun, long-form way of engaging (one where I won't be tempted to check Twitter partway through). This book roundup is a selection from the last few years of TWIML guests, counting only the ones related to ML/AI published in the past 10 years. We hope that some of their insights are useful to you! If you liked their book or want to hear more about them before taking the leap into longform writing, check out the accompanying podcast episode (linked on the guest's name). (Note: These links are affiliate links, which means that ordering through them helps support our show!)

Adversarial ML
- Generative Adversarial Learning: Architectures and Applications (2022), Jürgen Schmidhuber

AI Ethics
- Sex, Race, and Robots: How to Be Human in the Age of AI (2019), Ayanna Howard
- Ethics and Data Science (2018), Hilary Mason

AI Sci-Fi
- AI 2041: Ten Visions for Our Future (2021), Kai-Fu Lee

AI Analysis
- AI Superpowers: China, Silicon Valley, And The New World Order (2018), Kai-Fu Lee
- Rebooting AI: Building Artificial Intelligence We Can Trust (2019), Gary Marcus
- Artificial Unintelligence: How Computers Misunderstand the World (The MIT Press) (2019), Meredith Broussard
- Complexity: A Guided Tour (2011), Melanie Mitchell
- Artificial Intelligence: A Guide for Thinking Humans (2019), Melanie Mitchell

Career Insights
- My Journey into AI (2018), Kai-Fu Lee
- Build a Career in Data Science (2020), Jacqueline Nolis

Computational Neuroscience
- The Computational Brain (2016), Terrence Sejnowski

Computer Vision
- Large-Scale Visual Geo-Localization (Advances in Computer Vision and Pattern Recognition) (2016), Amir Zamir
- Image Understanding using Sparse Representations (2014), Pavan Turaga
- Visual Attributes (Advances in Computer Vision and Pattern Recognition) (2017), Devi Parikh
- Crowdsourcing in Computer Vision (Foundations and Trends® in Computer Graphics and Vision) (2016), Adriana Kovashka
- Riemannian Computing in Computer Vision (2015), Pavan Turaga

Databases
- Machine Knowledge: Creation and Curation of Comprehensive Knowledge Bases (2021), Xin Luna Dong
- Big Data Integration (Synthesis Lectures on Data Management) (2015), Xin Luna Dong

Deep Learning
- The Deep Learning Revolution (2016), Terrence Sejnowski
- Dive into Deep Learning (2021), Zachary Lipton

Introduction to Machine Learning
- A Course in Machine Learning (2020), Hal Daumé III
- Approaching (Almost) Any Machine Learning Problem (2020), Abhishek Thakur
- Building Machine Learning Powered Applications: Going from Idea to Product (2020), Emmanuel Ameisen

ML Organization
- Data Driven (2015), Hilary Mason
- The AI Organization: Learn from Real Companies and Microsoft's Journey How to Redefine Your Organization with AI (2019), David Carmona

MLOps
- Effective Data Science Infrastructure: How to make data scientists productive (2022), Ville Tuulos

Model Specifics
- An Introduction to Variational Autoencoders (Foundations and Trends® in Machine Learning) (2019), Max Welling

NLP
- Linguistic Fundamentals for Natural Language Processing II: 100 Essentials from Semantics and Pragmatics (2013), Emily M. Bender

Robotics
- What to Expect When You're Expecting Robots (2021), Julie Shah
- The New Breed: What Our History with Animals Reveals about Our Future with Robots (2021), Kate Darling

Software How To
- Kernel-based Approximation Methods Using Matlab (2015), Michael McCourt
Deep reinforcement learning (DRL) is an exciting area of AI research, with potential applicability to a variety of problem areas. We've discussed DRL several times on the podcast to date, and just this week took a deep dive into it during the TWIML Online Meetup. (Shout out to everyone who attended!) Our presenter, Sean Devlin, did a great job explaining the major ideas underlying DRL. If this week's newsletter inspires you to dig more deeply into how RL works, the meetup recording, which will be posted shortly, would be a good place to start.

First, a quick refresher on the basic idea behind reinforcement learning. Unlike supervised machine learning, which trains models based on known-correct answers, in reinforcement learning the model is trained by having an agent interact with an environment. When the agent's actions produce desired results, such as scoring a point or winning the game, the agent gets rewarded. Put simply, the agent's good behaviors are reinforced.

(Image credit: Sean Devlin, for the March 2018 TWIML Online Meetup)

One of the key challenges in applying DRL to non-trivial problems is constructing a reward function that encourages desired behaviors without undesirable side effects. Another important factor is the tradeoff between taking advantage of what the agent has already learned (exploitation) and investigating new behaviors or new parts of the environment (exploration).
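To make the reward and exploration-exploitation ideas concrete, here's a minimal tabular Q-learning sketch with an epsilon-greedy policy. This is my own simplified illustration, not anything from the meetup; the toy corridor environment and all hyperparameters are assumptions chosen for readability:

```python
import random
from collections import defaultdict

class Corridor:
    """Toy environment: the agent starts at position 0 and earns reward 1 for reaching position 3."""
    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):  # action: -1 (left) or +1 (right)
        self.pos = max(0, self.pos + action)
        done = self.pos == 3
        return self.pos, (1.0 if done else 0.0), done

def epsilon_greedy(q, state, actions, epsilon):
    # Explore with probability epsilon; otherwise exploit the best-known action.
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])

def q_learning(env, actions, episodes=200, alpha=0.1, gamma=0.99, epsilon=0.2):
    q = defaultdict(float)  # tabular Q(s, a), defaults to 0
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            action = epsilon_greedy(q, state, actions, epsilon)
            next_state, reward, done = env.step(action)
            # Reinforce: nudge Q(s, a) toward reward + discounted best future value.
            best_next = max(q[(next_state, a)] for a in actions)
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q

q = q_learning(Corridor(), actions=(-1, 1))
print(sorted(q.items()))  # learned values favor moving right toward the reward
```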
It might be worth noting here that while deep reinforcement learning ("deep" referring to the fact that the underlying model is a deep neural network) is still a relatively new field, reinforcement learning itself has been around since the 70s, or earlier, depending on how you count. As Andrej Karpathy points out in his 2016 blog post, pivotal DRL research such as the AlphaGo paper and the Atari Deep Q-Learning paper is based on reinforcement learning algorithms that have been around for a while, with deep learning swapped in for other ways of approximating functions. Their use of deep learning is of course enabled by the explosion in inexpensive compute power we've seen over the past 20+ years.

Some people view DRL as a path to artificial general intelligence, or AGI, because of how it mirrors human learning: exploring and receiving feedback from environments. Recent successes of DRL agents in besting human players at video games, the well-publicized defeat of a Go grandmaster at the hands of DeepMind's AlphaGo, and demonstrations of bipedal agents learning to walk in simulation have all contributed to the enthusiasm about the field.

The promise of DRL, along with Google's 2014 acquisition of DeepMind for $500 million, has led to the formation of a number of startups hoping to capitalize on this technology. I've previously interviewed Mark Hammond, a founder of Bonsai, which offers a development platform for applying deep reinforcement learning to a variety of industrial use cases, and Pieter Abbeel, a founder of Embodied Intelligence, a still-stealthy startup looking to apply VR and DRL to robotics. Osaro, backed by Jerry Yang, Peter Thiel, Sean Parker, and other boldface-named investors, is also looking to apply DRL in this space. Meanwhile, Pit.ai is seeking to best traditional hedge funds by applying DRL to algorithmic trading, and DeepVu is applying DRL to the challenge of managing complex enterprise supply chains.

As a result of increased interest in DRL, we've also seen the creation of new open-source toolkits and environments for training DRL agents. Most of these frameworks are essentially special-purpose simulation tools, or interfaces to them. Here are some of the new open-source toolkits and environments I'm tracking:

OpenAI Gym. A popular toolkit for developing and comparing reinforcement learning models. Its simulator interface supports a variety of environments, including classic Atari games as well as robotics and physics simulators like the DARPA-funded Gazebo and MuJoCo. Like other DRL toolkits, it offers APIs to feed observations and rewards back to agents (see the interaction-loop sketch at the end of this piece).

DeepMind Lab. A 3D learning environment based on the Quake III first-person shooter video game, offering up navigation and puzzle-solving tasks for learning agents. DeepMind recently added DMLab-30, a collection of new levels, and introduced its new IMPALA distributed agent training architecture.

Psychlab. Another DeepMind toolkit, open-sourced earlier this year, Psychlab extends DeepMind Lab to support cognitive psychology experiments like searching an array of items for a specific target or detecting changes in an array of items. Human and AI agent performance on these tasks can then be compared.

House3D. A collaboration between Berkeley and Facebook AI researchers, House3D offers over 45,000 simulated indoor scenes with realistic room and furniture layouts. The primary task covered in the paper that introduced House3D was "concept-driven navigation," e.g. training an agent to navigate to a room in a house given only a high-level descriptor like "dining room."

Unity Machine Learning Agents. Under the stewardship of VP of AI and ML Danny Lange, game engine developer Unity has been making an effort to incorporate cutting-edge AI technology into its platform. Unity Machine Learning Agents, released last fall, is an open-source Unity plugin that enables games and simulations running on the platform to serve as environments for training intelligent agents.

Ray. While the other tools listed here focus on DRL training environments, Ray is more about the infrastructure of DRL at scale. Developed by Ion Stoica and his team at the Berkeley RISELab, Ray is a framework for efficiently running Python code on clusters and large multi-core machines, specifically targeted at providing a low-latency distributed execution framework for reinforcement learning.

The advent of all these tools and platforms will make DRL more accessible to developers and researchers. They'll need all the help they can get, though, because deep reinforcement learning can be challenging to put into practice. A recent critique by Google engineer Alex Irpan, in his provocatively titled article "Deep Reinforcement Learning Doesn't Work Yet," explains why. Alex cites the large amount of data required by DRL, the fact that most approaches to DRL don't take advantage of prior knowledge about the systems and environments involved, and the aforementioned difficulty of coming up with an effective reward function, among other issues.

I expect deep reinforcement learning to continue to be a hot topic in the AI field, both from the research and applied perspectives, for some time. It has shown great promise at handling complex, multifaceted, sequential decision-making problems, which makes it useful not just for industrial systems and gaming, but for fields as varied as marketing, advertising, finance, education, and even data science itself.

Are you working on deep reinforcement learning? If so, I'd love to hear how you're applying it.
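As promised above, here's the canonical OpenAI Gym interaction loop, to give a concrete sense of the observation-and-reward API these toolkits expose. The random policy is just a placeholder, and this uses the Gym API as it exists today (a real agent would choose actions from the observations):

```python
import gym

env = gym.make("CartPole-v0")

for episode in range(5):
    observation = env.reset()
    total_reward, done = 0.0, False
    while not done:
        action = env.action_space.sample()  # placeholder: sample a random action
        # The environment feeds the next observation and reward back to the agent.
        observation, reward, done, info = env.step(action)
        total_reward += reward
    print("episode %d: total reward %.0f" % (episode, total_reward))
```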
Today, I'm joined by Kenneth Stanley, professor in the Department of Computer Science at the University of Central Florida and senior research scientist at Uber AI Labs. Kenneth studied under TWIML Talk #47 guest Risto Miikkulainen at UT Austin, and joined Uber AI Labs after Geometric Intelligence, the company he co-founded with Gary Marcus and others, was acquired in late 2016.

Kenneth's research focus is what he calls neuroevolution, which applies the idea of genetic algorithms to the challenge of evolving neural network architectures. In this conversation, we discuss the Neuroevolution of Augmenting Topologies (or NEAT) paper that Kenneth authored along with Risto, which won the International Society for Artificial Life's 2017 Award for Outstanding Paper of the Decade (2002-2012). We also cover some of the extensions to that approach he's created since, including HyperNEAT, which can efficiently evolve very large networks with connectivity patterns that look more like those of the human brain, and which are generally much larger than what prior approaches to neural learning could produce, and novelty search, an approach which, unlike most evolutionary algorithms, has no defined objective, but rather simply searches for novel behaviors.

We also cover concepts like "complexification" and "deception," biology vs. computation, including differences and similarities, and some of his other work, including his book and NERO, a video game complete with real-time neuroevolution. This is a meaty "Nerd Alert" interview that I think you'll really enjoy.
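For listeners who want a feel for the core loop behind these methods, here's a toy neuroevolution sketch: a population of fixed-topology networks evolved purely by selection and weight mutation. To be clear, this is my own simplified illustration, not Kenneth's code, and NEAT goes much further, evolving the network topology itself (complexification) alongside the weights:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression task (illustrative only): fit y = sin(sum of inputs).
X = rng.normal(size=(64, 3))
y = np.sin(X.sum(axis=1))

def unpack(genome):
    # A genome is just a flat vector of weights for a fixed 3-8-1 network.
    W1 = genome[:24].reshape(3, 8)
    W2 = genome[24:].reshape(8, 1)
    return W1, W2

def fitness(genome):
    W1, W2 = unpack(genome)
    preds = np.tanh(X @ W1) @ W2
    return -np.mean((preds.ravel() - y) ** 2)  # higher fitness = lower error

population = [rng.normal(size=32) for _ in range(100)]
for generation in range(200):
    population.sort(key=fitness, reverse=True)
    parents = population[:20]                            # selection: keep the fittest
    population = [p + rng.normal(scale=0.05, size=32)    # mutation: perturb weights
                  for p in parents for _ in range(5)]

best = max(population, key=fitness)
print("best fitness:", fitness(best))
```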
On the heels of last week's $200 million acquisition by Apple of Turi, Intel announced on Tuesday yet another acquisition in the machine learning and AI space, this time the $400 million purchase of deep learning cloud startup Nervana Systems. This is another exciting acquisition; let's take a minute to unpack it.

First of all, for those not familiar with the company, Nervana, spelled N-E-R-vana, is a two-year-old company developing software, hardware, and cloud services for deep learning. The company was originally founded to build hardware for speeding up deep learning, and it's this focus that made it so attractive to Intel. The company's first hardware product, due next year, is a custom deep learning chip called the Nervana Engine. The ASIC is similar in focus to the Google Tensor Processing Unit, or TPU, which we highlighted in the very first episode of This Week in Machine Learning & AI back in May.

The company has also released a software product called Neon, and operates the Nervana Cloud. Neon is an open source deep learning framework like TensorFlow, Caffe, or Theano. Relative to those others, which you hear about here on the show pretty much every week, Neon is known for being particularly fast, especially on NVIDIA GPUs, thanks to some clever optimization work the team did with the GPU firmware. Neon doesn't have quite the popularity of some of these other frameworks, in part because it was initially a proprietary product, only recently open sourced back in May. The company's cloud offering is tuned for running deep learning, and will eventually incorporate the company's own chips.

This is a great deal for the company's founders and investors. With $24.4 million in funding to date, and a price reported to be as high as $408 million, Nervana returned nearly 17x to investors, which is home run territory for most VCs. At the same time, if you'll allow me to Monday-morning quarterback, I'm a little surprised that they decided to sell so early in the game. The company is extremely well positioned in two hot spaces, deep learning and cloud, and the team has only been at it for a couple of years. Projecting out a couple of years, it's easy to see Nervana with a billion dollar valuation, assuming they continued to execute. This makes me wonder what the team saw in the market that said now was the time to sell.

Of course, it's certainly the case that Intel brings a lot more to the table here than cash. The company obviously has vast resources and expertise in the chip-making arena, and could certainly help accelerate Nervana's plans. It's also the case, though, that Nervana faces stiff and growing competition. Google, for example, offers everything Nervana does. Google's TensorFlow, released about 8 months ago, is by most measures the most popular deep learning framework. (You'll recall we discussed Francois Chollet's analysis of the landscape back on the July 15 show.) Google also sees TensorFlow as becoming an on-ramp to the Google Cloud Platform. And GCP has TPUs, which I just mentioned and which the company announced back in May. So perhaps the Nervana team and investors looked at the long slog ahead and decided to take the money off the table. I do wonder whether the loss of startup-equity upside will make hiring top talent more difficult for the company.

So that's the Nervana side of things; what about Intel's side? Well, while this is a pretty small acquisition for Intel, I think it's a smart move on their part.
That's because, despite numerous investments in the space, as recently as its investment in Nervana competitor CognitiveScale last week, Intel has been struggling to tell a story around machine and deep learning. The problem it faces is that NVIDIA is eating its lunch when it comes to chips for deep learning applications. In fact, NVIDIA also made news this week when it announced record revenues and a more aggressive sales outlook. The reason for the improved outlook? Quoting CEO Jen-Hsun Huang: "One particular dynamic sticks out, and it's a very significant growth driver of where we have an extraordinary position in and it's deep learning," Huang told analysts in a conference call that lasted almost 80 minutes. "The last five years, we've quietly invested in deep learning because we believe that the future of deep learning is so impactful to the entire software industry, the entire computer industry that we, if you will, pushed it all in."

NVIDIA's lead in deep learning has been a sore spot for Intel of late, to the point that several articles commented on interviews with company data center chief Diane Bryant in which she became ruffled at the mention of Intel's lack of presence in the machine learning market. Now, Intel and Diane are quick to shrug this off, since machine learning is a relatively nascent market. According to the MIT Technology Review, market research firm Tractica pegs the market for AI-related chips at under $1 billion today, growing to $2.4 billion in 2024, a small figure compared to Intel's 2015 revenue of $56 billion. But Intel missed the boat on mobile, PC chip sales are declining, and there's weakness in data center and IoT revenue growth as well. So while machine learning and AI are an emerging market just at the beginning of the growth cycle, Intel can't afford to sit this one out. This deal gives Intel a much-needed story around deep learning and, if the companies are able to execute, a foot in the door of this nascent market.

Moving forward, this poses some of the same challenges I mentioned in the context of Apple/Turi, namely executive focus, but I also think this plays to several of Intel's strengths. In particular, while I've seen the company struggle trying to independently build and sell enterprise software, it does a good job of building and selling through reference architectures. If Nervana ultimately becomes a reference for how to build out a deep learning cloud using new and traditional Intel hardware combined with open source software, this could drive significant future adoption and begin to turn the tide. There are also a good number of possible tie-ins to take advantage of here. One is with Intel's open source project, the Trusted Analytics Platform. Also, Intel has a significant stake in big data company Cloudera and cloud builder Mirantis. This is getting a bit ahead of ourselves, sure, but there could be some pretty interesting collaborations between these projects and companies over time.
Autonomous driving startup Comma.ai released a small dataset that lets you try your hand at building your own models for controlling a self-driving vehicle. The dataset consists of 10 video clips recorded at 20 Hz from a camera mounted on the windshield of a 2016 Acura ILX. There are about 7 hours of video in total, captured mostly during highway driving. Alongside the video files are a set of sensor logs in which measurements such as velocity, acceleration, steering angle, GPS location, and gyroscope angles are recorded. The dataset is a 45 GB compressed zip file that explodes to 80 GB when uncompressed. That is, if you can get it to uncompress: when I tried it, after a fairly long download, unzip complained that the file was corrupt.

The project's GitHub repo includes a script to download the data from archive.org, as well as some simple models built in Keras and TensorFlow for predicting steering angle and creating simulated road images with generative models. They've also included a paper on the latter topic. The idea is that since it's pretty expensive to train a self-driving car on real roads, you typically want to train your algorithms in a simulator, and to get a simulator you can either hand-code one or learn one with generative models. The paper describes the use of variational autoencoders, generative adversarial networks, and an RNN to create simulated road images.

You can start by running their existing models, but if you manage to do amazing things with the data, let Comma know—they're hiring and want to meet you.
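If you want a rough starting point beyond Comma's own models, here's a minimal Keras sketch of a convolutional network that regresses steering angle from a single frame. This is my own illustrative sketch, not Comma's code; the frame size, architecture, and hyperparameters are all assumptions:

```python
from tensorflow.keras import layers, models

# Assumed shapes: 160x320 RGB frames, steering angle as a single float label.
def build_model(input_shape=(160, 320, 3)):
    model = models.Sequential([
        layers.Lambda(lambda x: x / 127.5 - 1.0, input_shape=input_shape),  # normalize pixels to [-1, 1]
        layers.Conv2D(16, (8, 8), strides=(4, 4), activation="relu"),
        layers.Conv2D(32, (5, 5), strides=(2, 2), activation="relu"),
        layers.Conv2D(64, (5, 5), strides=(2, 2), activation="relu"),
        layers.Flatten(),
        layers.Dense(512, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(1),  # predicted steering angle (regression, so no activation)
    ])
    model.compile(optimizer="adam", loss="mse")
    return model

model = build_model()
# frames: (N, 160, 320, 3) array decoded from the videos; angles: (N,) floats from the sensor logs.
# model.fit(frames, angles, batch_size=64, epochs=5, validation_split=0.1)
```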
This week we discuss Intel's latest deep learning acquisition, AI in the Olympics, image completion with deep learning in TensorFlow, and how you can win a free ticket to the O'Reilly AI Conference in New York City, plus a bunch more. Here are the notes for this week's podcast:

O'Reilly AI Conference Giveaway
I'm excited to be partnered with the O'Reilly Artificial Intelligence Conference to give away a free ticket to the event, which will be held September 26-27, 2016 in New York City. There are three ways to enter the giveaway:
1. (Preferred) Follow @twimlai on Twitter and retweet this tweet: "Win a FREE ticket to the @OReillyAI Conference. To enter, follow @twimlai + RT. https://t.co/ReYqwqp538 for details." — TWIML (@twimlai) August 15, 2016
2. Sign up for the TWIML&AI Newsletter and add a note "please enter me" in the comments field.
3. Use this site's contact form to send me a message with "AI contest" as the subject.
A winner will be chosen at random and announced on the 9/2 podcast. The ticket is non-transferable. Good luck, and hope to see you in New York! If you'd like to buy a ticket, register using the code PCTWIML for 20% off! And don't forget to get your free early access ebook: Mastering Feature Engineering

Intel Buys Deep Learning Startup Nervana
- Intel Buys a Startup to Catch Up in Deep Learning
- Deep Learning Chip Upstart Takes GPUs to Task
- Nvidia's bet on deep learning and autonomous cars drives stock to record highs – MarketWatch

AI Bot Joins Team Washington Post at the Rio Olympics
- The Washington Post experiments with automated storytelling to help power 2016 Rio Olympics coverage – The Washington Post

Technology
- Fujitsu Software to Accelerate Deep Learning Workloads
- DetectNet: Deep Neural Network for Object Detection in DIGITS | Parallel Forall
- Google Research Blog: Meet Parsey's Cousins: Syntax for 40 languages, plus new SyntaxNet capabilities

Image Completion with Deep Learning
- Image Completion with Deep Learning in TensorFlow
- bamos/dcgan-completion.tensorflow: Image Completion with Deep Learning in TensorFlow
- [1607.07539] Semantic Image Inpainting with Perceptual and Contextual Losses
- [1511.06434] Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks
We've talked fairly extensively about the use of deep learning in medicine in previous shows. Breast cancer and eye disease were a couple of the use cases we discussed, both of which share the common feature that they're based on image analysis. Well, this week a team of researchers from Princeton University published a paper outlining their work applying machine learning to the challenge of identifying genetic causes of autism.

The genetic causes of autism, or autism spectrum disorder, have been difficult for researchers to track down. The autism research community has identified 65 genes associated with autism risk so far, mostly through sequencing, but it's believed that those are but a fraction of the 400-1,000 genes likely to be involved in the disease. To try to identify the additional genetic actors in autism susceptibility, the Princeton team used what they call a brain-specific functional interaction network, which was developed in previous research. This brain-specific network is a functional map of the brain, expressed as a probabilistic graph of how genes function together in pathways in the brain. They then used machine learning to train a classifier based on the connectivity patterns of the known ASD genes in the brain-specific network, and used this classifier to predict the level of potential ASD association for every gene in the genome. Specifically, they used an SVM classifier, with the connectivity of the known ASD genes to the other genes in the brain-specific network as its features. I'm somewhat trivializing the ideas around the brain-specific network and how it translates into features, mostly because I don't really understand it. But this is a great example and reminder that most of the magic in ML is in the feature engineering. Based on their method, the team was able to identify a number of candidate genes with no prior genetic evidence of ASD association, and has since gone on to validate many of these candidate genes through sequencing. Their results can thus be used as the basis for further analysis into the genetic causes of autism. Super interesting stuff. Check it out if you've got a background or interest in the medical applications of ML.

A couple of other interesting research papers caught my eye this week. Researchers from security research firm ZeroFOX published a paper, "Weaponizing data science for social engineering: Automated E2E spear phishing on Twitter." Spear phishing, if you haven't heard the term, is like phishing, but targeted at a particular user. You're typically trying to get a user to click a link that will trick them into giving up some credentials. What the ZeroFOX team did was create a tool called SNAP_R that first rates a list of Twitter users based on their likely susceptibility to a spear phishing attack, and then uses a neural network to produce effective spear phishing tweets. If you heard that and immediately thought, "oh, it's probably an LSTM RNN," then woo hoo, you're catching on! At least that's how I felt when I read that that's exactly what they did.

I love this next paper. It's basically a Twitter sarcasm detector created by researchers at the University of Lisbon in Portugal and UT Austin. It works based on embeddings, a type of word vector that comes up all the time and that I'd like to learn more about. These embeddings are fed into a CNN model and trained on tweets that are self-identified as sarcastic by their use of the #sarcasm hashtag.
The researchers use embeddings in a unique way in this paper, coupling them to the different social media users, and as a result are able to outperform another recently published state-of-the-art model for sarcasm detection by over 2%.
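To give a rough sense of the shape of such a model, here's a minimal Keras sketch of a CNN over word embeddings for binary sarcasm classification. The vocabulary size, sequence length, and dimensions are my own assumptions, and I've omitted the paper's user-embedding component entirely:

```python
from tensorflow.keras import layers, models

VOCAB_SIZE, MAX_LEN, EMBED_DIM = 20000, 40, 100  # assumed values, not from the paper

model = models.Sequential([
    # Map each token id to a dense vector; in the paper these would be pretrained word embeddings.
    layers.Embedding(VOCAB_SIZE, EMBED_DIM, input_length=MAX_LEN),
    layers.Conv1D(128, 3, activation="relu"),   # n-gram-like filters sliding over the tweet
    layers.GlobalMaxPooling1D(),                # keep the strongest filter response
    layers.Dense(64, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(1, activation="sigmoid"),      # P(sarcastic); labels come from #sarcasm self-tags
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
# model.fit(token_id_sequences, sarcasm_labels, batch_size=64, epochs=3)
```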
I recently reported on the launch of the new NVIDIA TITAN X. At the time, it wasn't in the hands of any users, so any thoughts on relative performance were either vendor-provided or speculative. Well, a couple of researchers on the MXNet team were among the lucky folks who have their hands on the GPU at this point, and they published an initial benchmark this week following the deepmark deep learning benchmarking protocol. In a nutshell, they confirmed the speculation. The Pascal TITAN X is about 30% faster than the GTX 1080, and its larger memory supports larger batch sizes for models like VGG and ResNet. Relative to the older Maxwell-based TITAN X, the new GPU is 40-60% faster.

If a single GPU isn't enough for you, you might be interested in the new prototype announced by Orange Silicon Valley and CocoLink Corp, which they're calling the "world's highest density Deep Learning Supercomputer in a box." The machine loads 20 overclocked GPUs into a single 4U rack unit, offering 57,600 cores delivering 100 teraflops. The team at Orange reports that an ImageNet training job that used to take a day and a half on a single NVIDIA K40 GPU can now be done in 3.5 hours using 8 GTX 1080s. The largest they've been able to scale a training job to is 16 GPUs, and they're continuing to work on scaling to the full 20.

Also in GPU news, Microsoft announced yesterday that Azure N-Series virtual machines are now available in preview. These VMs use Tesla K80 GPUs, and the company claims they offer the fastest computational GPU performance in the public cloud. Moreover, unlike other cloud providers, these VMs expose the GPUs via Discrete Device Assignment (DDA), resulting in near bare-metal performance. 6-, 12-, and 24-core flavors are available in the NC series of VMs, which is optimized for computational workloads. An NV series that focuses more on visualization is also available, based on Tesla M60 GPUs.
Each year, computer security conferences host a high-tech version of the kids' game "capture the flag," in which teams of hackers and security researchers demonstrate their hacking prowess. The game requires teams to secure a computer system by identifying intentional and unintentional vulnerabilities in various software modules, all while launching and defending against threats from competing teams.

This week, DARPA, the Defense Advanced Research Projects Agency, hosted a version of a capture-the-flag contest where the teams were autonomous bots. The event, held Thursday in Las Vegas as part of the Defcon security conference, was the final competition of the agency's Cyber Grand Challenge, a $55 million hacking contest designed to spur innovation in the area of autonomous cyber warfare. Seven teams of researchers from across the country fielded bot systems that competed with one another to autonomously identify and patch software vulnerabilities that were planted in their systems by DARPA, while deflecting attacks from competing bots and launching their own attacks against the computer systems those bots were protecting. Teams' bots were scored on their ability to secure their own software and services, ensure their continued availability, and take advantage of vulnerabilities in competing teams' systems.

From the looks of it, DARPA constructed a pretty elaborate physical environment for the contest, complete with an "air gap" to ensure that each system was acting totally on its own. Announcers followed along with the 96 rounds of action and provided a live play-by-play for onlookers, while referees ensured that each team played by the rules. With each round, DARPA deployed a new set of software for the bots to both defend and attack. I watched segments of the 4+ hour video from the final competition and found it pretty fascinating, but I failed in my brief attempt to find any details on how the various bot systems work.

Cade Metz's coverage of the competition for Wired painted an interesting picture of the different strategies each bot pursued in the contest. One bot, Rubeus, built by federal contractor Raytheon, took an aggressive tack, going after vulnerabilities in the other systems from the get-go. Another bot, Mech.Phish, didn't perform as well overall, but it did have a knack for finding and exploiting complex and subtle bugs in the challenge code. Mayhem, a bot fielded by a team from Carnegie Mellon spin-out ForAllSecure, and the eventual winner of the $2M first prize, seemed rather focused on patching its own systems and keeping them up and running. The bot reportedly used statistical analyses throughout the game to weigh the costs and benefits of patching vulnerabilities (which has inherent risks and demands service downtime), and would only decide to patch those holes that made sense based on this analysis.

Cybersecurity is an important and rapidly evolving use case for ML & AI, and there's been quite a bit of commercial activity in the area in addition to innovation and research activities like the CGC. This week startup Distil Networks closed a $21 million series C funding round to help enterprise customers separate good bots from bad ones, and keep the latter off of their web sites. Note that we're not talking about chatbots here, but rather the kind of web bots that abuse APIs, scrape web sites, and probe them for vulnerabilities.
The company uses machine learning techniques to detect when a bot is trying to cloak its activity by spoofing multiple user accounts, browsers, and locations. And last month, another cybersecurity startup, Darktrace Ltd., raised a $64 million series C to help enterprises identify and defend against a variety of networked threats.
News broke late last week of Apple's acquisition of Seattle-based machine learning startup Turi, for a reported $200 million. Actually, I haven't seen any definitive confirmation of the acquisition at the time of my initial research, but neither have there been any denials. You'll recall we spoke about Turi just a few weeks ago, in the context of the Data Science Summit the company hosted in San Francisco, shortly after changing its name from Dato due to a legal dispute. The company, which was originally called GraphLab, was one of the first companies I started following in the machine learning platform space, and I'm pretty excited for founders Carlos Guestrin and Danny Bickson.

At face value this is a great deal for both companies. As we've discussed, Apple needs all the help it can get in machine learning and AI, and the company has over $230 billion-with-a-B sitting around in cash, so it can definitely afford it. And from Turi's perspective, the purchase price is about 4x invested capital, so it's a solid exit for a team of first-time founders from academia, in a space in which many of their contemporaries have struggled.

But the question remains as to what happens next. This acquisition doesn't really make sense if Turi is to remain an independent company—Apple needs the help internally fighting the "AI culture war," and the company hasn't had much success as an enterprise software player. On the other hand, in Turi CEO Carlos Guestrin, Apple could have a great ML standard bearer. Carlos is not only a business leader and a respected machine learning researcher but also a great teacher, with a popular machine learning course series on Coursera. So it's likely that, as TechCrunch suggests, Turi discontinues its existing products and is reborn as Apple's new machine learning and AI development center.

As a result, in addition to Apple and the Turi team, winners in this deal include Seattle, which has been gaining a reputation as a cloud computing and machine learning hotspot and will also see a new influx of wealth as a result of this deal. Turi's competitors in the machine learning platform space, folks like H2O, upstart DataRobot, and the French firm Dataiku, also win: they have one less competitor to worry about and a solid exit to point to as a comparable.

Dataiku, for its part, announced an update to its product, Dataiku Data Science Studio (DSS) 3.1, earlier in the week. The update adds support for HPE Vertica, H2O Sparkling Water, Spark MLlib, Scikit-Learn, and XGBoost from within the DSS visual analysis tool, as well as integration with IBM Netezza, SAP Hana, and Google BigQuery on the backend.

It will be interesting to see how this one plays out, and I'll keep you posted.
Good morning,

First off, thanks everyone for your interest in the podcast. If you haven't listened to the latest show, it's a bit different from the previous ones: it's the first in a series of interviews with folks doing interesting things in the machine learning and AI arena. I hope you find it interesting!

This week the interview took the place of the regular news show, mostly because I didn't have time to put the latter together. The news show is a ton of work, with each show taking about 24 hours to produce (down from 30+ when I started), and they can't, by definition, be done in advance. All that said, I really believe in the format (creating it was scratching my own itch), so I'm working on ways to ensure it can continue uninterrupted, even when I'm traveling late in the week (as was the case this week), have other projects to attend to, or my wife gets tired of me dedicating the weekends to it (I'm starting to get that look). A couple of things I'm working on to this end are to (a) find some regular sponsors for the show and (b) find/hire someone, or a small team of someones, who can help me produce the show. Of course, (a) makes (b) possible, but I'm pursuing both in parallel as of now. You can help by continuing to share the podcast with your friends, review it on iTunes, post it, tweet it, etc.

Ok, enough of the "inside baseball." Here's a quick rundown of the interesting ML and AI news for the week.

Business

We saw a few interesting business and product announcements this week:

Shopping and travel bot startup Mezi raised $9 million in a series A financing closed this week. Investors in this round include previous investor Nexus Venture Partners and new investors Saama Capital and American Express Ventures. They've also brought on new individual investors Amit Singhal, former SVP and Head of Google Search, and Gokul Rajaram, Product Engineering Lead at Square.

B12--like the vitamin, I suppose--raised a $12.4 million series A. They're not the first to talk about applying AI to web site development... see The Grid for an earlier example. Like Mezi, they're also highlighting their use of hybrid AI in delivering their solution. We'll see a lot more of this type of business in the near future: startups taking traditional service-oriented businesses and sprinkling on some AI in the form of tools or automation under the covers--perhaps even just a bit to get started with.

Prospera has raised $7 million to commercialize just one of many applications that will apply AI to agricultural data. Prospera is developing a system based on computer vision and deep learning technologies that will determine when, where, and how much water to deliver to crops to improve yields while conserving resources.

Google introduces ML-based bid automation tools with AdWords Smart Bidding. Smart Bidding takes millions of signals into account to help users determine the best bid for a given ad unit, and it automatically refines conversion performance models to optimize deployment of customers' advertising budgets.

Office 365 adds Researcher and Editor, new intelligent services to aid users writing reports and other documents in Word. Researcher is a sidebar that pulls up related articles from encyclopedias and the web based on what the user has written, and Editor is a smarter evolution of Word's spelling and grammar checkers. We've seen research sidebars in Word before and they've never proven useful, so it will be interesting to see how this one performs.
Editor, on the other hand, I'd expect to be really useful, and to eventually replace the standard editorial tools in Word over time. Last, but certainly not least, Prisma, the app we talked about last time for bringing research on artistic style transfer with neural networks to the iPhone, is now available on Android. I've played with it and it's pretty cool.

Research

OpenAI is hiring. Elon Musk-founded OpenAI is hiring researchers to work on a few "special projects". The specific research areas are: 1. Detecting if someone is using a covert breakthrough AI system in the world. 2. Building an agent to win online programming competitions. 3. Cyber-security defense. 4. Creating a complex simulation with many long-lived agents. Call me crazy, but as much as Musk says he fears AI, the research areas here seem to be right out of an apocalyptic AI movie.

Neural network from Matroid leads in Princeton competition. This is an interesting post describing Matroid's entry into the Princeton ModelNet competition for classifying 3D CAD models. Their application of convolutional neural networks (CNNs) to this problem is interesting, and they've published a paper on their approach on arXiv.

If you haven't seen the sample images from the DeepWarp Project around this week, you should check them out. A team of researchers from the Skolkovo Institute of Science and Technology in Russia developed a deep learning model for creating photorealistic images from a base image where the eyes are looking in an arbitrary direction. I'd like to dig deeper into this paper at some point.

Projects

The Charades Dataset is an interesting dataset composed of nearly 10,000 videos of daily indoor activities, collected by the Allen Institute for AI using Amazon Mechanical Turk. The dataset contains 66,500 temporal annotations for 157 action classes, 41,104 labels for 46 object classes, and 27,847 textual descriptions of the videos.

Language modeling a billion words. An interesting project to create a generative natural language model using LSTM RNNs trained on the Google Billion Words dataset, with an interesting discussion of the techniques used to achieve scale, including the use of multiple GPUs. (A toy sketch of this kind of model follows at the end of this newsletter.)

Bonus: Yann LeCun on Quora

Yann LeCun, director of AI research at Facebook and NYU professor, did an AMA over on Quora the other day. Here are some of the responses I found interesting:
- What are the likely AI advancements in the next 5 to 10 years? - Quora
- Who is leading in AI research among big players like Google, Facebook, Apple and Microsoft? - Quora
- What is a plausible path (if any) towards AI becoming a threat to humanity? - Quora
- What are some recent and potentially upcoming breakthroughs in deep learning? - Quora
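As promised above, here's a toy character-level LSTM language model in Keras, to give a feel for what the billion-word project is doing at vastly larger scale. Everything here (corpus, sizes, hyperparameters) is an illustrative assumption; the linked project trains word-level models on roughly a billion words across multiple GPUs:

```python
import numpy as np
from tensorflow.keras import layers, models

# Tiny toy corpus; the real project uses the Google Billion Words dataset.
text = "the quick brown fox jumps over the lazy dog " * 100
chars = sorted(set(text))
char_to_ix = {c: i for i, c in enumerate(chars)}
SEQ_LEN = 20

# Build (input sequence, next character) training pairs.
X = np.array([[char_to_ix[c] for c in text[i:i + SEQ_LEN]]
              for i in range(len(text) - SEQ_LEN)])
y = np.array([char_to_ix[text[i + SEQ_LEN]] for i in range(len(text) - SEQ_LEN)])

model = models.Sequential([
    layers.Embedding(len(chars), 32, input_length=SEQ_LEN),
    layers.LSTM(128),                                     # recurrent state carries context
    layers.Dense(len(chars), activation="softmax"),       # distribution over the next character
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.fit(X, y, batch_size=128, epochs=2, verbose=0)

# Sampling from the softmax, one character at a time, generates text.
```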
A potentially interesting survey crossed the wires this week, and while I'm bringing it up here, I do so with caveats, because the numbers seem a bit wonky. The survey, titled "Outlook on Artificial Intelligence in the Enterprise 2016," was published by Narrative Science, a "data storytelling" company that uses natural language generation to turn data into narratives. Narrative Science had help from the National Business Research Institute, a survey company that did the data collection for them.

The headline of the survey announcement seems to be that 38% of those surveyed are already using AI technologies, while 56% of those that aren't expect to do so by 2018. But if that's the case, then my math says that 73% of respondents' organizations expect to have AI deployed by 2018 (38% already using, plus 56% of the remaining 62%, or about 35 more percentage points), yet the official report cites this number as 62%. Also, an infographic published by the same group says that only 24% of organizations surveyed are currently using AI, instead of the 38% quoted in their news release. This discrepancy could be due to the fact that a large percentage of organizations represented by the survey had more than one respondent, but it's very confusing, and I'd certainly expect more from a "data storytelling" company. Unless of course their press release and infographic were totally created by a generative AI, in which case I'm very impressed but also a bit horrified. Of course, the articles reporting on the survey don't do anything to clear this up, with one of them reporting that 74% of organizations have already adopted AI.

In any case, I feel we do need more data about enterprise adoption of AI, so some credible numbers here would be great, but for now this ends up being just a cautionary tale about questioning your data. I have tweeted at the company for clarification, and I'll share whatever I find out.
Last week, at a machine learning meetup at Stanford University, NVIDIA CEO Jen-Hsun Huang unveiled the company's new flagship GPU, the NVIDIA TITAN X, and gifted the first device off the assembly line to famed ML researcher Andrew Ng. The new TITAN X, which holds the same name as the previous version of the device, is based on the company's new Pascal graphics architecture, which was unveiled back in May.

The company is so excited about the card that its blog post introducing it threw around a ton of superlatives and adjectives like Biggest, Ultimate, Irresponsible, Crazy, and Reckless. It also threw a bunch of numbers around, including these:

- 11 trillion 32-bit floating point operations per second
- 44 trillion INT8 operations per second
- 12 billion transistors
- 3,584 CUDA cores running at 1.53 GHz
- 12 GB of GDDR5X memory with 480 GB/s of bandwidth

The other number it tossed out there was 1,200, which is the price of the card in US dollars.

Now, not everyone is as excited about this card as NVIDIA. Indeed, for gamers, what NVIDIA's offering with the TITAN X is a GPU that's about 25% faster than the company's standby offering, the GTX 1080, but at double the cost. But it could be that that's because the company is targeting deep learning researchers instead of gamers with the TITAN X. (In fact, CEO Jen-Hsun Huang said as much at the product launch.) For people working on deep learning, the specs of the TITAN X should allow it to increase model training performance by 30-60%, which can save a researcher weeks of time and computing costs.

The best technical preview I've found of the new card, which comes out on August 2nd, is over on AnandTech. Of course, I'll be dropping a link to this article and all the other ones I mention on the show into the show notes, available at twimlai.com.
This week's show covers the International Conference on Machine Learning (ICML 2016), "dueling architectures" for reinforcement learning, AI safety goals for robots, plus top AI business deals, tech announcements, projects, and more.

ICML 2016
- Accepted Papers | ICML New York City
- Which companies had accepted papers at #icml2016?

Best Paper Awards
- [1511.06581] Dueling Network Architectures for Deep Reinforcement Learning
- [1601.06759] Pixel Recurrent Neural Networks
- [1602.07415] Ensuring Rapid Mixing and Low Bias for Asynchronous Gibbs Sampling
- My winner in the best name category: Extended and Unscented Kitchen Sinks
- Demystifying Deep Reinforcement Learning

Research
- Google Research Blog: Bringing Precision to the AI Safety Discussion
- OpenAI Blog: Concrete AI safety problems
- Paper: 1606.06565.pdf
- OpenAI technical goals
- Artificial intelligence achieves near-human performance in diagnosing breast cancer — ScienceDaily
- Paper: 1606.05718.pdf

Business
- Twitter pays up to $150M for Magic Pony Technology, which uses neural networks to improve images | TechCrunch
- Increasing our Investment in Machine Learning | Twitter Blogs
- Artificial Intelligence Explodes: New Deal Activity Record For AI
- DARPA is looking to make huge strides in machine learning | PCWorld
- Data-Driven Discovery of Models (D3M) – Federal Business Opportunities

AI Culture Wars in Silicon Valley
- How Siri Started — and Lost — the Assistant Race
- How Google is Remaking Itself as a "Machine Learning First" Company — Backchannel
- AI, Apple and Google

Technology
- Lighting the way to deep machine learning | Engineering Blog | Facebook Code
- Intel Launches 'Knights Landing' Phi Family for HPC, Machine Learning
- The Toronto Raptors Are Using IBM's Watson to Draft A Winning Team | Motherboard

Projects
- Hello, TensorFlow!
- How to read: Character level deep learning
- GitXiv: Collaborative Open Computer Science
- Machine Learning Yearning
- Mastering Feature Engineering – O'Reilly Media

Bonus I didn't have time to cover
- The Stanford Question Answering Dataset
This week's podcast looks at new research on intrinsic motivation for AI systems, a kill-switch for intelligent agents, "knu" chips for machine learning, a screenplay made by a neural net, and more. Here are the notes for this week's show:

Intrinsically Motivated AI
- Playing Montezuma's Revenge with Intrinsic Motivation
- Unifying Count-Based Exploration and Intrinsic Motivation
- Intrinsically Motivated Machines
- Implementation of DEvelopmentAl Learning

Safely Interruptible Agents
- What if robots decide they want to take control?
- New paper: "Safely interruptible agents"
- Safely Interruptible Agents

Open Source Project Updates
- TensorFlow 0.9
- Apache Spark 2.0 Preview: Machine Learning Model Persistence

A "Knu" Chip for Machine Learning
- Former NASA Exec Brings Stealth Machine Learning Chip to Light

CrowdFlower's AI Push
- Solving Million (not Billion) Dollar Business Problems with AI

Vi: An AI Personal Trainer
- Meet Vi

Recurrent Neural Net Writes Sci-Fi Movie
- Movie Written by Algorithm Turns out to be Hilarious and Intense
- Adventures in Narrated Reality, Part II
- Understanding LSTMs
- The Unreasonable Effectiveness of Recurrent Neural Networks

Teaching Robots to Feel
- Teaching Robots to Feel: Emoji & Deep Learning

ML for Hackers: Build a Chatbot
- ML for Hackers: Build a Chatbot
- Siraj Raval on Twitter

Image Credit: LifeBEAM
This week's show looks at Facebook's new DeepText engine, creating art with deep learning and Google Magenta, how to build artificial assistants and bots, and applying economics to machine learning models. Here are the notes for this week's show:

DeepText: Facebook's Text Understanding Engine
- Introducing DeepText: Facebook's Text Understanding Engine
- FBLearner Flow
- Research: Text Understanding from Scratch
- Natural Language Processing (almost) from Scratch

Machine Learning and Art
- Google Magenta
- Neural Art
- A Neural Algorithm of Artistic Style
- Neural Art in TensorFlow
- Autoencoding Blade Runner
- Courses: NYU's Machine Learning for Artists; Goldsmiths, University of London

The Latest TensorFlow Paper
- TensorFlow: A system for large-scale machine learning

Business of ML & AI
- Microsoft Confirms Microsoft Ventures VC Arm
- Intel Acquires Computer Vision for IOT, Automotive
- Lumiata Closes $10 Million Series B Financing with Intel Capital
- Findo raises $3M to help you find files and documents through natural language queries

More Bots, and How to Build Artificial Assistants
- Motion AI lets anyone easily build a bot
- Sequel lets you create a 'Me' bot, beats Google to the punch
- Hybrid Intelligence: How Artificial Assistants Work

The Economics of Machine Learning Models
- The preoccupation with test error in applied machine learning
- Towards Cost-Optimized Artificial Intelligence

More Cool Deep Learning Posts
- Deep Reinforcement Learning: Pong from Pixels
- A Survey of Deep Learning Techniques Applied to Trading

Just for Fun
- Building an IoT Magic Mirror
- Magic Mirror on GitHub

Image Credit: Microsoft
This week's episode explores the White House workshops on AI, human bias in AI and machine learning models, a company working on machine learning for small datasets, plus the latest AI & ML news and a self-driving car that learned how to drive aggressively. Here are the notes for this week's stories:

Martin Ford and the Rise of the Robots
- Martin Ford on Twitter
- The Rise of the Robots on Amazon

The White House Takes on AI
- The White House Workshops on AI: Preparing for the Future of Artificial Intelligence
- Legal and Governance Implications of Artificial Intelligence: Workshop video on Youtube
- MIT Technology Review article
- Geekwire article

Teaching Machines Human Biases
- It's Too Late—We've Already Taught AI to Be Racist and Sexist
- Machine Bias

This Self-Driving Car Learned to Drive Like a New Yorker
- Autonomous Mini Rally Car Teaches Itself to Powerslide

From the Rumor Mill
- Amazon to Battle Google With New Cloud Service for AI Software
- Apple may open up Siri to developers. That's a huge deal.
- Source: New Apple TV will compete with Amazon Echo

Machine Learning Models with Less Data
- Algorithms That Learn with Less Data Could Expand AI's Power
- Geometric Intelligence

Bonus for Devs: Churn Prediction with Spark
- Churn Prediction with PySpark using MLlib and ML Packages

Didn't Get to This But You Should Know
- O'Reilly Announces New AI Conference
Every week I end the week with close to 100 tabs filled with stories—some good, some not so good—spanning all corners of the cloud computing, big data, machine learning, and AI web. I thought it would be useful to bring you the best of these stories in a weekly podcast. I have no idea whether this will be sustainable or not—this first episode took a lot of work—but let's run with it and see what happens. Here are the notes for this week's stories:

AI Tech Front and Center at Google I/O
- Google I/O 2016 Keynote
- I/O: Building the next evolution of Google
- Google supercharges machine learning tasks with TPU custom chip
- Nvidia creates a 15B-transistor chip for deep learning

Deep Learning Part of Amazon's Destiny
- Amazon open-sources its own deep learning software, DSSTNE
- https://github.com/amznlabs/amazon-dsstne
- TensorFlow

Uber's Autonomous Boom Box Takes to the Streets of Pittsburgh
- Steel City's New Wheels
- Jeff Schneider Interview at Structure Data
- Artificial Intelligence for Robotics: Programming a Robotic Car

Scanse's Sweep: Scanning LIDAR for Everyone

AI by the Bay Conference
- AI by the Bay / Data by the Bay

What's Up with NLP at Quora?
- Applications of NLP at Quora

Conference Calls: AI's Killer App?
- How this guy used Watson to tune out of conference calls
- https://github.com/joshnewlan/say_what