Self-Driving Car Startup Comma.ai Releases Video and Sensor Dataset


This post is an excerpt from the August 5, 2016 edition of the This Week in Machine Learning & AI podcast. You can listen or subscribe to the podcast below.

Autonomous driving startup Comma.ai released a small dataset that lets you try your hand at building your own models for controlling a self-driving vehicle. The dataset consists of 10 video clips recorded at 20 Hz from a camera mounted on the windshield of a 2016 Acura ILX.

There are about 7 hours of video total, captured mostly during highway driving. Alongside the video files is a set of sensor logs that record measurements such as velocity, acceleration, steering angle, GPS location, and gyroscope angles.
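To get a feel for the scale, and for what pairing sensor readings with video frames might involve, here is a minimal Python sketch. It assumes each sensor log can be read as a sorted list of timestamps (in seconds) with matching values; that layout, the function names, and the sample numbers are all hypothetical and are not taken from the Comma.ai files. Linear interpolation is just one simple way to get a reading per frame.

```python
from bisect import bisect_left

FRAME_RATE_HZ = 20  # camera rate stated in the post


def approx_frame_count(hours, rate_hz=FRAME_RATE_HZ):
    """Rough number of frames in `hours` of video recorded at `rate_hz`."""
    return int(hours * 3600 * rate_hz)


def interpolate_at(times, values, t):
    """Linearly interpolate a sensor reading at time t (seconds).

    `times` must be sorted ascending; t outside the logged range is clamped
    to the first or last reading.
    """
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    i = bisect_left(times, t)  # first index with times[i] >= t
    t0, t1 = times[i - 1], times[i]
    v0, v1 = values[i - 1], values[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def align_to_frames(times, values, n_frames, rate_hz=FRAME_RATE_HZ):
    """Return one interpolated reading per video frame, starting at t = 0."""
    return [interpolate_at(times, values, k / rate_hz) for k in range(n_frames)]


if __name__ == "__main__":
    # About 7 hours at 20 Hz is roughly half a million frames.
    print(approx_frame_count(7))

    # Hypothetical steering log sampled every 0.1 s (made-up values).
    log_times = [0.0, 0.1, 0.2]
    log_angles = [0.0, 2.0, 4.0]
    print(align_to_frames(log_times, log_angles, n_frames=5))
```

With one steering value per frame, each image can be paired with a target for a supervised model. The real logs may already carry their own frame alignment, so check the repo before relying on a scheme like this.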

The dataset is a 45 GB zip file that expands to 80 GB when uncompressed. That is, if you can get it to uncompress. When I tried, after a fairly long download, unzip complained that the file was corrupt.
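If you hit the same problem, it may help to test the archive before trying to extract 80 GB from it. The sketch below uses Python's standard zipfile module; the check_zip helper and the dataset.zip default filename are hypothetical names chosen for illustration. It can only tell you whether the archive is damaged, not why the download went wrong.

```python
import sys
import zipfile


def check_zip(path):
    """Return (ok, detail) describing whether the archive at `path` is readable."""
    # is_zipfile looks for the zip end-of-archive record; a truncated download
    # (or a missing file) usually fails here.
    if not zipfile.is_zipfile(path):
        return False, "not a zip file, or the download was truncated"
    try:
        with zipfile.ZipFile(path) as archive:
            # testzip reads every member and verifies its CRC; it returns the
            # name of the first bad member, or None if all of them pass.
            bad_member = archive.testzip()
    except zipfile.BadZipFile as exc:
        return False, f"bad zip file: {exc}"
    if bad_member is not None:
        return False, f"first corrupt member: {bad_member}"
    return True, "all members passed the CRC check"


if __name__ == "__main__":
    # Hypothetical default name; pass the real path of the downloaded archive.
    path = sys.argv[1] if len(sys.argv) > 1 else "dataset.zip"
    ok, detail = check_zip(path)
    print(path, "OK" if ok else "FAILED", "-", detail)
```

Because testzip reads the whole archive, expect it to take a while on a file this size. If the check fails, re-downloading is probably the simplest fix.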

The project’s GitHub repo includes a script to download the data from archive.org, as well as some simple models built in Keras and TensorFlow for predicting steering angle and for creating simulated road images using generative AI.

They’ve also included a paper on the latter topic. The idea is that since it’s pretty expensive to train a self-driving car on real roads, you typically want to train your algorithms in a simulator. To do that, you can either hand-code a simulator or use generative AI to create one. The paper describes the use of variational autoencoders, generative adversarial networks, and an RNN to create simulated road images.

You can start by running their existing models, but if you manage to do amazing things with the data, let Comma know—they’re hiring and want to meet you.

Image: Comma.ai

