Imagine spending years learning ML from the ground up, from its theoretical foundations, but still feeling like you didn’t really know how to apply it. That’s where David Odaibo found himself in 2015, after the second year of his PhD. David’s solution was Kaggle, a popular platform for data science competitions.
Fast forward four years, and David is now a Kaggle Grandmaster, the highest designation, with particular accomplishment in computer vision competitions. Having completed his degree last year, he is currently co-founder and CTO of Analytical AI, a company that grew out of one of his recent Kaggle successes.
David has a background in deep learning and medical imaging, something he shares with his brother, Stephen Odaibo, whom we interviewed last year about his work in Retinal Image Generation for Disease Discovery.
David’s initial forays into machine learning were a bit rocky. In 2012 he built his first neural network in C#, which “worked horribly…It was the absolute wrong thing to do. I just didn’t know where to start back then.” So he put some time into learning more about frameworks, and later realized that Kaggle offered a practical application for his efforts.
He didn’t actually start competing until a year after creating his account. His performance at his first competition in 2016 was not great, but he identified the right tools to use and ultimately gained some confidence.
First Success: U-Net Architectures & Augmentation Strategies
David’s first success at Kaggle, which remains his proudest accomplishment, came with his second attempt, where he got to apply his medical imaging and deep learning background in the Ultrasound Nerve Segmentation competition.
His secret sauce was a ground-up implementation of the U-Net architecture (an encoder-decoder network), which hadn’t been used in a Kaggle competition before. To “preserve the localization information from the original images…I implemented [the U-Net architecture]. I trained the network, and to my surprise, it worked.”
He was not the only one to use a U-Net in this competition, but he thinks his competitive edge came from being one of the few who learned how to build it from scratch. “If you did something different or you implemented your own and improved it a little bit, maybe more than what everybody else was using, you had a chance of doing a little better…”
David’s solution for this competition also benefited from a lot of experimenting with data augmentation strategies. “When you train segmentation networks to avoid overfitting, you have to augment the image and the mask together and you have to find the right kind of augmentation strategies…so I hacked those and created a good augmentation strategy [and] framework.”
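To make the "augment the image and the mask together" point concrete, here is a minimal NumPy sketch. It is not David's framework: the function name `augment_pair` and the choice of transforms (random flips and 90-degree rotations) are assumptions made for illustration. The point it shows is that the random parameters are drawn once and applied to both arrays, so every mask pixel keeps labeling the same image pixel.

```python
import numpy as np

def augment_pair(image, mask, rng):
    """Apply one randomly chosen flip/rotation to an image and its mask together.

    Hypothetical helper for illustration; the transforms are an assumption,
    not the augmentation strategy used in the competition.
    """
    if image.shape[:2] != mask.shape[:2]:
        raise ValueError("image and mask must share height and width")

    # Draw the random parameters once so both arrays get the same transform.
    flip_lr = rng.random() < 0.5
    flip_ud = rng.random() < 0.5
    k = int(rng.integers(0, 4))  # number of 90-degree rotations

    out_image, out_mask = image, mask
    if flip_lr:
        out_image, out_mask = np.fliplr(out_image), np.fliplr(out_mask)
    if flip_ud:
        out_image, out_mask = np.flipud(out_image), np.flipud(out_mask)
    out_image = np.rot90(out_image, k)
    out_mask = np.rot90(out_mask, k)
    return out_image.copy(), out_mask.copy()

# Usage: a toy 4x4 "scan" whose mask marks the bright pixels.
rng = np.random.default_rng(0)
image = np.arange(16, dtype=float).reshape(4, 4)
mask = (image > 7).astype(np.uint8)
aug_image, aug_mask = augment_pair(image, mask, rng)
```

If the image and the mask were augmented with independently drawn parameters, the mask would no longer line up with the structure it is supposed to outline, and the network would be trained on wrong labels.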
David ultimately finished second out of 950 teams in the competition that year. Not bad for his second effort!
ProVision Body Scanners and Architectures for 3-Dimensional Data
Encouraged, David continued to participate in Kaggle. His early success was followed by a competition, sponsored by the Department of Homeland Security, that focused on classifying images from airport body scanners (a.k.a. Nude-O-Scopes, as Sam calls them). The goal was to create new algorithms that could more accurately predict threats and detect prohibited items during the screening process. It was the largest Kaggle competition in terms of both prize money ($1.5 million) and the size of the data set being used.
The Passenger Screening Algorithm Challenge was particularly interesting to David in its use of three-dimensional data. There were no existing best practices for how to build architectures that could process 3D data without downsizing or downsampling. Three-dimensional images require much more memory and storage to process than 2D images, but also create new opportunities. The third dimension provides a “third axis where you also can correlate features across multiple two-dimensional images because the volume is essentially a stack of two-dimensional images.”
David was still in the middle of his PhD research and had already been thinking about three-dimensional data for CT or MRI images. His entry for the competition would apply the same architecture that he had been using to try to detect Parkinson’s disease from brain scans.
His method involved dimensionality reduction by combining a 2D convolutional neural network (CNN) with a Long Short-Term Memory (LSTM) network, an architecture that models sequences of data. Essentially, the CNN learned a feature vector from each two-dimensional image and fed those vectors into the LSTM, which could take advantage of the relationships between the frames. This allowed the team to avoid reducing the resolution of the input images.
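The data flow of this CNN-plus-LSTM idea can be sketched in a few lines of NumPy. This is only an illustration of the shapes involved, not the team's model: the weights are random and untrained, the "CNN" is a stand-in with a single 3x3 convolution layer followed by ReLU and global average pooling, and all names (`encode_slice`, `lstm_over_slices`, `score_volume`, `init_params`) are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def encode_slice(slice_2d, filters):
    """Stand-in for a 2D CNN: one 3x3 conv layer, ReLU, global average pooling."""
    windows = sliding_window_view(slice_2d, (3, 3))            # (H-2, W-2, 3, 3)
    responses = np.einsum("hwij,fij->hwf", windows, filters)   # (H-2, W-2, F)
    return np.maximum(responses, 0.0).mean(axis=(0, 1))        # (F,)

def lstm_over_slices(features, W, U, b):
    """Run one LSTM layer over the per-slice features; return the last hidden state."""
    hidden = U.shape[1]
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    for x in features:                       # one step per 2D slice
        gates = W @ x + U @ h + b            # (4 * hidden,)
        i, f, g, o = np.split(gates, 4)      # input, forget, candidate, output
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    return h

def score_volume(volume, params):
    """Map a (depth, height, width) volume to a single score in (0, 1)."""
    features = np.stack([encode_slice(s, params["filters"]) for s in volume])
    h = lstm_over_slices(features, params["W"], params["U"], params["b"])
    return float(sigmoid(params["w_out"] @ h + params["b_out"]))

def init_params(rng, n_filters=4, hidden=8):
    """Random, untrained weights; sizes are arbitrary choices for the sketch."""
    return {
        "filters": rng.normal(size=(n_filters, 3, 3)),
        "W": rng.normal(scale=0.1, size=(4 * hidden, n_filters)),
        "U": rng.normal(scale=0.1, size=(4 * hidden, hidden)),
        "b": np.zeros(4 * hidden),
        "w_out": rng.normal(scale=0.1, size=hidden),
        "b_out": 0.0,
    }

# Usage: a toy volume of 5 slices, each 16x16, kept at full resolution.
rng = np.random.default_rng(0)
params = init_params(rng)
volume = rng.random((5, 16, 16))
score = score_volume(volume, params)
```

Each slice is reduced to a short feature vector before the sequence model sees it, so memory grows with the number of slices rather than with a full 3D convolution over the whole volume.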
Sam posits that “A lot of winning the competition is being on the winning side of information asymmetry.” It can be hard to gauge where you might stand in the competition to make sure your efforts are worth it. David tries to plan ahead by recognizing signs of promise from the beginning, which for him means placing in the top 30 on the leaderboard as a result of his initial efforts on a problem.
Distracted Drivers and Data Augmentation
Another image-processing competition David participated in was the State Farm Distracted Driver Detection challenge. The problem was to review images of drivers and determine whether each driver was distracted by things like playing with the radio, using the phone, or applying makeup.
The team’s distinctive approach was a creative data augmentation technique for training the model. They would take, for example, two images of a driver playing with the radio and combine them vertically or horizontally, using, say, 75% of one image and 25% of the other, to produce an additional training image. Because both source images showed the same behavior, the combined image still indicated a distracted driver, and each partial view should be enough for the model to make the right prediction.
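A rough sketch of this kind of image combination, under stated assumptions: both images belong to the same class and have the same shape, the first image contributes its leading 75% and the second contributes its remaining 25% (so each part stays in its original position), and the helper name `combine_same_class` is hypothetical. The interview does not spell out these details, so treat this as one plausible reading rather than the team's exact method.

```python
import numpy as np

def combine_same_class(img_a, img_b, fraction=0.75, axis=1):
    """Join the leading `fraction` of img_a with the remaining part of img_b.

    axis=1 splits along the width (left part from img_a, right part from img_b);
    axis=0 splits along the height (top part from img_a, bottom part from img_b).
    Both images are assumed to show the same class, so the label is unchanged.
    """
    if img_a.shape != img_b.shape:
        raise ValueError("images must have the same shape")
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must be strictly between 0 and 1")

    size = img_a.shape[axis]
    split = int(round(size * fraction))
    part_a = np.take(img_a, np.arange(0, split), axis=axis)
    part_b = np.take(img_b, np.arange(split, size), axis=axis)
    return np.concatenate([part_a, part_b], axis=axis)

# Usage: two toy 4x8 "images" from the same class.
img_a = np.zeros((4, 8))
img_b = np.ones((4, 8))
side_by_side = combine_same_class(img_a, img_b, fraction=0.75, axis=1)
stacked = combine_same_class(img_a, img_b, fraction=0.75, axis=0)
```

Varying which pairs are combined, the split fraction, and the axis yields many distinct training images from a small number of originals.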
Combining images in this way was a way to avoid overfitting. The training set contained only a few examples of each distracted driving behavior they were trying to identify, which caused their neural networks to overfit. (That is, the networks memorized the few examples they found in the training set and struggled to generalize to an unseen validation set.) By combining the images, they both created additional training examples and broke the networks’ tendency to rely on spurious patterns when identifying examples of distracted driving.
Otherwise, though, they used an off-the-shelf model architecture, demonstrating that it’s not always a unique architecture that wins the competition. According to David, you don’t necessarily need a massive ensemble of models to win Kaggle competitions either:
“If you focus on one model, you can almost do as well as a massive ensemble, but oftentimes the ensemble is the easy way out. But the ensembles, there’s a cost associated with that, at least for computer vision, in terms of GPU time. If you have infinite compute resources, you might be able to get away with ensembling, but oftentimes you have to weigh the cost of training many models with focusing on one and trying to get it as good as possible.”
“These are some of my secrets.”
David has a few tricks to share that apply to everyone, beginners and experts alike:
Keep it Simple. “The key…is that these solutions are usually simple…there’s this idea that starting Kaggle [is] hard…I feel like a lot of challenges you just have to look at it with a creative approach and just opening your mind that the solution is simple.”
Persistence. David also emphasizes that “Kaggle can be discouraging.” But you have to believe you can do well and give it a shot regardless, even if you don’t do well initially.
Reading Top Solutions. Another trick is to read the approaches of Kaggle winners and compare their solutions with your own to learn what you could have improved.
In addition to the tips David emphasized above, here are a few additional suggestions we gleaned from the interview:
Kaggle Discussion Forums. Digging into forums to see what other people are doing is a great way to learn what angles and perspectives others are using that might help you approach the challenge.
Teaming Up. In most of his competitions, David has teamed up with others who all bring unique perspectives and help with the challenges.
Kernels. Kaggle is collaborative and kernels might be a great place to get you started, but it’s also a competition and as David puts it simply, “if you do what everybody else does, you’re not going to win.”
If you’re interested in joining Kaggle, or want to be part of a supportive community of folks working on Kaggle projects together, check out our Kaggle study group! The group hosts virtual meetups every week. Learn more at http://twimlai.com/program/kaggle-team/.
Connect with David!
- Analytical AI
- Paper: U-Net: Convolutional Networks for Biomedical Image Segmentation
- #284 – Retinal Image Generation for Disease Discovery with Stephen Odaibo
“More On That Later” by Lee Rosevere, licensed under CC BY 4.0