It’s been an exciting summer on many fronts here at TWIML HQ. One of the things I’m most pumped up about is the work of our first intern, Malaika, a recent high school graduate (and my daughter). Malaika wrote a blog post introducing herself and her project, and just published her first project update. I’m including her latest here as a guest post.

Comparing Machine Transcription Services by Malaika Charrington

For those of you who missed my first article, I’m TWIML & AI’s intern, Malaika. This summer I’m working on a project in which I will compare the accuracy and features of several different machine transcription services and attempt to determine which of these we might use for the podcast. I will then familiarize myself with different semantic tagging services and techniques and attempt to automatically tag each podcast on the website to allow listeners to easily find the podcasts that cover their interests.

My first step in this project was to research 10 different transcription services: Deepgram Brain, Speechmatics, Microsoft Speech-to-text, Google Speech-to-text, Amazon Speech-to-text, IBM Watson Speech-to-text, VoiceBase Speech-to-text, Trint, Temi, and Sonix. I compiled my research and did a brief presentation on the results I found. These findings are shown in the comparison slideshow below [Ed. I’ve substituted a comparison chart she made here. Visit Malaika’s post for the slideshow.].

For the next step in the project, testing the services, I narrowed my research down to six services: Deepgram, Google Transcribe, Microsoft Azure Transcribe, Amazon Transcribe, IBM Watson Transcribe, and Sonix. Next, I gathered ~20 unique clips from the podcast: some with more technical language, some with guests who spoke with accents, and some with several guests. In my next post I’ll share the results of running these clips through each of the transcription services, testing word and speaker-label accuracy and analysing how much each service’s unique features help in getting an accurate transcription.
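A common way to put a number on word accuracy is word error rate (WER): the fewest word substitutions, insertions, and deletions needed to turn a machine transcript into a human-checked reference, divided by the number of words in the reference. The Python sketch below is only an illustration of that idea, not necessarily the scoring method used in this project. The function name `word_error_rate` and the sample sentences are made up for the example, and it assumes simple lowercase, whitespace-separated words with no punctuation handling.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the word error rate of `hypothesis` measured against `reference`.

    Illustrative sketch: words are lowercased and split on whitespace only,
    so punctuation attached to a word counts as part of that word.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (word-level Levenshtein distance).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion: reference word missing
                curr[j - 1] + 1,                  # insertion: extra hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Made-up sentences, not taken from any podcast clip.
    human = "neural networks learn features"
    machine = "neural network learn features"
    print(word_error_rate(human, machine))  # one substitution in four words: 0.25
```

A lower score means a more accurate transcript, and a score of 0 means the two transcripts match word for word. Because insertions count as errors, WER can exceed 1 when the machine transcript contains many extra words.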

I’ve learned a few interesting things so far in my project, most notably how to call APIs and how to run some simple commands in my computer’s terminal, and I look forward to doing more with that.

Special thanks to Scott Stephenson from Deepgram for helping us refine our game plan for testing our audio clips with different services! We appreciate all of your tips and advice and I’m confident that our new plan will help us get great results! Also, a shout out to Jamie Sutherland from Sonix for gifting us free transcription minutes to transcribe our test audio clips!

I’m very excited about the progress that I’ve made so far in the project and I’ll continue to keep you all updated!

I encourage you to follow Malaika or TWIML over on Medium to get updates on her progress.

Sign up for our newsletter to receive this in your inbox each week.