Member-only story

OpenAI’s New Super Model: Whisper Achieves Human Level Performance in Speech Recognition

The new model combines zero-shot-learning with supervised fine tuning to match human level performance across different speech recognition tasks.

Jesus Rodriguez
3 min readSep 27, 2022
Image Source: https://huggingface.co/spaces/openai/whisper

I recently started an AI-focused educational newsletter, that already has over 125,000 subscribers. TheSequence is a no-BS (meaning no hype, no news etc) ML-oriented newsletter that takes 5 minutes to read. The goal is to keep you up to date with machine learning projects, research papers and concepts. Please give it a try by subscribing below:

Automatic speech recognition(ASR) is one of the deep learning disciplines that have seen a tremendous level of innovation in the last few years. Part of this innovation catalyst have come from the emergence of unsupervised pretraining techniques which are able to learn from raw audio without explicitly dependencies on human labelers. Wav2Vec is the canonical example of this type of unsupervised method. Despite the progress, ASR…

--

--

Jesus Rodriguez
Jesus Rodriguez

Written by Jesus Rodriguez

CEO of IntoTheBlock, President of Faktory, President of NeuralFabric and founder of The Sequence , Lecturer at Columbia University, Wharton, Angel Investor...

No responses yet