OpenAI offers new text-to-speech and speech-to-text models in the API. These are designed to outperform Whisper.
According to the company’s benchmarks, it outperforms Google’s Gemini 2.0 Flash, OpenAI’s Whisper v3 and Deepgram Nova-3 in accurately converting spoken speech into text on the web ...
Despite boasting 680,000 hours of audio data training, Whisper sometimes "hallucinates ... these hallucination problems to ensure speech-to-text systems are reliable and safe, particularly ...