ElevenLabs is launching its own speech-to-text model | TechCrunch

ElevenLabs, an AI startup that just raised a $180 million mega funding round, has been primarily known for its audio generation prowess. The company took a step in another technological direction by launching its first standalone speech-to-text model called Scribe.

The startup, valued at $3.3 billion, has helped many other companies offer text-to-speech services through its vast library of voices. Now the company is moving into speech detection to compete with the likes of Gladia, Speechmatics, AssemblyAI, Deepgram, and OpenAI’s Whisper models.

ElevenLabs’ Scribe model supports over 99 languages at launch. The company places over 25 of them in its excellent accuracy category, where the model’s word error rate is below 5%. This list includes English (claimed accuracy rate of 97%), French, German, Hindi, Indonesian, Japanese, Kannada, Malayalam, Polish, Portuguese, Spanish, and Vietnamese. Other languages fall into the high (5 to 10% word error rate), good (10 to 20%), and moderate (25 to 50%) categories.
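For readers unfamiliar with the metric, word error rate is usually computed as the word-level edit distance between a reference transcript and the model's output, divided by the number of reference words. The Python sketch below illustrates that common definition and maps the result onto the tiers quoted above. It is not ElevenLabs' evaluation code: the function names are hypothetical, text normalization is omitted, and how the boundary values are assigned is an assumption. The quoted ranges leave 20 to 25% unassigned, so the sketch reports that band as "unclassified".

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def accuracy_tier(wer: float) -> str:
    """Map a WER (as a fraction) to the tiers quoted in the article.

    Boundary handling is an assumption; the article only gives ranges.
    """
    if wer < 0.05:
        return "excellent"
    if wer <= 0.10:
        return "high"
    if wer <= 0.20:
        return "good"
    if 0.25 <= wer <= 0.50:
        return "moderate"
    return "unclassified"


wer = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
print(f"WER = {wer:.3f}, tier = {accuracy_tier(wer)}")
```

Under this definition, and assuming accuracy is reported as one minus the word error rate, the claimed 97% accuracy for English would correspond to roughly a 3% word error rate.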

The company said that the model outperformed Google Gemini 2.0 Flash and Whisper Large V3 across multiple languages on the FLEURS and Common Voice benchmarks.

ElevenLabs had already developed a speech-to-text component for its AI conversational agent platform, which was released last year. However, this is the first time the company is releasing a standalone speech detection model. In a conversation with TechCrunch last month, CEO Mati Staniszewski talked about improving speech detection models.

“We want to understand what’s being said by you in a conversation better. We are working on ways to move away from only generating content and understanding and transcribing speech,” Staniszewski said at that time. “Many people say that speech-to-text is a solved problem. But for many languages, it is pretty bad. We think we can build better speech detection models because we have in-house teams to annotate data and give us quick feedback.”

The model also offers smart speaker diarization to tell you who is speaking, word-level timestamps for accurate subtitles, and auto-tagging of sound events such as audience laughter. The startup is also giving customers a way to transcribe video content directly in its studio to add subtitles or captions.
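To see why word-level timestamps and speaker labels matter for subtitles, consider the Python sketch below, which groups timed words into cues in the widely used SRT subtitle format. The `Word` record and its fields are hypothetical stand-ins for whatever a transcription service returns, not Scribe's actual response schema, and the rule of starting a new cue on a speaker change or after seven words is an arbitrary choice for illustration.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """Hypothetical transcript word; not Scribe's actual output schema."""
    text: str
    start: float  # seconds
    end: float    # seconds
    speaker: str


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp used by SRT files."""
    total_ms = round(seconds * 1000)
    hours, total_ms = divmod(total_ms, 3_600_000)
    minutes, total_ms = divmod(total_ms, 60_000)
    secs, millis = divmod(total_ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def words_to_srt(words: list[Word], max_words: int = 7) -> str:
    """Group words into cues, breaking on speaker change or cue length."""
    cues: list[list[Word]] = []
    current: list[Word] = []
    for word in words:
        speaker_changed = bool(current) and word.speaker != current[-1].speaker
        if current and (speaker_changed or len(current) >= max_words):
            cues.append(current)
            current = []
        current.append(word)
    if current:
        cues.append(current)

    blocks = []
    for index, cue in enumerate(cues, start=1):
        text = " ".join(w.text for w in cue)
        start, end = srt_timestamp(cue[0].start), srt_timestamp(cue[-1].end)
        blocks.append(f"{index}\n{start} --> {end}\n{text}\n")
    return "\n".join(blocks)


sample = [
    Word("Hello", 0.0, 0.4, "A"),
    Word("there", 0.5, 0.9, "A"),
    Word("Hi", 1.2, 1.5, "B"),
]
print(words_to_srt(sample))
```

Because each cue takes its start time from its first word and its end time from its last, the captions stay aligned with the audio even when speakers pause or trade turns quickly.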

Scribe currently works only with pre-recorded audio, which means it is not yet effective for live meeting transcription or voice note-taking. The company said it will release a low-latency real-time version of the model soon.

ElevenLabs is pricing Scribe at $0.40 for an hour of transcribed audio. While the rate is competitive, some of its rivals currently offer lower prices for audio transcription, though with different feature sets.
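As a rough guide to what that rate means in practice, the Python sketch below estimates the cost of a recording from its duration. It assumes billing is linear and pro-rated by the second, which the company's announcement as reported here does not confirm; the function name is hypothetical.

```python
PRICE_PER_HOUR_USD = 0.40  # rate quoted in the article


def transcription_cost_usd(duration_seconds: float,
                           price_per_hour: float = PRICE_PER_HOUR_USD) -> float:
    """Estimate cost, assuming linear pro-rata billing (an assumption)."""
    if duration_seconds < 0:
        raise ValueError("duration must be non-negative")
    return duration_seconds / 3600 * price_per_hour


# A 90-minute recording and a 100-hour archive at the quoted rate.
print(f"${transcription_cost_usd(90 * 60):.2f}")
print(f"${transcription_cost_usd(100 * 3600):.2f}")
```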
