Transforming speech technology with WhisperLive

Kara Bembridge
May 28, 2024

AI has come a long way, but it still needs some adjustment to deliver the best results. In conversational AI, VoxAI had already developed a platform to capture customer orders, but its response time and speaking abilities needed improvement. This is where Collabora stepped in with WhisperLive.

WhisperLive, a real-time transcription service powered by OpenAI's Whisper model, departed from traditional speech recognition methods by incorporating voice activity detection (VAD). VAD detects when speech is present, so that only audio containing speech is transmitted for transcription. This improves transcription accuracy and reduces the amount of data that has to be handled.
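To make the idea of selective transmission concrete, here is a minimal sketch of a VAD gate. It is an illustration only: it uses a simple energy threshold, which is an assumption made for brevity and not necessarily how WhisperLive detects speech. The names `rms`, `speech_frames`, and the `threshold` value are hypothetical.

```python
import math
from typing import Iterable, Iterator, List

# One frame of mono audio samples scaled to [-1.0, 1.0] (assumed format).
Frame = List[float]


def rms(frame: Frame) -> float:
    """Root-mean-square energy of a frame (0.0 for an empty frame)."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def speech_frames(frames: Iterable[Frame], threshold: float = 0.02) -> Iterator[Frame]:
    """Yield only the frames whose energy reaches the threshold.

    This is a toy energy-based detector; the threshold is an
    illustrative value, not a tuned one.
    """
    for frame in frames:
        if rms(frame) >= threshold:
            yield frame


# Synthetic input: silence, a 440 Hz tone standing in for speech, silence.
sample_rate = 16000
frame_len = 320  # 20 ms at 16 kHz
silence = [0.0] * frame_len
tone = [0.5 * math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(frame_len)]

stream = [silence, tone, tone, silence]
to_send = list(speech_frames(stream))
print(f"{len(to_send)} of {len(stream)} frames would be transmitted")
```

Only the frames that pass the gate would be forwarded to the transcription model, so silent stretches cost neither bandwidth nor inference time.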

In parallel, Collabora employed a fine-tuned Mistral model for the NLP component. Renowned for its efficiency and versatility, Mistral is six times faster than the Llama 2 70B model and matches or outperforms it across benchmarks. The model also supports multiple languages and has inherent coding capabilities.

As we look to WhisperLive's promising possibilities, our Machine Learning Lead, Marcus Edel, puts it best:

"The future of customer interaction lies in the harmonious fusion of sophisticated AI and powerful communication technologies. As we continue our mission and build fully in the open, WhisperLive, and now the award-nominated WhisperFusion, are poised to make an impact in the communication technology landscape."

To learn more about how this project came to life, take a look at our case study.

If you're eager to implement your own transcription service, please get in touch! Our machine learning team is ready to assist you with your AI needs.
