Be a part of our every day and weekly newsletters for the newest updates and unique content material on industry-leading AI protection. Study Extra
Gladia, an AI transcription and audio intelligence supplier, has raised $16 million in funding.
The Paris, France-based firm will use the funding to develop an end-to-end audio infrastructure – beginning with a brand new real-time audio transcription and analytics engine – enabling voice-first platforms to ship extra worth to their customers throughout borders with cutting-edge AI.
It’s a problem to rivals resembling Otter.ai and Fireflies.ai, in addition to different AI-based companies that transcribe voice conversations to textual content. In an interview with VentureBeat, CEO Jean-Louis Quéguiner defined to me why he began the corporate.
“As you can hear from a beautiful French accent, I’m not an English speaker and I was extremely frustrated with the accents,” Quéguiner stated. “That’s why I founded the company.”
I bought a demo of the AI transcription, and it labored in actual time as Quéguiner spoke English together with his heavy French accent. I’m used to companies like Otter getting plenty of phrases improper in a transcription, however within the first web page of outcomes from Gladia, I noticed no errors. He additionally confirmed how he might communicate two totally different languages and the system might shift from one language to a different as wanted.
XAnge led the spherical, with participation by Illuminate Monetary, XTX Ventures, Athletico Ventures, Gaingels, Mana Ventures, Motier Ventures, Roosh Ventures, and Soma Capital.
Based in 2022, Gladia has now raised a complete of $20.3 million, with earlier seed investments headed by New Wave, Sequoia Capital (as a part of the First Sequoia Arc program), Cocoa, and GFC. Gladia not too long ago was chosen to take part within the AWS generative AI accelerator program.
“Gladia represents the qualities we like to champion at XAnge: a bold, global tech team at the forefront of AI innovation, with a proven business model to unlock new opportunities across industries,” stated Alexis du Peloux, associate at XAnge, in a press release. “In a fast-paced AI environment, Jean-Louis Quéguiner and his team have executed extremely well, and we are proud to back Gladia for the Series A.”
Given that the majority speech recognition fashions at the moment are skilled predominantly on English audio knowledge and are subsequently inherently biased, Gladia prioritized constructing the primary real-time product that’s actually multilingual.
The brand new fine-tuned engine delivers superior real-time transcription in over 100 languages, together with enhanced assist for accents and the distinctive skill to adapt to totally different languages on the fly.
Gladia’s new engine is exclusive in its skill to extract insights from a name—just like the caller’s sentiment, key data, and dialog abstract—in real-time. This implies it takes lower than a second to generate each transcript and insights from a name or assembly utilizing Gladia.
New real-time AI transcription
Constructing an correct, low-latency, and multilingual engine in-house is a fancy and resource-intensive activity. It requires in depth experience in language understanding, real-time knowledge dealing with, with steady optimization and upkeep. Actual-time fashions require extra computing energy and will battle to provide correct output instantly as a result of restricted context.
Gladia’s new product permits corporations to bypass these challenges. The true-time speech-to-text engine boasts an industry-leading latency of underneath 300 milliseconds with out compromising accuracy, whatever the language, geography, or tech stack used.
“Companies are spending valuable time and resources trying to incorporate multiple AI functions into their existing platforms,” stated Jonathan Soto, CTO of Gladia, in a press release. “Our single API is compatible with all existing tech stacks and protocols, including SIP, VoIP, FreeSwitch, and Asterisk. This allows us to easily integrate real-time transcription and analysis into our customers’ AI platforms, so they can focus on delivering the best services to their end users.”
What’s forward
The corporate’s first async transcription and audio intelligence API launched in June 2023 and was primarily based on a proprietary model of Whisper ASR.
It quickly gained traction within the enterprise market, significantly with assembly recorders and note-taking assistants. The API is now adopted by over 600 clients around the globe, together with Consideration, Circleback, Technique Monetary, Recall, Sana, and VEED.IO and has greater than 70,000 customers.
“Gladia’s technology allows companies in vertical markets that need cutting-edge real-time transcription, including sales enablement and contact center platform, to shift seamlessly from manual post-call processing to proactive, low-latency workflows,” Quéguiner stated. “Whether it’s automated CRM enrichment or real-time guidance for support agents, Gladia is designed to help businesses operate smarter and more efficiently in record time, without requiring AI expertise in-house.”
Gladia will use the brand new capital to advance its R&D efforts and shortly deliver to market a one-stop AI toolkit for audio and increase its product providing with further à la carte fashions—together with giant language fashions (LLMs) and retrieval-augmented era (RAG). With a number of design companions within the contact-center-as-a-service (CCaaS) section, the corporate is at the moment piloting an agent-assist resolution powered by Gladia’s real-time AI engine. Moreover, Gladia will proceed to increase its expertise base because it prepares for worldwide enlargement.
“We are multilingual, and we have something that is called ‘code switching,’ which makes it unique,” Quéguiner stated. “You can start with the language and switch to another.”
He went on to indicate me that he might begin a name in English and provoke the transcription. Then he spoke French phrases, and the mannequin appropriately translated it in French.
“Keep in mind that [others] are not real time right now, and this one is real time,” he stated. “Usually, real time is a little bit less accurate. You can also have your own custom vocabulary in real time, which is pretty unusual, with us. We have the capability to extract some real-time insights.”
The service has an AI summarizer, and it’ll have new elective options within the coming months. Quéguiner stated that his service also can get acronyms proper and detect the swap to a different language.
“The mannequin we use is similar to LLMs (giant language fashions). It has no code decoder structure, which isn’t the case for many of the fashions that you simply’ve seen with Fireflies, for example.
The market consists of “meeting recorders,” Quéguiner stated. The outcomes may be handed on to real-time insights, which will help folks like gross sales leads shut offers sooner.
The corporate additionally works with Name Facilities, giving them 30% sooner time to completion when they’re on the cellphone thanks to raised accuracy. The corporate will cost a flat payment resembling a per-hour pricing.