Meta has released an “open” implementation of the viral generate-a-podcast feature in Google’s NotebookLM.
Called NotebookLlama, the project uses Meta’s own Llama models for much of the processing, unsurprisingly. Like NotebookLM, it can generate back-and-forth, podcast-style digests of text files uploaded to it.
NotebookLlama first creates a transcript from a file, e.g. a PDF of a news article or blog post. Then, it adds “more dramatization” and interruptions before feeding the transcript to open text-to-speech models.
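To make that three-stage flow concrete, here is a minimal Python sketch. It is not NotebookLlama's actual code: the class name, method names, prompts, and the stand-in `fake_llm` and `fake_tts` functions are all hypothetical, and the language model and text-to-speech engine are passed in as plain callables so the example runs without any real model.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical interfaces (assumptions, not NotebookLlama's API):
# an LLM is any function from a prompt to text, and a TTS engine is
# any function from (speaker, line) to audio bytes.
LLM = Callable[[str], str]
TTS = Callable[[str, str], bytes]


@dataclass
class PodcastPipeline:
    llm: LLM
    tts: TTS

    def write_transcript(self, source_text: str) -> str:
        # Stage 1: turn text extracted from the file into a transcript.
        return self.llm("Write a two-host podcast transcript about:\n" + source_text)

    def dramatize(self, transcript: str) -> List[Tuple[str, str]]:
        # Stage 2: request a livelier rewrite with interruptions, then
        # parse "Speaker: line" rows into (speaker, line) pairs.
        rewritten = self.llm("Add dramatization and interruptions:\n" + transcript)
        turns = []
        for row in rewritten.splitlines():
            speaker, sep, line = row.partition(":")
            if sep and line.strip():
                turns.append((speaker.strip(), line.strip()))
        return turns

    def synthesize(self, turns: List[Tuple[str, str]]) -> List[bytes]:
        # Stage 3: hand each turn to the text-to-speech engine.
        return [self.tts(speaker, line) for speaker, line in turns]

    def run(self, source_text: str) -> List[bytes]:
        return self.synthesize(self.dramatize(self.write_transcript(source_text)))


# Stand-ins so the sketch runs offline; real models would replace these.
def fake_llm(prompt: str) -> str:
    return "Host A: Welcome back.\nHost B: Wait, hold on!"


def fake_tts(speaker: str, line: str) -> bytes:
    return f"[{speaker}] {line}".encode()


clips = PodcastPipeline(fake_llm, fake_tts).run("Some article text")
```

Keeping the stages separate means a stronger text-to-speech model could be swapped in without touching the transcript steps.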
The results don’t sound nearly as good as NotebookLM’s. In the NotebookLlama samples I’ve listened to, the voices have a very obviously robotic quality to them, and tend to talk over each other at odd points.
But the Meta researchers behind the project say that the quality could be improved with stronger models.
“The text-to-speech model is the limitation of how natural this will sound,” they wrote on NotebookLlama’s GitHub page. “[Also,] another approach of writing the podcast would be having two agents debate the topic of interest and write the podcast outline. Right now we use a single model to write the podcast outline.”
NotebookLlama isn’t the first attempt to replicate NotebookLM’s podcast feature. Some projects have had more success than others. But none, not even NotebookLM itself, has managed to solve the hallucination problem that dogs all AI. That is to say, AI-generated podcasts are bound to contain some made-up stuff.