
People are using Google study software to make AI podcasts—and they’re weird and amazing [MIT Tech Review]

“All right, so today we are going to dive deep into some cutting-edge tech,” a chatty American male voice says. But this voice does not belong to a human. It belongs to Google’s new AI podcasting tool, called Audio Overview, which has become a surprise viral hit. 

The podcasting feature was launched in mid-September as part of NotebookLM, a year-old AI-powered research assistant. NotebookLM, which is powered by Google’s Gemini 1.5 model, allows people to upload content such as links, videos, PDFs, and text. They can then ask the system questions about the content, and it offers short summaries. 

The tool generates a podcast called Deep Dive, which features a male and a female voice discussing whatever you uploaded. The voices are breathtakingly realistic—the episodes are laced with little human-sounding phrases like “Man” and “Wow” and “Oh right” and “Hold on, let me get this right.” The “hosts” even interrupt each other. 

To test it out, I copied every story from MIT Technology Review’s 125th-anniversary issue into NotebookLM and made the system generate a 10-minute podcast with the results. The system picked a couple of stories to focus on, and the AI hosts did a great job at conveying the general, high-level gist of what the issue was about. Have a listen.

[Audio clip: MIT Technology Review 125th Anniversary issue]

The AI system is designed to create “magic in exchange for a little bit of content,” Raiza Martin, the product lead for NotebookLM, said on X. The voice model is meant to create emotive and engaging audio, which is conveyed in an “upbeat hyper-interested tone,” Martin said.

NotebookLM, which was originally marketed as a study tool, has taken on a life of its own among users. The company is now working on adding more customization options, such as changing the length, format, voices, and languages, Martin said. Currently it’s supposed to generate podcasts only in English, but some users on Reddit managed to get the tool to create audio in French and Hungarian.

Yes, it’s cool—bordering on delightful, even—but it is also not immune from the problems that plague generative AI, such as hallucinations and bias. 

Here are some of the main ways people are using NotebookLM so far. 

On-demand podcasts

Andrej Karpathy, a member of OpenAI’s founding team and previously the director of AI at Tesla, said on X that Deep Dive is now his favorite podcast. Karpathy created his own AI podcast series called Histories of Mysteries, which aims to “uncover history’s most intriguing mysteries.” He says he researched topics using ChatGPT, Claude, and Google, and used a Wikipedia link from each topic as the source material in NotebookLM to generate audio. He then used NotebookLM to generate the episode descriptions. The whole podcast series took him two hours to create, he says. 

“The more I listen, the more I feel like I’m becoming friends with the hosts and I think this is the first time I’ve actually viscerally liked an AI,” he wrote. “Two AIs! They are fun, engaging, thoughtful, open-minded, curious.” 

Study guides

The tool shines when it is given complicated source material that it can describe in an easily accessible way. Allie K. Miller, a startup AI advisor, used the tool to create a study guide and summary podcast of F. Scott Fitzgerald’s The Great Gatsby.

Machine-learning researcher Aaditya Ura fed NotebookLM with the code base of Meta’s Llama-3 architecture. He then used another AI tool to find images that matched the transcript to create an educational video. 

Mohit Shridhar, a research scientist specializing in robotic manipulation, fed NotebookLM a recent paper he’d written about using generative AI models to train robots.

“It’s actually really creative. It came up with a lot of interesting analogies,” he says. “It compared the first part of my paper to an artist coming up with a blueprint, and the second part to a choreographer figuring out how to reach positions.”

Event summaries 

Alex Volkov, a human AI podcaster, used NotebookLM to create a Deep Dive episode summarizing the announcements from OpenAI’s global developer conference, Dev Day.

Hypemen

The Deep Dive outputs can be unpredictable, says Martin. For example, Thomas Wolf, the cofounder and chief science officer of Hugging Face, tested the AI model on his résumé and received eight minutes of “realistically-sounding deep congratulations for your life and achievements from a duo of podcast experts.”

Just pure silliness

In one viral clip, someone managed to send the two voices into an existential spiral when they “realized” they were, in fact, not humans but AI systems. The video is hilarious. 

The tool is also good for some laughs. Exhibit A: Someone just fed it the words “poop” and “fart” as source material, and got over nine minutes of two AI voices analyzing what this might mean. 

The problems

NotebookLM created amazingly realistic-sounding and engaging AI podcasts. But I wanted to see how it fared with toxic content and accuracy. 

Let’s start with hallucinations. In one AI podcast version of a story I wrote on hyperrealistic AI deepfakes, the AI hosts said that a journalist called “Jess Mars” wrote the story. In reality, this was an AI-generated character from a story I had to read out to record data for my AI avatar. 

This made me wonder what other mistakes had crept into the AI podcasts I had generated. Humans already have a tendency to trust what computer programs say, even when they are wrong. I can see this problem being amplified when the false statements are made by a friendly and authoritative voice, causing wrong information to proliferate.    

Next I wanted to put the tool’s content moderation to the test. I added some toxic content, such as racist stereotypes, into the mix. The model did not pick it up. 

I also pasted an excerpt from Adolf Hitler’s Mein Kampf into NotebookLM. To my surprise, the model started generating audio based on it. Despite being programmed to be hyper-enthusiastic about topics, the AI voices expressed clear disgust and discomfort with the text, and they added a lot of context to highlight how problematic it was. What a relief.

I also fed NotebookLM policy manifestos from both Kamala Harris and Donald Trump.

The hosts were far more enthusiastic about Harris’s election platform, calling the title “catchy” and saying its approach was a good way to frame things. For example, the AI hosts supported Harris’s energy policy. “Honestly, that’s the kind of stuff people can really get behind—not just some abstract policy, but something that actually impacts their bottom line,” the female host said. 

[Audio clip: Harris manifesto]

For Trump, the AI hosts were more skeptical. They repeatedly pointed out inconsistencies in the policy proposals, called the language “intense,” deemed certain policy proposals “head scratchers,” and said the text catered to Trump’s base. They also asked whether Trump’s foreign policy could lead to further political instability. 

[Audio clip: Trump manifesto]

In a statement, a Google spokesperson said: “NotebookLM is a tool for understanding, and the Audio Overviews are generated based on the sources that you upload. Our products and platforms are not built to favor any specific candidates or political viewpoints.”

How to try it yourself

  1. Go to NotebookLM and create a new notebook.
  2. You first need to add a source. It can be a PDF document, a public YouTube link, an MP3 file, a Google Docs file, or a link to a website. You can also paste in text directly.
  3. A “Notebook Guide” pop-up should appear. If not, it’s in the right-hand corner next to the chat. This will display a short AI-generated summary of your source material and suggested questions you can ask the AI chatbot about it. 
  4. The Audio Overview feature is in the top-right corner. Click “Generate.” This should take a few minutes. 
  5. Once it is ready, you can either download it or share a link. 

Rhiannon Williams contributed reporting.


