OpenAI Amps Up ChatGPT’s Voice Features, for a Select Few [CNET]

Access to advanced voice-enabled functionality will expand throughout 2024.

Lisa joined CNET after more than 20 years as a reporter and editor. Career highlights include a 2020 story about problematic brand mascots, which preceded historic name changes, and going viral in 2021 after daring to ask, “Why are cans of cranberry sauce labeled upside-down?” She has interviewed celebrities like Serena Williams, Brian Cox and Tracee Ellis Ross. Anna Kendrick said her name sounds like a character from Beverly Hills, 90210. Rick Astley asked if she knew what Rickrolling was. She lives outside Atlanta with her son, two golden retrievers and two cats.

Expertise Technology | AI | Advertising | Retail

AI company OpenAI is beginning to roll out new voice features for its ChatGPT chatbot to a small number of ChatGPT Plus subscribers in an early alpha trial, it said on X on Tuesday.

The startup previewed advanced voice mode during its Spring Update in May, where it also debuted its GPT-4o model.

Users with access have jumped on social media to share their initial experiences, which include getting help with French pronunciations, mimicking an airline pilot speaking from the cockpit and imitating seven US regional dialects. The New York and Midwestern accents could use a little work, but the chatbot knows that New Yorkers fold their pizza.

OpenAI isn’t alone in its ambitions for chatbot voice functionality for subscribers who pay $20 per month for perks like early access. Google, too, shared its plans for a more conversational Gemini chatbot via its Gemini Live feature for Gemini Advanced subscribers, who also pay $20 per month. Meta’s Meta AI chatbot can also chat with users who are wearing its Ray-Ban glasses.

This is one example of how technology companies continue to roll out new models and features in an appeal to users that is also an ongoing game of one-upmanship. The prize? The biggest share of the generative AI market, which is projected to be worth $1.3 trillion by 2032.

Hey, ChatGPT

According to OpenAI, advanced voice mode allows you to have more natural, real-time conversations with ChatGPT. It also senses and responds to your emotions, and you can interrupt if you want.

You can call up ChatGPT with a familiar phrase: “Hey, ChatGPT.”

Beyond that, details about what exactly this advanced functionality includes are unclear. A spokesperson didn’t respond to a request for comment.

Subscribers in the alpha test will receive a notice in the ChatGPT app, along with an email with instructions about how to use it. The goal of the early trial is to monitor usage and improve the model’s capabilities and safety prior to wider rollout, a spokesperson said in an earlier email.

OpenAI will expand access to additional subscribers over the next few weeks and plans to offer advanced voice functionality to all Plus members in the fall. In addition to early access to new features, Plus members also receive an always-on connection and unlimited access to GPT-4o. (If you use the free version, you’ll be bumped down to the earlier GPT-3.5 model if you ask too many questions or if traffic is high.)

ChatGPT first introduced voice functionality in September 2023.

Advanced voice mode will include four preset voices: Breeze, Cove, Ember and Juniper, which OpenAI developed with voice actors in 2023. There was originally a fifth voice, Sky, but it was paused after actor Scarlett Johansson, who voiced the virtual assistant Samantha in the 2013 movie Her, complained about similarities to her own voice.

CEO Sam Altman released a statement apologizing to Johansson but said the voice wasn’t meant to resemble hers.

In a related blog post, OpenAI said it picked the voice actors for its voices based on finding talent from diverse backgrounds, as well as voices that feel timeless, voices that are approachable and trustworthy, voices that are warm, engaging and charismatic, and voices that are natural and easy to listen to.

OpenAI said ChatGPT can’t impersonate voices, and it has added filters that will block requests to generate copyrighted audio.
