Conducts a voice dialog like a real person: OpenAI introduces ChatGPT-4o

Kyiv • UNN

May 14, 2024, 01:07 AM

OpenAI has introduced GPT-4o, a new model that can hold real-time voice conversations with an average response time of 320 milliseconds, comparable to a human's.

OpenAI has introduced GPT-4o, a new language model that works with audio, images, and text in real time. The company announced this on its blog, UNN reports.

Details

Prior to GPT-4o, voice conversations with ChatGPT had an average delay of 2.8 seconds (GPT-3.5) and 5.4 seconds (GPT-4). The new model cuts this to an average of 320 milliseconds, which is comparable to human response time in a conversation.
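To put these figures in perspective, the short sketch below computes how many times lower the new average delay is. It uses only the three numbers quoted above; the dictionary and function names are illustrative and not taken from OpenAI.

```python
# Average voice-response delays quoted in the article, in seconds.
LATENCY_S = {"GPT-3.5": 2.8, "GPT-4": 5.4, "GPT-4o": 0.32}


def speedup(old_model: str, new_model: str = "GPT-4o") -> float:
    """Return how many times lower the new model's average delay is."""
    return LATENCY_S[old_model] / LATENCY_S[new_model]


for model in ("GPT-3.5", "GPT-4"):
    print(f"{model} -> GPT-4o: about {speedup(model):.1f}x lower average delay")
```

By this calculation the average delay is roughly 9 times lower than with GPT-3.5 and roughly 17 times lower than with GPT-4.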

OpenAI hopes that this product will be a step towards a more natural interaction between the user and the computer. GPT-4o can also act as a fast voice translator between interlocutors speaking different languages.

Addendum

The earlier Voice Mode worked as a pipeline of three separate models: one simple model transcribed audio to text, GPT-3.5 or GPT-4 took the text and output text, and a third simple model converted that text back to audio. GPT-4o instead handles text, images, and audio in a single model. In addition, compared to existing language models, GPT-4o is better at understanding images and audio.
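The three-model voice pipeline described above can be sketched as three chained functions. This is only a toy illustration: the function names are hypothetical stand-ins, not OpenAI APIs, and each stage is replaced with a trivial placeholder so the example runs on its own.

```python
# Toy sketch of a three-stage voice pipeline.
# All three functions are hypothetical placeholders, not real OpenAI calls.

def transcribe(audio: bytes) -> str:
    """Stage 1: speech-to-text (placeholder: decode the bytes as UTF-8 text)."""
    return audio.decode("utf-8")


def generate_reply(text: str) -> str:
    """Stage 2: text-in, text-out language model (placeholder: canned echo)."""
    return f"You said: {text}"


def synthesize(text: str) -> bytes:
    """Stage 3: text-to-speech (placeholder: encode the text back to bytes)."""
    return text.encode("utf-8")


def voice_pipeline(audio: bytes) -> bytes:
    """Run the stages in sequence; the middle stage only ever receives text."""
    return synthesize(generate_reply(transcribe(audio)))


reply = voice_pipeline(b"hello")
print(reply.decode("utf-8"))  # prints "You said: hello"
```

Because the stages run one after another, their individual delays add up, which is one plausible reason a pipeline of this kind responds more slowly than a single model.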

The new technology will be rolled out gradually over the coming weeks. Separately, the company will release a desktop application with new features.

Unlike GPT-4 Turbo, GPT-4o is available free of charge, but paying users will have access to more features.


Lilia Podolyak
