OpenAI has introduced a new language model, GPT-4o, that works with audio, images, and text in real time. The company announced this on its blog, UNN reports.
Details
Prior to GPT-4o, voice conversations with ChatGPT had an average delay of 2.8 seconds with GPT-3.5 and 5.4 seconds with GPT-4. The new model cuts this to an average of 320 milliseconds, comparable to human response time in a conversation.
OpenAI hopes this product will be a step toward more natural interaction between users and computers. GPT-4o can also act as a fast voice translator between speakers of different languages, as sketched below.
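As a rough, text-only illustration of that translator use case, the following sketch drives the translation with a system prompt; the gpt-4o model name and the prompt wording are assumptions for demonstration, not OpenAI's own setup:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def interpret(utterance: str) -> str:
    """Relay one utterance between English and Spanish (hypothetical example)."""
    completion = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system",
             "content": "You are a live interpreter between two people. "
                        "If the input is in English, say it in Spanish; "
                        "if it is in Spanish, say it in English."},
            {"role": "user", "content": utterance},
        ],
    )
    return completion.choices[0].message.content

print(interpret("¿Dónde está la estación de tren?"))
```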
Addendum
Voice mode has so far worked as a pipeline of three separate models: a simple model transcribes the audio to text, GPT-3.5 or GPT-4 takes that text and outputs a text reply, and a third simple model converts the reply back to audio (a sketch of this pipeline follows below). In addition, compared to existing language models, GPT-4o is better at understanding images and audio.
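For illustration, here is a minimal sketch of such a three-stage pipeline built on OpenAI's public API, assuming the openai Python package (v1.x); the file names and the whisper-1 / gpt-4 / tts-1 model choices are assumptions for demonstration, not the internals of ChatGPT's voice mode:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Stage 1: transcribe the user's speech to text.
with open("question.wav", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )

# Stage 2: a text-in, text-out language model writes the reply.
completion = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": transcript.text}],
)
reply = completion.choices[0].message.content

# Stage 3: synthesize the reply back to speech.
speech = client.audio.speech.create(
    model="tts-1",
    voice="alloy",
    input=reply,
)
with open("reply.mp3", "wb") as f:
    f.write(speech.content)
```

Each hop adds latency, and the middle model never sees tone, multiple speakers, or background sounds; that lost information and overhead is what GPT-4o's single end-to-end model is meant to eliminate.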
The new technology will be rolled out gradually over the coming weeks. Separately, the company will release a desktop application with new features.
Unlike GPT-4 Turbo, the new model will be free to use, though paying users will have access to additional features.