OpenAI unveils huge upgrade to ChatGPT that makes it more eerily human than ever

A picture of a phone screen displaying GPT-4o in front of OpenAI's logo. (Image credit: Future)

A new version of ChatGPT can read facial expressions, mimic human voice patterns and have near real-time conversations, its creators have revealed.

OpenAI demonstrated the upcoming version of the artificial intelligence (AI) chatbot, called GPT-4o, in an apparently real-time presentation on Monday (May 13). The chatbot, which spoke out loud with presenters through a phone, appeared to have an eerie command of human conversation and its subtle emotional cues — switching between robotic and singing voices upon command, adapting to interruptions and visually processing the facial expressions and surroundings of its conversational partners.

During the demonstration, the AI voice assistant showcased its skills by completing tasks such as real-time language translation, solving a math equation written on a piece of paper and guiding a blind person around London's streets.

"her," Sam Altman, OpenAI's CEO, wrote in a one-word post on the social media platform X after the presentation had ended. The post is a reference to the 2013 film of the same name, in which a lonely man falls in love with an AI assistant.

To show off its ability to read visual cues, the chatbot used the phone's camera lens to read one OpenAI engineer's facial expressions and describe their emotions.

"Ahh, there we go, it looks like you're feeling pretty happy and cheerful with a big smile and a touch of excitement," said the bot, which answered to the name ChatGPT. "Whatever is going on, it looks like you're in a good mood. Care to share the source of those good vibes?"

If the demonstration is an accurate representation of the bot's abilities, the new capabilities are a massive improvement on the limited voice features in the company's previous models — which were incapable of handling interruptions or responding to visual information.