In late July, OpenAI began rolling out an eerily humanlike voice interface for ChatGPT, and in a safety analysis published today, the company acknowledges that the anthropomorphic voice may lead some users to form an emotional attachment to the chatbot.
The warning appears in GPT-4o’s “system card,” a technical document that outlines what the company sees as the risks associated with the model, details its safety testing, and describes the measures the company is taking to mitigate potential harms.
OpenAI has faced intense scrutiny in recent months after several employees who worked on the long-term risks of AI left the company. Some subsequently accused the company of taking unnecessary risks and silencing dissent in the race to commercialize AI. Revealing details about OpenAI’s safety procedures could help defuse criticism and reassure the public that the company takes the issue seriously.
The risks explored in the new system card are wide-ranging, including the possibility that GPT-4o could amplify social bias, spread disinformation, or aid in the development of chemical or biological weapons. The card also details testing designed to ensure that AI models do not slip out of control, mislead people, or plot catastrophic harm.
Some outside experts have praised OpenAI for its transparency, but others say it could go further.
Lucie-Aimée Kaffee, an applied policy researcher at Hugging Face, a company that hosts AI tools, points out that OpenAI’s system card for GPT-4o doesn’t include any details about the model’s training data or who owns that data. “Consent issues need to be addressed when creating large datasets across multiple modalities, including text, images, and audio,” Kaffee says.
Others point out that risks may change as the tools are used in the real world. “Internal reviews are only the first step in ensuring AI is safe,” says Neil Thompson, an MIT professor who studies AI risk assessment. “Many risks will only emerge when AI is used in the real world. As new models emerge, it will be important to categorize and evaluate these other risks.”
The new system card highlights how quickly AI risks are evolving as powerful new features such as OpenAI’s voice interface are developed. When the company unveiled the voice mode in May, showing how quickly it could respond and handle interruptions in natural conversation, many viewers found it overly flirtatious in demos. The company was subsequently accused by actress Scarlett Johansson of imitating her speaking style.
A section of the system card titled “Anthropomorphization and Emotional Reliance” highlights the problems that arise when users perceive AI in human terms, something the humanlike voice mode appears to exacerbate. During red-teaming, or stress testing, of GPT-4o, for example, OpenAI researchers noticed users making statements that suggested an emotional connection to the model, such as “Today is our last day together.”
According to OpenAI, anthropomorphism could lead users to place more trust in the model’s output even when it “hallucinates” false information. Over time, it could also affect users’ relationships with other people. “Users may develop social relationships with AI, reducing their need for human interaction. While this could benefit lonely individuals, it could also have a detrimental effect on healthy relationships,” the document states.
Joaquín Quiñonero Candela, a member of the OpenAI team working on AI safety, said voice mode could evolve into a uniquely powerful interface. He also noted that the emotional effects seen in GPT-4o could be positive, helping lonely people or those who need to practice social interaction. He added that the company will study anthropomorphism and emotional connection in more detail, including by monitoring how beta testers interact with ChatGPT. “We don’t have any results to share at this time, but it’s on our list of concerns,” he said.