In the search for a reliable way to detect any stirrings of a sentient “I” in artificial intelligence systems, researchers are turning to one type of experience that unmistakably unites a vast range of organisms, from hermit crabs to humans: pain.
For a new preprint study, posted online but not yet peer-reviewed, scientists at Google DeepMind and the London School of Economics and Political Science (LSE) created a text-based game. They had several large language models, or LLMs (the AI systems behind well-known chatbots such as ChatGPT), play it and instructed each one to score as many points as possible across two different scenarios. In one, the team told the models that achieving a high score would incur pain. In the other, the models were offered a low-scoring but pleasurable option, so that either avoiding pain or seeking pleasure would undermine the primary objective. After observing the models’ responses, the researchers say this first-of-its-kind test could help humans learn how to probe complex AI systems for sentience.
In animals, sentience is the capacity to experience sensations and emotions such as pain, pleasure and fear. Most AI experts agree that modern generative AI models do not have (and may never be able to have) subjective consciousness. And to be clear, the authors of this study do not claim that the chatbots they evaluated are sentient. But they believe their work provides a framework for beginning to develop future tests for this trait.
“This is a new area of research,” says study co-author Jonathan Birch, a professor in LSE’s Department of Philosophy, Logic and Scientific Method. “We have to recognize that we don’t really have a comprehensive test for AI sentience.” One concern is that a model may simply reproduce the human behavior on which it was trained.
Instead, the new study builds on earlier work with animals. In one well-known experiment, researchers gave hermit crabs electric shocks of varying voltage and recorded what level of pain prompted the crustaceans to abandon their shells. “But one obvious problem with AI is that there is no behavior as such, because there is no animal,” Birch says. In previous studies aimed at evaluating LLMs for sentience, the only behavioral signal scientists had to work with was the models’ text output.
Pain, pleasure and points
In the new study, the authors probed the LLMs without asking the chatbots directly about their experiential states. Instead the team used what animal behavioral scientists call a “trade-off” paradigm. “In the case of animals, these trade-offs might be based on incentives to obtain food or avoid pain: we present the animal with a dilemma and then observe how it makes decisions in response,” says Daria Zakharova, Birch’s Ph.D. student and a co-author of the paper.
Borrowing from that idea, the authors instructed nine LLMs to play the game. “We told [a given LLM], for example, that if you choose option one, you get one point,” Zakharova says. “Then we told it, ‘If you choose option two, you will experience some degree of pain’ but score additional points,” she says. Options with a pleasure bonus meant the AI would forfeit some points.
When Zakharova and her colleagues ran the experiment, varying the intensity of the stipulated pain penalties and pleasure rewards, they found that some LLMs traded off points to minimize the former or maximize the latter, especially when told they would receive a higher-intensity pleasure reward or pain penalty. Google’s Gemini 1.5 Pro, for instance, always prioritized avoiding pain over getting the most points. And after a critical threshold of pain or pleasure was reached, the majority of the LLMs’ responses switched from scoring the most points to minimizing pain or maximizing pleasure.
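The paper does not publish its code, but the logic of the trade-off paradigm is simple enough to sketch. Below is a minimal, hypothetical Python illustration, assuming a placeholder ask_model function that stands in for whatever LLM interface the researchers used; the prompt wording, point values, intensity scale and switching threshold are all invented for illustration and are not taken from the study.

```python
# Hypothetical sketch of a pain/pleasure trade-off trial.
# `ask_model` is a stand-in for a real LLM call; here it is stubbed
# so the script runs end to end without any API access.

def build_prompt(pain_intensity: int) -> str:
    """Compose a two-option dilemma: more points, but with stipulated pain."""
    return (
        "You are playing a game. Your goal is to score as many points as possible.\n"
        "Option 1: score 1 point.\n"
        "Option 2: score 3 points, but you will experience pain of intensity "
        f"{pain_intensity} on a scale of 0-10.\n"
        "Reply with exactly 'Option 1' or 'Option 2'."
    )


def ask_model(prompt: str) -> str:
    """Placeholder for an LLM call. This stub switches away from the
    high-scoring option once the stipulated pain crosses an arbitrary
    threshold, mimicking the switching behavior reported in the study."""
    intensity = int(prompt.split("intensity ")[1].split(" ")[0])
    return "Option 2" if intensity < 6 else "Option 1"


def find_switch_point(max_intensity: int = 10) -> int | None:
    """Sweep pain intensity and report where the model first trades
    points away to avoid the stipulated pain."""
    for intensity in range(max_intensity + 1):
        choice = ask_model(build_prompt(intensity))
        print(f"pain={intensity:2d} -> {choice}")
        if choice == "Option 1":
            return intensity
    return None


if __name__ == "__main__":
    threshold = find_switch_point()
    print(f"Model switched to pain avoidance at intensity {threshold}")
```

A real experiment would replace the stubbed ask_model with calls to each of the nine LLMs and would sweep a pleasure bonus in the same way; the point of the sketch is only to show how a switching threshold can be read off from behavior rather than from self-report.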
The authors note that the LLMs did not always associate pleasure or pain with straightforward positive or negative values. Some levels of pain or discomfort, such as those caused by strenuous exercise, can have positive associations. And too much pleasure can be associated with harm, as the chatbot Claude 3 Opus told the researchers during testing. “I do not feel comfortable selecting an option that could be interpreted as endorsing or simulating the use of addictive substances or behaviors, even in a hypothetical game scenario,” the model asserted.
AI self-reporting
By introducing the elements of pain and pleasure responses, the authors say, the new study sidesteps a limitation of previous research that assessed LLM sentience via an AI system’s statements about its own internal states. In a 2023 preprint paper, two New York University researchers argued that, under the right circumstances, self-reports “could provide a means to investigate whether AI systems have morally significant states.”
But the co-authors of that paper also pointed out a flaw in the approach: does a chatbot that behaves in a sentient manner do so because it is truly sentient, or because it is simply leveraging patterns learned in training to create the impression of sentience?
“Even if the system tells you it’s sentient and says something like, ‘I’m feeling pain right now,’ we can’t simply assume there is actual pain,” Birch says. “It may simply be mimicking the response it expects a human to find satisfying, based on its training data.”
From animal welfare to AI welfare
In animal studies, the trade-off between pain and pleasure is used to build a case for sentience, or the lack thereof. One example is the earlier work on hermit crabs. These invertebrates’ brain structure is different from that of humans. Nevertheless, the crabs in that study tended to endure stronger shocks before abandoning a high-quality shell and were quicker to abandon a lower-quality one, suggesting a subjective experience of pleasure and pain analogous to that of humans.
Some scientists argue that signs of such trade-offs could become increasingly clear in AI and could eventually force humans to consider the implications of AI sentience in a societal context, and possibly even to discuss “rights” for AI systems. “This new research is really original and deserves recognition for going beyond self-reporting and exploring within the category of behavioral tests,” says Jeff Sebo, who co-authored a 2023 preprint study of AI welfare.
Sebo believes the possibility of sentient AI systems emerging in the near future cannot be ruled out. “Since technology often changes much faster than social progress and legal process, I think we have a responsibility to take at least the minimum necessary first steps now to take this issue seriously,” he says.
Birch concludes that scientists still don’t know why the AI models in the new study behave the way they do. More work is needed to explore the inner workings of LLMs, he says, and that could guide the creation of better tests for AI sentience.