The algorithms that underpin artificial intelligence systems like ChatGPT are unable to learn as they are used, forcing tech companies to spend billions of dollars training new models from scratch. This has been a concern in the industry for some time, and new research suggests the problem is inherent to how the models are designed – but there may also be a solution.
Most AI today is built on so-called neural networks, made up of processing units called artificial neurons and inspired by how the brain works. An AI typically goes through distinct stages during its development. First, the AI is trained: an algorithm fine-tunes its artificial neurons to better reflect a particular dataset. Then, the AI can be used to respond to new data, such as the text prompts entered into ChatGPT. However, once a model’s neurons are set in the training phase, they cannot be updated to learn from new data.
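This train-then-freeze life cycle can be sketched in a few lines of pure Python. This is a hypothetical single-neuron example, not any real system’s code: the weight is nudged toward the data during training, then frozen, so showing the model new inputs afterwards changes nothing.

```python
# Minimal sketch (illustrative, pure Python): one artificial neuron.
# During "training" its weight is nudged to fit a dataset; afterwards
# the weight is frozen, so new inputs cannot change what it learned.

def relu(x):
    return max(0.0, x)

def train(weight, data, lr=0.1, epochs=50):
    """Gradient descent on squared error so relu(weight * x) ~ target."""
    for _ in range(epochs):
        for x, target in data:
            pred = relu(weight * x)
            grad = 2 * (pred - target) * (x if weight * x > 0 else 0.0)
            weight -= lr * grad
    return weight

data = [(1.0, 2.0), (2.0, 4.0)]   # the neuron should learn y = 2x
w = train(0.5, data)              # training phase: w moves toward 2.0

# Inference phase: the frozen weight is applied to new inputs,
# but merely seeing them does not update it.
frozen = w
_ = relu(frozen * 3.0)
assert frozen == w   # no learning happens at inference time
```

The point of the sketch is the last three lines: real deployed models answer queries with exactly this kind of frozen parameter set.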
This means that most large AI models must be retrained from scratch when new data becomes available, which can be extremely costly – especially when the training dataset comprises a large portion of the entire internet.
Researchers have wondered whether these models might be able to incorporate new knowledge after initial training, reducing costs, but it was unclear whether this was possible.
Now, Shibhansh Dohare and his colleagues at the University of Alberta in Canada have tested whether the most common AI models can be adapted to learn continually. The team found that after being exposed to new data, a huge number of artificial neurons quickly lost the ability to learn anything new, becoming stuck at a value of zero.
“If you think of it like a brain, it’s like 90 percent of the neurons are dead,” Dohare says. “You don’t have enough neurons to learn with.”
Dohare and his team started by training their AI system on the ImageNet database, which consists of 14 million labeled images of simple objects such as houses and cats. But instead of training the AI once and then testing it, as is the standard approach, they repeatedly retrained the model on successive pairs of image classes, each time asking it to distinguish between the two.
The researchers tested different learning algorithms in this way and found that, after thousands of retraining cycles, the networks lost the ability to learn and their performance deteriorated, with many neurons becoming “dead” – that is, stuck outputting a value of zero.
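This “dead neuron” failure is easy to illustrate in miniature. The sketch below is hedged: the weights and inputs are invented, not taken from the study. Once a ReLU-style neuron’s pre-activation is negative for every input it sees, it outputs zero everywhere – and, crucially, the gradient through it is also zero, so ordinary gradient descent can never move it again.

```python
# Illustrative sketch of a "dead" ReLU neuron (numbers are made up).
# A ReLU unit outputs max(0, w*x + b). If training drives its bias far
# negative, its output is zero for every input in the data, and because
# the gradient of ReLU is also zero there, gradient descent can never
# update it again: the neuron is stuck, or "dead".

def relu(z):
    return max(0.0, z)

def relu_grad(z):
    return 1.0 if z > 0 else 0.0

w, b = 1.0, -10.0                # bias pushed far negative by earlier training
inputs = [0.5, 1.0, 2.0, 3.0]    # every pre-activation w*x + b is negative

outputs = [relu(w * x + b) for x in inputs]
grads = [relu_grad(w * x + b) for x in inputs]

print(outputs)  # [0.0, 0.0, 0.0, 0.0] -- the neuron is silent
print(grads)    # [0.0, 0.0, 0.0, 0.0] -- and receives no learning signal
```

With enough neurons in this state, the network simply runs out of capacity to fit anything new.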
The team also trained an AI to control a simulated ant-like robot learning to walk, using reinforcement learning – a common technique that tells an AI what success looks like and lets it figure out the rules through trial and error. They tried to adapt this approach for continual learning by retraining the algorithm as the robot walked on different surfaces, but found that this, too, led to a significant loss of learning ability.
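As a toy illustration of reinforcement learning itself – a two-armed bandit rather than the walking task, with all rewards and parameters invented for the sketch – an agent tries actions, receives a reward signal that defines success, and gradually shifts its choices toward whatever pays off:

```python
import random

# Toy reinforcement learning by trial and error (a two-armed bandit,
# not the ant-locomotion task; the reward numbers are hypothetical).
# The agent keeps a running value estimate per action, mostly exploits
# the best estimate, occasionally explores, and learns from rewards.

random.seed(1)
values = [0.0, 0.0]              # estimated average reward of each action

def true_reward(action):
    # Hidden environment: action 1 pays better on average.
    return random.gauss(0.2 if action == 0 else 1.0, 0.1)

for _ in range(500):
    if random.random() < 0.1:
        action = random.randrange(2)        # explore a random action
    else:
        action = values.index(max(values))  # exploit the best estimate
    reward = true_reward(action)
    values[action] += 0.1 * (reward - values[action])  # learn from reward

best = values.index(max(values))
print(best)  # the agent discovers that action 1 is the better one
```

In the continual-learning setting, each new walking surface effectively changes the reward landscape, which is where the dead-neuron problem bites.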
The problem appears to be inherent in the way these systems learn, Dohare says, but there is a workaround: the researchers developed an algorithm that randomly reinitializes some neurons after each training round, which seems to mitigate the performance degradation. “If [a neuron] dies, you can just revive it,” Dohare says. “And it can learn again.”
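The “revive it” idea might be sketched like this. This is a simplified illustration of the mechanism, not the researchers’ exact algorithm; the function name, activity measure, threshold and weight scale are all assumptions made for the sketch. After each training round, any neuron whose recent activity is essentially zero gets fresh random weights so it can participate in learning again.

```python
import random

# Simplified sketch of reviving dead neurons (not the study's actual
# algorithm; names and thresholds here are illustrative assumptions).
# After a training round, any neuron whose recorded activity is ~zero
# has its incoming weights reinitialized at random so it can learn again.

def revive_dead_neurons(weights, activity, threshold=1e-6, scale=0.1):
    """weights: one list of incoming weights per neuron.
    activity: e.g. mean absolute output of each neuron over recent data.
    Returns the number of neurons revived."""
    revived = 0
    for i, a in enumerate(activity):
        if a < threshold:                       # neuron is effectively dead
            weights[i] = [random.uniform(-scale, scale)
                          for _ in weights[i]]  # fresh random weights
            revived += 1
    return revived

random.seed(0)
weights = [[0.0, 0.0], [0.3, -0.2], [0.0, 0.0]]  # one row per neuron
activity = [0.0, 0.8, 0.0]                       # neurons 0 and 2 are dead

n = revive_dead_neurons(weights, activity)
print(n)  # 2 neurons revived; the live neuron's weights are untouched
```

The design point is selectivity: only silent units are reset, so the network keeps what it has already learned while regaining spare capacity.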
The algorithm looks promising, but needs to be tested on much larger systems before we can be sure it will help, says Mark van der Wilk of the University of Oxford.
“Solving continuous learning is literally a billion-dollar problem,” he says. “If you have a true comprehensive solution that allows you to continuously update your models, you can dramatically reduce the cost of training these models.”