In 2025, we will see real advances in our understanding of animal communication, as AI and machine learning are leveraged to answer a question that has perplexed humans for as long as we have existed: What are animals saying to each other? The recent Coller-Dolittle Prize, which awards up to $500,000 to scientists who "crack the code," is a sign of bullish confidence that recent technological developments in machine learning and large language models (LLMs) are putting this goal within our reach.
Many research groups have been working for years on algorithms to make sense of animal sounds. Project CETI, for example, has been deciphering the click sequences of sperm whales and the songs of humpback whales. But these modern machine learning tools require extremely large amounts of data, and until now, that quantity of high-quality, well-annotated data has simply not existed.
Consider that an LLM such as ChatGPT has training data encompassing virtually the entire text of the internet. Nothing like that has ever been available for animal communication. And it's not just that the human data corpus is many orders of magnitude larger than the wildlife data we have access to: more than 500 GB of text was used to train GPT-3, compared with just over 8,000 "codas" (or vocalizations) for Project CETI's recent analysis of sperm whale communication.
Furthermore, when dealing with human language, we already know what is being said. We even know what constitutes a "word." This is a huge advantage over interpreting animal communication, where scientists rarely know whether, for example, one wolf's howl means something different from another wolf's howl, or even whether the wolves consider a howl to be anything analogous to a "word" in human language.
Nevertheless, 2025 will bring new advances both in the amount of animal communication data available to scientists and in the types and capabilities of AI algorithms that can be applied to those data. Automated recording of animal sounds is now within easy reach of every scientific research group, with low-cost recording devices such as the AudioMoth exploding in popularity.
Massive datasets are now coming online, as recorders can be left in the field to listen to the calls of gibbons in the jungle or birds in the forest, 24/7, over long periods of time. Datasets this large were once impossible to manage manually. Now, new automatic detection algorithms based on convolutional neural networks can race through thousands of hours of recordings, picking out the animal sounds and clustering them into different types according to their natural acoustic characteristics.
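To make the idea concrete, here is a minimal sketch of that kind of pipeline: a small convolutional network scores spectrogram windows as "animal sound" versus background, and the detections are then clustered by their learned acoustic features. The model architecture, window length, threshold, and cluster count below are illustrative assumptions, not any project's actual implementation.

```python
# Hypothetical detect-and-cluster pipeline for long field recordings.
import numpy as np
import torch
import torch.nn as nn
from scipy.signal import spectrogram
from sklearn.cluster import KMeans

class CallDetector(nn.Module):
    """Tiny CNN mapping a spectrogram window to (detection score, embedding)."""
    def __init__(self, emb_dim=16):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        )
        self.embed = nn.Linear(16, emb_dim)
        self.score = nn.Linear(emb_dim, 1)

    def forward(self, x):
        h = self.features(x).flatten(1)
        z = self.embed(h)
        return torch.sigmoid(self.score(z)).squeeze(1), z

def detect_and_cluster(audio, sr, detector, win_s=2.0, thresh=0.5, n_clusters=5):
    """Slide over a recording, keep windows the CNN flags, cluster their embeddings."""
    win = int(win_s * sr)
    embeddings, times = [], []
    detector.eval()
    with torch.no_grad():
        for start in range(0, len(audio) - win, win):
            _, _, spec = spectrogram(audio[start:start + win], fs=sr, nperseg=256)
            x = torch.tensor(np.log1p(spec), dtype=torch.float32)[None, None]
            p, z = detector(x)
            if p.item() > thresh:                    # window likely contains a call
                embeddings.append(z.squeeze(0).numpy())
                times.append(start / sr)
    if not embeddings:
        return []
    labels = KMeans(n_clusters=min(n_clusters, len(embeddings)),
                    n_init=10).fit_predict(np.stack(embeddings))
    return list(zip(times, labels))                  # (time in seconds, call type)

if __name__ == "__main__":
    # Toy usage: an untrained detector on synthetic noise, just to show the data flow.
    sr = 16000
    audio = np.random.randn(sr * 60).astype(np.float32)
    detections = detect_and_cluster(audio, sr, CallDetector())
    print(f"{len(detections)} detections, e.g. {detections[:3]}")
```

In a real study the detector would of course be trained on annotated examples of the target species' calls; the point here is only the shape of the workflow: raw audio in, time-stamped and type-labeled detections out.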
The availability of these large animal datasets will enable new kinds of analytical algorithms, such as the use of deep neural networks to find hidden structure in sequences of animal vocalizations that may be analogous to the meaningful structure of human language.
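One hedged illustration of what "finding hidden structure" can mean in practice: train a small sequence model to predict the next call type in a series of clustered vocalizations, and compare its performance to chance. If it predicts much better than chance, the sequences carry statistical structure. The model, sizes, and toy data below are illustrative assumptions, not a published method.

```python
# Hypothetical probe for sequential structure in call-type sequences.
import torch
import torch.nn as nn

N_CALL_TYPES = 5  # e.g. cluster labels produced by a detection step

class NextCallPredictor(nn.Module):
    def __init__(self, n_types=N_CALL_TYPES, hidden=32):
        super().__init__()
        self.emb = nn.Embedding(n_types, hidden)
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_types)

    def forward(self, seq):                  # seq: (batch, time) of call-type ids
        h, _ = self.rnn(self.emb(seq))
        return self.out(h)                   # logits for the next call at each step

def train(model, sequences, epochs=50, lr=1e-2):
    """Predict call t+1 from calls up to t; returns the final training loss."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        logits = model(sequences[:, :-1])
        loss = loss_fn(logits.reshape(-1, N_CALL_TYPES),
                       sequences[:, 1:].reshape(-1))
        opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

if __name__ == "__main__":
    # Toy sequences with a built-in rule (each call type tends to follow the
    # previous one), so the model has genuine structure to discover.
    torch.manual_seed(0)
    seqs = torch.stack([(torch.arange(20) + s) % N_CALL_TYPES for s in range(64)])
    final_loss = train(NextCallPredictor(), seqs)
    chance = torch.log(torch.tensor(float(N_CALL_TYPES))).item()
    print(f"final loss {final_loss:.2f} vs chance level {chance:.2f}")
```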
But a fundamental question remains unresolved: What exactly do we hope to do with these animal sounds? Some organizations, such as Interspecies.io, set their goal very clearly: to "transform signals from one species into coherent signals for another species." In other words, to translate animal communication into human language. Yet most scientists agree that non-human animals do not have an actual language of their own, at least not in the way that we humans have language.
The Coller-Dolittle Prize is a little more sophisticated, looking for ways to "communicate with or decipher the communication of" living things. Decipherment is a slightly less ambitious goal than translation, allowing for the possibility that animals may not actually have a language that can be translated. Today, we don't know how much information animals convey to each other, or what that information is. In 2025, humanity will have the potential to leapfrog our understanding of not just how much animals say, but what exactly they are saying to each other.