
AI providers could enter into contracts with publishers to improve the accuracy of LLMs.

While many have declared the arrival of advanced generative AI to be the end of publishing as we know it, the past few weeks have brought developments in the AI shift that could actually deliver major benefits to publishers.

While AI tools and the large language models (LLMs) that power them can produce remarkably human-like results for both text and visuals, it’s becoming increasingly clear that the input data is crucial, and that more data is not necessarily better in this regard.

Take, for example, Google’s latest generative AI search component and the strange answers it’s sharing.

[Image: Google AI’s answer]

Google CEO Sundar Pichai has acknowledged that the company’s system has flaws, but in his view these are actually inherent in the design of the tool itself.

According to Pichai (via The Verge):

“It gets to the deeper point that hallucinations are an open question. In a way, that’s an inherent feature. That’s what makes these models so creative. (…) But the LLM is not necessarily the best approach to always get to the facts.”

But platforms like Google present these tools as systems where you can ask a question and get an answer, so when the answer is inaccurate, that’s a problem, not something that can be shrugged off as a random glitch that’s bound to happen from time to time.

That’s because, while the platforms themselves may be keen to temper expectations around accuracy, consumers already rely on these chatbots for exactly that: accurate answers.

In this respect, it’s somewhat surprising to see Pichai acknowledge that while AI tools can provide answers to searchers, they can’t be relied on for “facts.” But the real takeaway is that the focus will inevitably shift away from data at scale: for these systems to produce relevant and useful results, what matters is not just how much data they can incorporate, but how accurate that data is.

This is where journalism and other quality inputs come in handy.

OpenAI has already signed a deal with News Corp and has begun incorporating News Corp publications into its models, and Meta is reportedly considering doing the same. So while publications may well be losing traffic to AI systems that give searchers all the information they need within the search results page itself, or in chatbot responses, they might, in theory at least, be able to recoup some of those losses through data-sharing agreements designed to improve the quality of LLMs.

Such deals could also reduce the influence of questionable, partisan news providers by excluding their content from these models entirely. If OpenAI, for example, signed deals with all of the mainstream publishers, while filtering out the more “hot take” style, conspiracy-peddling outlets, the accuracy of ChatGPT’s responses would surely improve.

In this respect, it will be important to build accuracy into these models through partnerships with established, trusted providers, including academic publishers, government websites, scientific societies and so on, rather than ingesting the entire internet.

Google is already well positioned to do this, because its search algorithms already include filters designed to prioritize the best, most accurate sources of information. In theory, Google could refine its Gemini models to, say, exclude all sites that fall below a certain quality bar, and the models would improve immediately.
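Mechanically, that kind of source curation can be as simple as a filter pass over the training corpus. Here’s a minimal sketch, assuming each document is tagged with a source domain and a quality score; the domain list, field names, and threshold here are all hypothetical stand-ins, not any actual Gemini or OpenAI pipeline:

```python
# Minimal sketch of source-level corpus curation.
# ALLOWED_SOURCES, quality_score, and QUALITY_BAR are hypothetical
# stand-ins, not real pipeline parameters.

ALLOWED_SOURCES = {"example-news.com", "example-journal.org"}  # licensed/trusted domains
QUALITY_BAR = 0.8  # assumed site-quality threshold, on a 0-1 scale

def keep(doc: dict) -> bool:
    """Keep a document if it comes from a licensed source,
    or if its site clears the quality bar."""
    return doc["domain"] in ALLOWED_SOURCES or doc["quality_score"] >= QUALITY_BAR

corpus = [
    {"domain": "example-journal.org", "quality_score": 0.95, "text": "..."},
    {"domain": "hot-takes.example", "quality_score": 0.15, "text": "..."},
]

filtered = [doc for doc in corpus if keep(doc)]
print(f"Kept {len(filtered)} of {len(corpus)} documents")  # Kept 1 of 2 documents
```

The same allowlist idea covers the publisher-deal scenario above: licensed sources pass the filter regardless of score, while everything else has to clear the quality bar.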

Of course, there’s more to it than that, but the underlying concept is that we’ll increasingly see LLM developers moving away from building the biggest possible models and toward more refined, higher-quality inputs.

This could also be bad news for Elon Musk’s xAI platform.

xAI, which recently raised another $6 billion in funding, aims to build a “maximum truth-seeking” AI that isn’t bound by political correctness or censorship. To that end, xAI’s models are trained heavily on X posts, which may be an advantage in terms of timeliness, but less so in terms of accuracy.

Many false and ill-informed conspiracy theories still gain traction on X, often amplified by Musk himself, and given the broader trend toward accuracy, that reliance on X content looks like more of a hindrance than a benefit. Of course, Elon and his many supporters will see it the other way, as their views being “silenced” by whichever mysterious puppet masters they’re railing against this week. But the fact is that most of these theories are wrong, and incorporating them into xAI’s Grok model will only make its responses less accurate.

But in the broader picture, this is where we’re heading. Most of the structural elements of current AI models are already established; data inputs are the biggest challenge going forward. As Pichai notes, some level of hallucination is inherent, and will always be there as these systems try to make sense of the data they’re given. But over time the demand for accuracy will only increase, and as more websites block OpenAI and other AI companies from scraping their content for LLM training, those companies will need to strike data deals with more providers anyway.
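For reference, that blocking typically happens at the site level via robots.txt. OpenAI publishes GPTBot as the user-agent token for its crawler, so a publisher can opt its entire site out of scraping with two lines (Google offers a similar Google-Extended token for its AI training crawlers):

```
# robots.txt, served at the site root
User-agent: GPTBot
Disallow: /
```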

Curating these providers could be seen as censorship, and can lead to other problems, but it also means that the responses from AI chatbot tools will be more accurate and factual.
