Scientists might have found a way to overcome ‘hallucinations’ that plague AI systems like ChatGPT

Andrew Griffin
Wednesday 19 June 2024 11:11 EDT

Scientists may have created a way to help overcome one of the biggest problems with popular artificial intelligence systems.

A new method might allow such systems to detect when they are “hallucinating”, or making up facts. That is currently a major danger when relying on large language models, or LLMs.

LLMs, such as those that underpin ChatGPT and similar tools, are built to produce language rather than facts. That means they can often produce “hallucinations”, where they make claims that are confidently stated and appear legitimate but actually have no relationship with the truth.

Fixing that problem has proven difficult, in part because the new systems produce such plausible-looking text. But it is also central to any hope of using the technology in a broad range of applications, since people need to be able to trust that any text produced by the systems is truthful and reliable.

The new method allows scientists to detect what they call “confabulations”, in which LLMs produce inaccurate and arbitrary text, often because they do not have the knowledge to answer a question.

The check works by using a second LLM to assess the work of the original one, and a third to evaluate that assessment. A researcher not involved in the work described the approach as “fighting fire with fire”, suggesting that LLMs could be a key part of controlling themselves.

The work focuses not on the words themselves but on their meanings. The researchers fed the outputs of the system being checked into another LLM, which worked out whether one statement implied the other, essentially looking for paraphrases.

Those paraphrases could then be used to estimate how likely the original system’s output was to be reliable. The research showed that a third LLM evaluating that work came out with roughly the same results as a human evaluator.
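The paper’s “semantic entropy” measure works, roughly, by sampling several answers to the same question, grouping answers that mean the same thing using an entailment check, and measuring how spread out the answers are across those meaning-groups: a wide spread suggests the model is confabulating. The sketch below is a minimal illustration of that general idea in Python, not the authors’ code; the `entails` callable stands in for the second, entailment-checking LLM, and `toy_entails` is a purely hypothetical stand-in used only for demonstration.

```python
import math


def semantic_entropy(answers, entails):
    """Estimate semantic entropy over several answers sampled for one question.

    answers -- list of answer strings sampled from the LLM being checked
    entails -- callable (a, b) -> bool; assumed to be backed by a second LLM
               that judges whether statement a implies statement b
    """
    # Group answers that mutually entail each other, i.e. paraphrases
    # that share the same meaning.
    clusters = []
    for ans in answers:
        for cluster in clusters:
            rep = cluster[0]
            if entails(ans, rep) and entails(rep, ans):
                cluster.append(ans)
                break
        else:
            clusters.append([ans])

    # Entropy over the distribution of meaning-clusters: a high value means
    # the model keeps giving answers with different meanings, a sign of
    # confabulation rather than a reliable answer.
    n = len(answers)
    probs = [len(c) / n for c in clusters]
    return -sum(p * math.log(p) for p in probs)


if __name__ == "__main__":
    # Toy stand-in for the entailment LLM: treats one answer as implying
    # another when its words are a subset of the other's words.
    def toy_entails(a, b):
        return set(a.lower().split()) <= set(b.lower().split())

    samples = ["Paris", "Paris", "It is Paris", "Lyon", "Paris"]
    print(round(semantic_entropy(samples, toy_entails), 3))
```

In practice the entailment check and the answer sampling would both be done by LLMs, which is what makes the approach “fighting fire with fire”.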

The system could be valuable in making LLMs more reliable, allowing them to be used across a broader set of tasks and in more important settings. But it could also bring other dangers, scientists warned.

As we look further into using LLMs for this purpose, “researchers will need to grapple with the issue of whether this approach is truly controlling the output of LLMs, or inadvertently fuelling the fire by layering multiple systems that are prone to hallucinations and unpredictable errors,” wrote Karin Verspoor, from the University of Melbourne, in an accompanying article.

The work is described in a new paper, ‘Detecting hallucinations in large language models using semantic entropy’, published in Nature.
