Hallucinations and risks.

If you follow any research on generative AI (or can't escape the media-driven onslaught), you will likely have heard the term "hallucination" used to describe something incorrectly generated by a large language model (LLM). Hallucinations from generative-AI LLMs like ChatGPT produce text that is incorrect, misleading, or nonsensical. There are significant societal risks associated with hallucinated content, including erosion of trust, misinformation, and amplification of social inequalities. Misleading content from hallucinations can also have negative consequences for organisations, including loss of trust, lost revenue, and legal or reputational risks. Compounding the problem, LLMs respond to questions in a highly convincing manner, regardless of accuracy. This persuasiveness can mislead users into accepting information without critical evaluation.

On the surface, one might think there is only one type of hallucination that organisations need to manage when considering the implementation of generative AI. But, as argued in the academic study by Banerjee, Agarwal & Singla (2024), "LLMs will always hallucinate, and we need to live with this". The authors identify four main types of hallucination that can occur in large language models: factual incorrectness (when AI gets it wrong); misinterpretation; 'needle in a haystack'; and fabrication. I want to look at each in turn and then comment on whether they affect organisational risk differently.

  1. Factual incorrectness. This isn't about fabricating entirely new details, but rather getting existing facts wrong. For example, an LLM summarising a company's quarterly earnings. Instead of correctly reporting a revenue of £2.5 million, it might incorrectly state it as £2.0 million. This type of error arises from how LLMs process and "remember" information during training. Technically, LLMs learn through statistical associations between words and phrases. They don't store facts in a structured database like a traditional computer system (and they certainly don’t ‘remember’). Instead, they encode information implicitly within the weights of their neural network. During the generation process, the LLM predicts the most probable next word in a sequence based on these learned associations. However, this probabilistic approach can lead to errors. If the training data contained slightly conflicting information, or if the context of the query is ambiguous, the LLM might retrieve and present the wrong information. A good example of this is the recent case where Google had to edit their Super Bowl advertisement for AI that featured false information. Google claims this was not a hallucination, but was caused by false information in the websites that Google scrapes to generate its AI snippets.

  2. Misinterpretation. When large language models (LLMs) misinterpret information, they fail to correctly grasp the input or its surrounding context, leading to inaccurate or irrelevant responses. There are two main ways this misinterpretation can happen.

    a) Prompt misinterpretation. This arises when the LLM misconstrues the user's input. This can be due to ambiguous wording in the prompt itself, or it can be a result of the LLM's own limitations in understanding nuanced language. A classic example (as quoted on p.8 in the referenced paper) is the question "What is the meaning of lead?" Depending on the context, "lead" could refer to the chemical element or to leadership. If the prompt lacks sufficient context, the LLM might choose the incorrect interpretation. Another example could be asking "Is this a good time to buy a house?" The LLM might misinterpret this as a question about the current time, rather than a question about the real estate market.
    b) Corpus misinterpretation. This happens when an LLM misjudges the meaning or context of a piece of information within its own training data. For example, the word "bat" can refer to a flying mammal or a piece of sporting equipment. The LLM has seen both meanings in its training data (its "books"), and if the surrounding text was ambiguous it may have encoded the wrong sense, producing an answer based on the wrong meaning even when the prompt itself is clear.

  3. Needle in a haystack. The needle in a haystack problem is about how hard it is for a generative large language model to find one specific fact in all the information it has learned. There are two parts to this: missed key data points, and partial incorrectness (p. 9). An example of a missed key data point: you ask the LLM, "When did Neil Armstrong walk on the moon?" It might say, "Neil Armstrong walked on the moon," but not give the year. It found part of the information (Armstrong, moon landing) but missed the key detail (the year). Partial incorrectness: you ask the same question, and the LLM says, "Neil Armstrong walked on the moon in 1959." It got some of it right (Armstrong, moon landing) but mixed in a wrong detail (the year is 1969).

  4. Fabrications. Fabrications (or hallucinations, as they are more generally known) are when a large language model makes things up entirely. It's not about getting existing facts wrong; it's about creating entirely new, false information that has no basis in the data the model was trained on. It's like writing a history essay and inventing a historical event that never happened. There are multiple, very public examples of recent cases where this has happened, including legal cases where lawyers have used fabricated information in court. The societal consequences of these hallucinations include legal and ethical risks, amplification of bias, erosion of trust, the spread of false information, safety risks, and reputational damage.
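To make the mechanism behind the first type concrete, here is a toy sketch of probabilistic next-token selection. Everything in it is invented for illustration: the prompt, the candidate completions, and the probabilities. Real models sample over vocabularies of sub-word tokens, not whole phrases, but the underlying idea is the same: there is no fact store, only learned probabilities, and conflicting training data means the wrong completion keeps a healthy chance of being picked.

```python
import random

# Toy next-token distribution for the prompt "Quarterly revenue was £...".
# The figures and probabilities are made up for this example.
next_token_probs = {
    "2.5 million": 0.55,  # the correct figure
    "2.0 million": 0.40,  # a conflicting figure seen elsewhere in training
    "25 million": 0.05,   # a rarer garbled variant
}

def sample_next_token(probs, rng=random.random):
    """Pick a completion in proportion to its learned probability."""
    r, cumulative = rng(), 0.0
    for token, p in probs.items():
        cumulative += p
        if r < cumulative:
            return token
    return token  # guard against floating-point rounding

# Roughly 4 times in 10, this toy model confidently states a wrong revenue.
print(sample_next_token(next_token_probs))
```

Note that the model is equally fluent whichever figure it picks, which is exactly why the output reads as convincing either way.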

Each type of hallucination requires a different risk management approach, with increasing levels of scrutiny and human oversight as the potential for harm increases. Fabrications pose the greatest risk, as they can lead to actions based on entirely false premises.

  • Factual Incorrectness: Risk of inaccuracy.

    • This is a relatively "minor" hallucination, where the LLM gets existing facts wrong. The risk here is primarily inaccuracy. If the incorrect fact is used in a report, analysis, or decision, it could lead to flawed conclusions. The risk is similar to human error – someone misremembering or misstating a fact. Risk management focuses on verifying information and having checks in place.

  • Misinterpretation: Risk of misdirection.

    • This is more serious, as the LLM misunderstands the context or intent, leading to irrelevant or misleading outputs. For example, a marketing team asks an LLM for slogan ideas for an eco-friendly cleaner; the LLM focuses on the chemical properties and scientific details, while the team needs a marketing slogan emphasising the environmental benefits. Risk management needs to focus on clarifying inputs, providing more context, and ensuring human oversight to catch misinterpretations.

  • Needle in a Haystack: Risk of incompleteness and contamination.

    • This is about the LLM failing to retrieve all the relevant information or mixing correct and incorrect details. For example, a financial analyst uses an LLM to research a company. The LLM either provides incomplete information (missing debt levels) or contaminates the information with an incorrect detail (a false acquisition). In both cases, the analyst risks making a bad investment decision due to the LLM's failure to retrieve all relevant and accurate information.

  • Fabrications: Risk of deception.

    • This is the most dangerous type, where the LLM invents entirely new, false information. Risk management here requires rigorous fact-checking, guardrails, and an understanding of the limitations of LLMs.
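Some of the verification these risk categories call for can be partly automated. The sketch below is illustrative only, not a production validator: the `missing_facts` helper and the hard-coded reference facts are invented for the example, and in practice the reference would come from a trusted source of truth rather than live in the code. It catches both flavours of the 'needle in a haystack' failure: a detail that is absent, and one that is present but wrong.

```python
import re

# Hypothetical reference facts for one question; in practice these would
# come from a trusted source, not be hard-coded.
REQUIRED_FACTS = {"person": r"Armstrong", "year": r"\b1969\b"}

def missing_facts(answer, required=REQUIRED_FACTS):
    """Return the names of required facts the answer fails to state correctly."""
    return [name for name, pattern in required.items()
            if not re.search(pattern, answer)]

print(missing_facts("Neil Armstrong walked on the moon."))          # year absent
print(missing_facts("Neil Armstrong walked on the moon in 1959."))  # year wrong
print(missing_facts("Neil Armstrong walked on the moon in 1969."))  # passes
```

A check like this can flag incompleteness and contamination, but it obviously cannot catch fabrications about topics you have no reference data for, which is why human oversight stays in the loop.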

Organisations should consider a comprehensive approach to managing the risks of LLM hallucinations. This starts with user training on recognising potential inaccuracies and tracking the source of information.

It's important to remember that this is all relatively new territory! Mitigating generative AI risks simply wasn't a concern just a few years ago. As such, it's very much a process of trial and error. Different hallucination types call for specific strategies: cross-referencing for factual errors, clarifying context to prevent misinterpretations, using multiple queries and data sources to address incomplete information (the "needle in a haystack" problem), and rigorous fact-checking for fabrications. Human oversight is critical, especially for customer-facing generative AI.
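One cheap way to apply the "multiple queries" strategy is a self-consistency check: ask the same question several times and only trust answers that agree. The sketch below is a toy under stated assumptions: `ask_model` is a placeholder for whatever real LLM call you use, and the stand-in model's replies are fabricated to show a disagreement being measured.

```python
from collections import Counter
import itertools

def consistency_check(ask_model, prompt, n=5, threshold=0.8):
    """Ask the same question n times; return the majority answer, how
    often it appeared, and whether that agreement clears the threshold.
    Low agreement is a hint (not proof) that the model is guessing."""
    answers = [ask_model(prompt) for _ in range(n)]
    answer, count = Counter(answers).most_common(1)[0]
    agreement = count / n
    return answer, agreement, agreement >= threshold

# Stand-in for a real LLM call: an imaginary model that occasionally
# hallucinates the wrong year.
replies = itertools.cycle(["1969", "1969", "1959", "1969", "1969"])
fake_model = lambda prompt: next(replies)

answer, agreement, trusted = consistency_check(
    fake_model, "When did Neil Armstrong first walk on the moon?")
print(answer, agreement, trusted)
```

In real use you would also vary the phrasing of the prompt between queries, since asking the identical question can reproduce the identical mistake.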

This is where a well-structured internal AI proof-of-concept program can be invaluable in de-risking these challenges, allowing organisations to experiment and refine their approach in a controlled environment. With proper implementation and training, LLMs may offer substantial productivity gains, but like any new technology, especially one evolving so rapidly, responsible deployment requires carefully balancing the potential rewards with the inherent and still-emerging risks.

(This is my personal blog, so the info here might not be perfect and definitely isn't advice)
