AI Hallucinations Explained — Why AI Makes Things Up

Protecting Yourself from AI Hallucinations

Always verify specific facts, statistics, and citations from primary sources
Be most sceptical about: proper names, dates, URLs, research citations
Ask the AI to explain its reasoning — errors often surface during explanation
Cross-check important claims with Google or a specialist database
The more obscure the topic, the higher the hallucination risk

A Vivid Example

In 2023, a US lawyer submitted a court brief citing six previous legal cases, all supplied by ChatGPT. None of the cases existed: ChatGPT had invented plausible-sounding case names, citations, and even case summaries. The court sanctioned the lawyers responsible (the case was Mata v. Avianca, heard in federal court in New York). This is the canonical hallucination horror story, and its warning remains entirely relevant in 2026.

Why It Happens — The Technical Reason

LLMs are trained to produce the most probable continuation of text given what came before. They're not trained to be accurate in the way a search engine or database is — they're trained to produce fluent, coherent, plausible text.

When asked "What studies support X?", the model doesn't search a database. It predicts what a list of supporting studies would look like and generates plausible-sounding entries. If the real studies are missing from its training data, or only weakly represented there, it can produce citations that are structurally correct but factually false.
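
The effect can be imitated with a toy model. The sketch below is not how a real LLM works internally; it is a simple bigram counter that always picks the most frequent next word, and its four "training" sentences are made up for illustration. Even so, it shows how a system that only follows likely continuations can assemble a fluent citation that appears nowhere in its training data.

```python
from collections import Counter, defaultdict

# Hypothetical, made-up "training data": none of these are real studies.
CORPUS = [
    "Smith et al. (2019) found that caffeine improves recall .",
    "Jones et al. (2021) found that exercise improves mood .",
    "Lee et al. (2021) found that exercise reduces stress .",
    "Park et al. (2021) found that exercise reduces stress .",
]


def train_bigrams(sentences):
    """Count how often each token follows each other token."""
    successors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for current, following in zip(tokens, tokens[1:]):
            successors[current][following] += 1
    return successors


def generate(successors, start, max_tokens=20):
    """Greedily append the most frequent next token until none is known."""
    tokens = [start]
    while len(tokens) < max_tokens and tokens[-1] in successors:
        next_token, _count = successors[tokens[-1]].most_common(1)[0]
        tokens.append(next_token)
    return " ".join(tokens)


model = train_bigrams(CORPUS)
output = generate(model, "Smith")
print(output)
# -> Smith et al. (2021) found that exercise reduces stress .
print("Appears in training data:", output in CORPUS)
# -> Appears in training data: False
```

Every step in the output is the most common continuation, yet the finished sentence attributes a finding and a year to "Smith" that no training sentence contains. Real models are vastly more sophisticated, but the failure has the same shape.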

High-Risk Hallucination Areas

Content Type	Hallucination Risk	Why
Academic citations	Very High	Specific format pattern — easy to fake
Statistics and percentages	High	Plausible numbers, hard to verify
URLs and web links	Very High	URL syntax is easy to imitate, so invented links look valid
Historical dates	Medium	Well-represented in training data, but imprecision common
General concepts and explanations	Lower	Broad patterns, less specificity required
Creative writing	Low	No single "correct" answer to get wrong

How AI Companies Are Reducing Hallucinations

Retrieval-Augmented Generation (RAG): Rather than relying purely on training data, the AI retrieves relevant documents from a database and uses them as grounded context. This significantly reduces hallucination on factual tasks, although the model can still misread or misquote what it retrieves.
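
As a rough illustration of the retrieval step, here is a minimal sketch. It assumes a tiny in-memory list of made-up documents and ranks them by simple word overlap; production systems typically use a search index or vector embeddings, and the function names here are hypothetical. The sketch only builds the grounded prompt and does not call any model.

```python
import re

# Hypothetical mini knowledge base; a real system would use a document store
# or vector index instead of an in-memory list.
DOCUMENTS = [
    "The refund policy allows returns within 30 days of purchase.",
    "Support is available by email on weekdays from 9am to 5pm.",
    "Shipping to Europe usually takes five to seven business days.",
]


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, top_k=1):
    """Rank documents by how many words they share with the question."""
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_grounded_prompt(question, documents, top_k=1):
    """Put the retrieved passages in front of the question as context."""
    passages = retrieve(question, documents, top_k)
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer using only the sources below. "
        "If they do not contain the answer, say so.\n\n"
        f"Sources:\n{context}\n\n"
        f"Question: {question}"
    )


question = "What is the refund policy for returns?"
prompt = build_grounded_prompt(question, DOCUMENTS)
print(prompt)
```

The model then answers from the numbered source rather than from memory, and the numbering gives the reader something concrete to check.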

Chain-of-thought reasoning: Asking models to work through problems step-by-step catches some errors that appear in direct responses.

Grounding to sources: Systems like Perplexity and Google AI Overviews cite their sources, which makes verification easier and leaves less room for unsupported claims.

The Practical Verification Rule

A simple rule for safe AI use: treat AI output as a first draft from an intelligent but sometimes unreliable colleague. Use it for structure, frameworks, and general knowledge. Verify anything specific — names, dates, numbers, citations — before relying on it or sharing it.
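
One way to put this rule into practice is to pull the checkable specifics out of a response before reading it closely. The sketch below uses a few illustrative regular expressions (an assumption, not a standard tool) to list URLs, percentages, years, and "et al." style citations. It cannot tell whether any of them are true; it only produces a checklist of items to verify by hand.

```python
import re

# Illustrative patterns only; they will miss some claims and flag some
# harmless text, so treat the result as a checklist, not a verdict.
PATTERNS = {
    "url": r"https?://[^\s)]+",
    "percentage": r"\b\d+(?:\.\d+)?%",
    "year": r"\b(?:19|20)\d{2}\b",
    "citation": r"\b[A-Z][a-z]+ et al\.",
}


def extract_checkable_claims(text):
    """Return the specifics found in the text, grouped by pattern label."""
    found = {}
    for label, pattern in PATTERNS.items():
        matches = re.findall(pattern, text)
        if label == "url":
            # Drop sentence punctuation that the URL pattern may swallow.
            matches = [m.rstrip(".,;") for m in matches]
        if matches:
            found[label] = matches
    return found


# Hypothetical AI output, invented for this example.
sample = (
    "Smith et al. found in 2021 that 42% of users agreed. "
    "See https://example.com/report for details."
)
claims = extract_checkable_claims(sample)
for label, items in claims.items():
    print(f"{label}: {items}")
```

Each item printed is something to confirm against a primary source before relying on the text or sharing it.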

Ask the AI to flag uncertainty: Add "If you're uncertain about any specific facts, statistics, or citations, say so explicitly" to your prompt. This doesn't eliminate hallucinations but makes the AI more likely to hedge on uncertain content rather than fabricate confidently.
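
If you send prompts from a script, the instruction can be appended automatically. This is a minimal sketch; the helper name is hypothetical and the wording is the sentence quoted above.

```python
UNCERTAINTY_INSTRUCTION = (
    "If you're uncertain about any specific facts, statistics, or citations, "
    "say so explicitly."
)


def with_uncertainty_flag(prompt):
    """Append the uncertainty instruction unless it is already present."""
    if UNCERTAINTY_INSTRUCTION in prompt:
        return prompt
    return f"{prompt.rstrip()}\n\n{UNCERTAINTY_INSTRUCTION}"


flagged = with_uncertainty_flag("Which studies link sleep and memory?")
print(flagged)
```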

Frequently Asked Questions

What is an AI hallucination?

An AI hallucination is when an AI system generates information that sounds credible and is stated confidently, but is factually incorrect. The term is borrowed from psychology — the AI 'perceives' something that isn't there, producing plausible-sounding but false content.

Why do AI systems hallucinate?

LLMs generate text by predicting the most probable next token based on patterns in their training data. They don't have a separate fact-checking mechanism. When generating a response, they can produce text that sounds like it should be true based on related patterns, even if it isn't.

How common are AI hallucinations?

Frequency depends heavily on the task type. For well-represented topics with clear training data, hallucination rates are low. For specific facts (citations, statistics, names, dates), rates can be substantial: figures of 15-30% on factual recall tasks are often quoted, but results vary widely by model, benchmark, and date, so treat any single number with caution.

Can you tell when AI is hallucinating?

Often not, and that is what makes hallucinations dangerous. Hallucinated content is typically stated with the same confidence as accurate content, because the AI doesn't know it's wrong. This is why specific claims always need to be verified.

Which AI hallucinates least?

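Published benchmarks generally show larger models such as Claude and GPT-4 hallucinating less often than smaller models, though rankings differ between benchmarks and shift with each model release. Retrieval-augmented generation (RAG) systems, which retrieve relevant documents before responding, tend to have significantly lower hallucination rates on factual tasks than models that rely on training data alone.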