We're Not Hallucinating: Anthropic Got It Right!
In an Industry Obsessed with Speed, Anthropic Paused to Think
May 30th, 2025. Hey, it’s Khaled! Thank you for signing up to the Exec X AI Magazine, a weekday business signal from the frontlines of AI and executive decision-making!
If you find value here, do us a solid: share our emails with your contacts, hit the heart-shaped like button or drop a comment. It boosts visibility, fuels better conversations and helps us keep this publication free and open for everyone ♥️
Today’s post features two articles, which we’ve condensed into digestible form for time-pressed readers:
When Gen AI Hallucinates: A Cautionary Tale and What Every Executive Needs to Know
Anthropic’s Masterstroke: Launch a New AI Model. Then Raise the Alarm
When Gen AI Hallucinates: A Cautionary Tale, and What Every Executive Needs to Know
Brian hadn’t slept. The corporate M&A lawyer had spent the night buried in a virtual deal room, trawling through hundreds of documents as Acme Corp prepared to acquire a logistics software startup.
The due diligence memo was complete. The General Counsel had reviewed and signed it off at 3:12 am sharp.
At the end of their call, the GC asked Brian for one last task: a cover email that she could use to baseline her due diligence pack for the board. She wanted it before boarding her 7:45 am flight to Frankfurt.
Brian decided to defer the email. He figured he’d write it fresh with a clear(er) head in the morning. He overslept.
Rushing to his laptop at 6:10 am, coffee in one hand and a blinking cursor in the other, he realised he was out of gas.
Desperate, he opened his favourite Gen AI tool. He pasted a redacted version of the memo — names and sensitive details stripped — and prompted the model to draft a polished summary email.
Within seconds, the AI produced a message that was sharp, composed, and confident. It even added a plane emoji (✈️) to the proposed subject line. “Impressive,” he thought.
Copy. Paste. Attach. Send. The GC would have it in her inbox before stepping onto the plane. “Bonus time,” Brian thought!
But then, a chill.
Scanning the email in his Sent folder, Brian spotted something strange: a case citation in the body of the message. He didn’t remember including it.
He opened the memo. No sign of the case. Checked Acme’s internal legal research tool. Nothing. Then LexisNexis. Still nothing. He Googled it.
No such case existed.
In a quiet panic, Brian flipped back to the AI tool he’d used and asked it to source the reference. The model responded with plausible-sounding nonsense, citing a jurisdiction that didn’t exist: a hallucination.
He clicked “Recall Message.” Too late. A read receipt had already landed. The GC had opened the email.
The Problem Isn’t Style. It’s Substance.
This isn’t a hypothetical—it’s a story replaying in legal departments, banks, and boardrooms around the world. Gen AI is the ultimate productivity drug, but too often we confuse its fluency with truth.
A hallucination is a phenomenon in which AI systems fabricate plausible-sounding information that’s entirely false. A fake legal case. An invented quote. A policy that doesn’t exist.
AI systems aren’t databases. AI systems don’t know facts—they predict likely word patterns based on massive language training.
According to The New York Times, AI hallucinations may occur even more frequently in newer reasoning models (Ed. Note - gift link. No paywall):
“…reasoning models are designed to spend time “thinking” through complex problems before settling on an answer. As they try to tackle a problem step by step, they run the risk of hallucinating at each step. The errors can compound as they spend more time thinking”.
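Here’s a quick back-of-the-envelope illustration of that compounding effect. The 95% per-step figure below is purely hypothetical, chosen only to show the shape of the problem, not a measured hallucination rate:

```python
# If each "thinking" step is only 95% reliable, long chains get shaky fast.
# The 0.95 figure is hypothetical; real per-step reliability varies by model and task.
per_step_reliability = 0.95

for steps in (1, 5, 10, 20):
    chain_reliability = per_step_reliability ** steps
    print(f"{steps:>2} reasoning steps -> {chain_reliability:.0%} chance every step is sound")
```

Even with that generous per-step assumption, a twenty-step chain of reasoning is sound barely more than a third of the time.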
That’s why it’s so vital to check a Gen AI’s outputs and its reasoning steps.
AI needs to be trained to say “I might be wrong”
As an AI evangelist, I’m going to say it plainly: to earn users’ trust, Gen AI models must supply a confidence rating alongside every output. Google DeepMind’s AlphaFold attaches a confidence score to every protein structure it predicts; that should be the gold standard for all Gen AI models.
Here’s an example. I asked ChatGPT: What is the national food dish of the Central African Republic?
ChatGPT replied instantly and confidently, naming a single dish with no caveats attached.
But the CAR is home to over 80 ethnic groups, each with distinct culinary traditions. To declare one dish as definitive is to reduce cultural complexity to a soundbite.
That’s not just intellectually lazy—it’s operationally dangerous. When large language models (LLMs) speak with unwavering confidence, users assume authority. And here’s the rub: AI doesn’t need to be infallible, but it must be transparent about when it’s presenting guesstimates.
(Ed. Note - credit to Kate Kallot, CEO of Amini and one of the TIME100 Most Influential People in AI, for bringing this example to my attention).
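What might a confidence rating look like in practice? Here’s a minimal sketch, assuming the model’s API can return per-token log-probabilities alongside the text (several hosted APIs can). The thresholds and sample numbers are illustrative only, and an average token probability is a crude proxy for confidence rather than a factuality guarantee, but even a rough score changes how a reader treats fluent output.

```python
import math

def confidence_label(token_logprobs, low=0.70, high=0.90):
    """Collapse per-token log-probabilities into a rough confidence label.

    token_logprobs: natural-log probabilities, one per generated token,
    as returned by APIs that expose them. The thresholds are illustrative,
    not calibrated, and a fluent answer can still score high while being wrong.
    """
    avg_prob = math.exp(sum(token_logprobs) / len(token_logprobs))
    if avg_prob >= high:
        return f"High confidence ({avg_prob:.0%})"
    if avg_prob >= low:
        return f"Medium confidence ({avg_prob:.0%})"
    return f"Low confidence ({avg_prob:.0%}): verify before relying on this answer"

# Hypothetical values for a short answer: polished prose, shaky numbers.
print(confidence_label([-0.05, -0.10, -0.02, -1.90, -2.40]))
# -> Low confidence (41%): verify before relying on this answer
```

The point isn’t that the number is precise; it’s that the model stops presenting guesstimates as certainties.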
Why Generative AI Hallucinates: A Quick Primer
At a foundational level, hallucinations happen because:
AI doesn’t store facts; it predicts text. Ask “What’s the name of the body of water on the west coast of Florida?” and the model completes the sentence based on statistical likelihood, not database truth (see the toy sketch at the end of this primer).
It fills in blanks—even when it shouldn’t. When uncertain, it guesses based on linguistic patterns.
It lacks common sense. Combine “Apple” with “power” in the same prompt, and it might describe how much energy you’ll need to fuel a workout, and not how long it will take to charge your smartphone.
It can’t say “I don’t know.” Most models are trained to produce something, anything. AI studios also have a commercial incentive to keep you engaged with their tools (your usage helps train their models), and an answer of “I don’t know” risks turning you off and sending you to a different platform.
These aren’t bugs. They’re the result of how today’s LLMs fundamentally operate.
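To see what “predicting text rather than storing facts” means, here’s a deliberately tiny sketch: a bigram model that only counts which word tends to follow which in its training text. Real LLMs are incomparably more sophisticated, but the failure mode is the same in kind; the model produces a fluent continuation whether or not it corresponds to anything true.

```python
from collections import Counter, defaultdict

# Toy "training data": the model never stores the fact itself,
# only which words tend to follow which.
corpus = (
    "the body of water on the west coast of florida is the gulf of mexico . "
    "the body of water on the east coast of florida is the atlantic ocean . "
    "the gulf of mexico is warm . the atlantic ocean is vast ."
).split()

# Count bigrams: for each word, how often each possible next word follows it.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def complete(prompt_word, length=5):
    """Extend a prompt by repeatedly picking the statistically likeliest next word."""
    words = [prompt_word]
    for _ in range(length):
        options = following.get(words[-1])
        if not options:
            break
        words.append(options.most_common(1)[0][0])  # likeliest continuation wins
    return " ".join(words)

print(complete("gulf"))
# -> "gulf of water on the body": fluent, confident, and not a real place
```

Nothing in this code “knows” that the Gulf of Mexico exists; it simply chains together likely-looking words, which is exactly why a model can invent a case citation that has never been filed.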
Why It Matters for Executives
If you’re a General Counsel or Chief Risk Officer, hallucinations aren’t just technical hiccups; they’re enterprise risk vectors. A hallucinated clause in a contract or a fake case in a court filing can expose your firm to reputational and regulatory damage.
The stakes aren’t limited to legal. CFOs should worry too: financial summaries and reports generated with unverified Gen AI can embed errors that mislead auditors, investors, and markets.
And CEOs? This is your brand on the line.
The Executive Playbook: How to Stay Smart
Here’s how to keep Gen AI safe, productive, and defensible inside your organisation:
☑️ Never let your mistakes leave the kitchen. To this end, instruct your teams never to ship content influenced by AI without human review—especially in legal, regulatory, and public contexts.
☑️ Build AI muscle memory. Train staff on how LLMs work, not just how to prompt them. Make it fun, and give kudos to employees who share best practice.
☑️ Designate an Office of Responsible AI. Governance matters. Track data lineage, enforce prompt hygiene, and stay audit-ready.
☑️ Encourage AI skepticism. Create a culture where verifying AI’s output is a badge of professionalism, not a burden.
☑️ Ask your GC to read Exec X AI’s Tactical Playbook for Adopting AI. We’ve set out a plan of action for legal departments to build infrastructure and the muscle to guard organisations against the risks of AI going rogue.
Exec X AI’s sister company, Solutions and AI for Lawyers (SAIL), provides training, advisory services, and policy-drafting workshops for executives, boards, and leaders who want to safely accelerate their AI adoption strategy. Book a free consultation here.
Want to share your story?
The world is moving quickly, and your story matters! If you’re a CFO, GC, business leader, or just someone with a front-row seat to AI adoption, I’d love to hear from you. Let’s talk. I’m conducting interviews for future editions of the Exec X AI Magazine. Get in touch: hello@exec-x.ai.
Anthropic’s Masterstroke: Launch a New AI Model. Then Raise the Alarm
Anthropic issued a warning alongside the release of Claude Opus 4, which the company claims is the most intelligent and powerful model it has ever developed.
According to the company, Opus 4 could offer:
“The ability to significantly help individuals or groups with basic technical backgrounds (e.g., undergraduate STEM degrees) create/obtain and deploy CBRN* weapons”
(Anthropic’s Responsible Scaling Policy v 2.2, May 14, 2025)
*CBRN: Chemical, Biological, Radiological and Nuclear weapons - nasty stuff.
To be fair, Anthropic stressed that this is a precautionary measure. The company’s Responsible Scaling Policy positions AI Safety Level 3 (ASL-3) as the most stringent set of security protections it has applied to a new model to date.
“To be clear, we have not yet determined whether Claude Opus 4 has definitively passed the Capabilities Threshold that requires ASL-3 protections…we proactively decided to launch it under the ASL-3 Standard. This approach allowed us to focus on developing, testing, and refining these protections before we needed them”.
(Anthropic’s ‘Activating AI Safety Level 3 Protections’, May 22, 2025)
The Final Word
Anthropic’s Responsible Scaling Policy is a superb example of how to classify AI product safety. Classifying risk, though, is the easy part; operationalising AI safety is even harder.
However, it’s great to see the company’s co-founder and Chief Science Officer, Jared Kaplan, exerting more influence since his appointment as the company’s Responsible Scaling Officer.
In an industry obsessed with speed, Kaplan forced Anthropic to pause and think. He proved responsible AI isn’t a buzzword. It’s now a product strategy.
Bravo Anthropic!




