Apple finds new way to spot AI hallucinations

Mar 5, 2026, 6:56am UTC

Apple may not have homegrown AI. But it wants to make sure the technology is done right.

On Tuesday, Apple published research detailing a new way to find and quash incidents of hallucination, the pesky mistakes that an AI model makes when it doesn’t have enough training data and starts making guesses. Apple’s research introduces “Reinforcement Learning for Hallucination Span Detection,” which pinpoints not just when an AI model hallucinates, but where exactly within a line of text the model goes wrong.

Apple’s model gives its AI framework small rewards each time it accurately identifies incorrect phrases or words, based on how closely its responses match those of human evaluators.

  • This turns hallucination detection from a “binary task” into a “multi-step decision-making process,” Apple said in its research.
  • To put it simply, it’s the difference between a teacher saying you failed a test with no explanation and a teacher telling you exactly which answers you got wrong and why.
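To make the reward idea concrete, here is a toy sketch of a span-level reward signal. Apple's paper does not publish its exact reward function, so the token-overlap F1 below, the function name, and the `(start, end)` span format are all illustrative assumptions: the detector earns partial credit in proportion to how closely its flagged spans match the human-annotated ones, rather than a single pass/fail score.

```python
def span_reward(predicted, gold):
    """Toy span-level reward: F1 overlap between the spans a model
    flags as hallucinated and the human-annotated spans.

    Spans are (start, end) character offsets, end-exclusive. This is
    an illustrative sketch, not Apple's actual reward function.
    """
    # Expand each span list into the set of character positions it covers.
    pred_chars = {i for s, e in predicted for i in range(s, e)}
    gold_chars = {i for s, e in gold for i in range(s, e)}

    # If both the model and the annotators flag nothing, they agree:
    # the output is hallucination-free, full reward.
    if not pred_chars and not gold_chars:
        return 1.0

    overlap = len(pred_chars & gold_chars)
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_chars)
    recall = overlap / len(gold_chars)
    return 2 * precision * recall / (precision + recall)
```

Under this kind of shaping, a detector that flags the right sentence but slightly misjudges the span boundaries still receives partial reward, which is what turns the problem into a multi-step decision process instead of a binary one.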

“Most existing research works focus on a binary hallucination detection problem, where the goal is to determine if the model output contains hallucinations or not,” Apple said in the paper. “While useful, this formulation is limited: in many real-world applications, one often needs to know which specific spans in the model output are hallucinated in order to assess the reliability of the generated content.”

And Apple’s system proved itself, outperforming conventional methods on the RAGTruth benchmark, a test of AI truthfulness across tasks like summarization, question answering, and data-to-text generation.

Our Deeper View

While Apple may be seen as miles behind in the AI race, this is a misconception. Apple has effectively removed itself from the competition entirely, instead hitching its wagon to Google through a multi-year agreement to use Gemini to power Siri. However, Apple still bears the burden of doing AI right. With almost 2.5 billion devices in the hands of users worldwide, it’s vital that an AI-powered Siri makes as few mistakes as possible, especially if many of those users aren’t AI-savvy. This research is a sign that Apple understands the consequences of getting it wrong.