Frontier AI Models Rewrite Documents, Masking Errors That Slip Past Review

By Julian Reed. Published: May 14, 2026

Why frontier AI models rewrite more than they delete

As large language models grow more capable, firms hand document-heavy tasks to them and assume the output stays faithful to the source. Microsoft researchers found otherwise: in a 20-step simulation across 52 professions, frontier AI models altered roughly 25% of the original text and drove overall decay to 50% by the final round. The DELEGATE-52 benchmark mimics real-world pipelines, pairing each edit with an exact inverse so that drift can be caught without human-written references.
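To make the round-trip idea concrete, here is a minimal sketch of how an edit paired with its inverse can expose drift without a human reference. It is not the benchmark's own code: the function names, the word-level similarity measure, and the toy "edits" standing in for a model are all assumptions made for illustration.

```python
import difflib


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two texts, compared word by word."""
    return difflib.SequenceMatcher(None, a.split(), b.split()).ratio()


def round_trip_drift(original, forward, inverse, rounds):
    """Apply forward then inverse `rounds` times.

    Returns the similarity to the original after each round. A perfectly
    faithful editor would score 1.0 every time, so any drop is drift.
    """
    text = original
    scores = []
    for _ in range(rounds):
        text = inverse(forward(text))
        scores.append(similarity(original, text))
    return scores


# Hypothetical stand-ins for a model carrying out an instruction and its inverse.
def faithful_forward(text):
    return text.upper()


def faithful_inverse(text):
    return text.lower()


def lossy_inverse(text):
    # Silently drops the last word on every pass, simulating quiet deletion.
    return " ".join(text.lower().split()[:-1])


original = "the quarterly report lists revenue costs and net margin"
faithful_scores = round_trip_drift(original, faithful_forward, faithful_inverse, 3)
lossy_scores = round_trip_drift(original, faithful_forward, lossy_inverse, 3)
print(faithful_scores)
print(lossy_scores)
```

In this toy setup the faithful pair stays at full similarity while the lossy pair falls further from the original with every round, which is the kind of compounding decay the study describes.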

“Models aren’t aware they’re in a test; they simply try each instruction,” says Philippe Laban of Microsoft Research.

The study examined 19 systems from OpenAI, Anthropic, Google, Mistral, xAI and Moonshot. Only Python-centric tasks reached readiness scores above 98%; everything else suffered silent hallucinations or subtle rewrites that evade casual review. Adding generic agentic tools (file read/write and code execution) increased corruption by roughly 6%. The presence of 8-12 KB distractor files amplified errors further, a caution for enterprises leaning on retrieval-augmented generation. Reuters recently flagged similar risks in AI-driven finance workflows. Laban advises incremental human checks after each AI step and the development of narrowly scoped utilities to keep agents on target.
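The advice to check after each AI step can be pictured as a simple gate in the pipeline. The sketch below is one possible reading, not a method from the study: the function names and the 20% change threshold are hypothetical, and the "steps" are plain functions standing in for model calls.

```python
import difflib


def changed_fraction(before: str, after: str) -> float:
    """Return the share of the text that differs, compared word by word (0..1)."""
    matcher = difflib.SequenceMatcher(None, before.split(), after.split())
    return 1.0 - matcher.ratio()


def run_with_checkpoints(document, steps, max_change=0.2):
    """Run steps in order, pausing when one changes more than `max_change`.

    Returns (last accepted text, index of the flagged step), or
    (final text, None) if every step stayed under the threshold.
    The 0.2 default is an illustrative assumption, not a recommended value.
    """
    current = document
    for index, step in enumerate(steps):
        candidate = step(current)
        if changed_fraction(current, candidate) > max_change:
            # Hand off to a human reviewer instead of accepting the edit.
            return current, index
        current = candidate
    return current, None


# Hypothetical steps: a small targeted edit, then a wholesale rewrite.
steps = [
    lambda text: text.replace("draft", "final"),
    lambda text: "unrelated text",
]
document = "this draft covers budget timeline staffing and risk"
accepted, flagged_step = run_with_checkpoints(document, steps)
print(accepted, flagged_step)
```

Here the one-word edit passes and the wholesale rewrite is held back for review. A word-level diff like this would only catch large changes; subtle rewrites of the kind the study reports would still need a human reader.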


