Self-Evolving AI Agents Gain Skill‑Writing Power Without Model Retraining

By Julian Reed • Published: April 9, 2026 • 2 MIN READ

Self‑evolving AI agents can now modify their own skill set without touching the underlying language model, a breakthrough announced by a multi‑university team.

Self‑evolving AI agents rewrite skills via Memento‑Skills

The new Memento‑Skills framework treats skills as mutable markdown artifacts, each bundling a declarative spec, prompt guidance, and executable code. When a task fails, an orchestrator analyses the trace, rewrites the offending artifact, and validates the change with an auto‑generated unit test before committing it to the global library.

Unlike traditional retrieval‑augmented generation that leans on semantic similarity, Memento‑Skills employs a behavior‑focused router that selects the most effective tool, avoiding mismatches such as pulling a “password reset” script for a “refund processing” request.
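The difference between the two routing styles can be shown with a toy comparison. This is a sketch under stated assumptions, not the framework's router: word overlap (Jaccard similarity) stands in for semantic similarity, a smoothed success rate stands in for a learned behavioral score, and the skill descriptions, the `OutcomeRouter` class and the `"refund"` intent label are all invented for the example.

```python
from collections import defaultdict


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity between two strings."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


# Hypothetical skill descriptions.
skills = {
    "password_reset": "reset the customer account password after a request",
    "refund_processing": "issue money back to the original payment method",
}


def route_by_similarity(query: str) -> str:
    """Pick the skill whose description looks most like the query."""
    return max(skills, key=lambda name: jaccard(query, skills[name]))


class OutcomeRouter:
    """Pick the skill that has actually worked for this kind of task."""

    def __init__(self) -> None:
        # (intent, skill) -> [successes, attempts]
        self.stats = defaultdict(lambda: [0, 0])

    def record(self, intent: str, skill: str, success: bool) -> None:
        entry = self.stats[(intent, skill)]
        entry[0] += int(success)
        entry[1] += 1

    def route(self, intent: str) -> str:
        def success_rate(skill: str) -> float:
            wins, tries = self.stats[(intent, skill)]
            return (wins + 1) / (tries + 2)  # Laplace smoothing for unseen pairs
        return max(skills, key=success_rate)


query = "customer request to reset a wrong charge on the account"
by_text = route_by_similarity(query)

router = OutcomeRouter()
router.record("refund", "password_reset", success=False)
router.record("refund", "refund_processing", success=True)
by_outcome = router.route("refund")
```

The query shares more words with the password-reset description, so the similarity router picks the wrong tool. The outcome-based router picks the refund skill because that is the one that previously succeeded on refund tasks.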

“The true value of a skill lies in its contribution to the overall workflow,” said Jun Wang, co‑author, speaking to Reuters.

Continuous improvement follows a “Read‑Write Reflective Learning” loop: the agent fetches a skill, executes it, receives feedback, and then rewrites or creates new skills as needed. The router itself is refined through a one‑step offline reinforcement learning pass that rewards long‑term utility rather than mere textual overlap.
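One pass of that loop might look like the following sketch. It is a simplification under assumptions: skills are plain Python callables, feedback is a comparison against a known expected value, and the hypothetical `propose` function stands in for the model that writes replacement skills. The unit-test gate and the router training step are left out to keep the example short.

```python
from typing import Callable, Dict, Optional

SkillFn = Callable[[int], int]


def reflective_step(library: Dict[str, SkillFn], name: str, task_input: int,
                    expected: int,
                    propose: Callable[[str, Optional[SkillFn]], SkillFn]) -> str:
    """Read a skill, execute it, check the feedback, then write if needed."""
    skill = library.get(name)                  # read
    if skill is None:
        library[name] = propose(name, None)    # write: create a missing skill
        return "created"
    try:
        succeeded = skill(task_input) == expected  # execute and get feedback
    except Exception:
        succeeded = False
    if succeeded:
        return "kept"
    library[name] = propose(name, skill)       # write: replace the failing skill
    return "rewritten"


def propose(name: str, old: Optional[SkillFn]) -> SkillFn:
    """Hypothetical stand-in for the model that writes or repairs skills."""
    fixes: Dict[str, SkillFn] = {
        "double": lambda x: x * 2,
        "square": lambda x: x * x,
    }
    return fixes[name]


library: Dict[str, SkillFn] = {"double": lambda x: x + 2}  # seeded with a bug

outcomes = [
    reflective_step(library, "double", 3, 6, propose),   # fails, gets rewritten
    reflective_step(library, "double", 3, 6, propose),   # now succeeds
    reflective_step(library, "square", 4, 16, propose),  # missing, gets created
]
```

After the three steps the library holds two working skills, which mirrors in miniature how a handful of seed routines can grow into a larger collection.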

Benchmarks reveal the impact. On the GAIA suite, accuracy rose from 52.3% to 66.0%, a gain of 13.7 percentage points. On Humanity’s Last Exam, performance jumped from 17.9% to 38.7%, a gain of 20.8 percentage points. In both cases, the skill library grew organically, from five seed routines to 41 and 235 distinct tools respectively.
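The size of each gain can be checked directly from the before and after scores. The short calculation below uses only the four figures reported above; the `gains` helper is a hypothetical name. It separates the absolute change in percentage points from the relative change in percent, two measures that are easy to confuse.

```python
from typing import Tuple


def gains(before: float, after: float) -> Tuple[float, float]:
    """Return (percentage-point gain, relative gain in percent), to one decimal."""
    points = round(after - before, 1)
    relative = round((after - before) / before * 100, 1)
    return points, relative


gaia = gains(52.3, 66.0)  # (13.7, 26.2)
hle = gains(17.9, 38.7)   # (20.8, 116.2)
```

By this arithmetic the score on Humanity’s Last Exam slightly more than doubled in relative terms, while the GAIA score improved by roughly a quarter.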

Enterprise teams see immediate value: the approach eliminates costly fine‑tuning and manual skill engineering, while the built‑in test gate curtails regression risk. However, Wang cautions that the method works best in structured workflows where tasks share patterns; isolated or long‑horizon problems may still require multi‑agent orchestration.

Source code is publicly available on GitHub, and early adopters are encouraged to pilot within well‑defined process pipelines. For broader deployment, a robust governance layer—beyond unit tests—will be essential, a point highlighted by Bloomberg.
