MRAgent Cuts Token Use to 118K per Query – LangMem Burns 3.26M

By Dr. Aris Thorne • Published: June 27, 2026 • 2 MIN READ

MRAgent, an agentic memory architecture from NUS, trims token consumption to 118K per query, while competing systems such as LangMem consume 3.26M tokens.
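For a sense of scale, the two headline figures can be compared directly. The short Python sketch below uses only the 118K and 3.26M numbers quoted above; the variable names are illustrative and nothing here comes from the MRAgent code base.

```python
# Per-query token figures as quoted in this article's headline.
mragent_tokens = 118_000
langmem_tokens = 3_260_000

# How many times more tokens LangMem spends per query.
ratio = langmem_tokens / mragent_tokens

# Fraction of LangMem's token budget that MRAgent avoids.
saving = 1 - mragent_tokens / langmem_tokens

print(f"LangMem uses about {ratio:.1f}x the tokens")   # about 27.6x
print(f"MRAgent avoids about {saving:.1%} of them")    # about 96.4%
```

On those figures, LangMem spends roughly 27.6 times as many tokens per query as MRAgent.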

MRAgent’s active memory reconstruction

Traditional retrieval pipelines flood large language models with irrelevant data. Because they cannot adjust queries mid-reasoning, the result is costly context overload. MRAgent instead treats memory as an interactive graph of Cues, Tags and Content, letting the LLM iteratively prune unpromising branches and chase promising evidence, much as human associative recall does.

How the Cue‑Tag‑Content graph works

The agent extracts fine‑grained cues from a user prompt, follows linked tags that summarize semantic relations, and only then pulls the full content blocks, saving compute and prompt tokens. In a sample query about “Nate’s prize‑money use after his third tournament win,” MRAgent isolates cues (“Nate,” “tournament,” “win”), discards irrelevant participation tags, and homes in on three episodic memories, ultimately delivering a concise answer.
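The staged lookup described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the `MemoryGraph` and `Tag` classes, the `recall` method, the `keep_tag` callback (a stand-in for the LLM's relevance judgment) and all sample data are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Tag:
    """Short summary of a semantic relation, linked to full content blocks."""
    summary: str
    content_ids: list = field(default_factory=list)


@dataclass
class MemoryGraph:
    """Hypothetical three-layer store: cue -> tags -> content."""
    cue_to_tags: dict = field(default_factory=dict)
    tags: dict = field(default_factory=dict)
    content: dict = field(default_factory=dict)

    def candidate_tags(self, cues):
        """Collect tag ids reachable from any cue, in first-seen order."""
        seen = []
        for cue in cues:
            for tag_id in self.cue_to_tags.get(cue.lower(), []):
                if tag_id not in seen:
                    seen.append(tag_id)
        return seen

    def recall(self, cues, keep_tag):
        """Read cheap tag summaries first; fetch content only for kept tags."""
        content_ids = []
        for tag_id in self.candidate_tags(cues):
            tag = self.tags[tag_id]
            if keep_tag(tag.summary):  # assumption: an LLM call in a real agent
                for cid in tag.content_ids:
                    if cid not in content_ids:
                        content_ids.append(cid)
        return [self.content[cid] for cid in content_ids]


# Made-up data loosely modeled on the article's "Nate" example.
graph = MemoryGraph(
    cue_to_tags={
        "nate": ["t_win", "t_participation"],
        "tournament": ["t_win", "t_participation"],
        "win": ["t_win"],
    },
    tags={
        "t_win": Tag("Nate wins a tournament", ["m1", "m2", "m3"]),
        "t_participation": Tag("Nate takes part in a tournament", ["m4"]),
    },
    content={
        "m1": "Placeholder episode 1",
        "m2": "Placeholder episode 2",
        "m3": "Placeholder episode 3",
        "m4": "Placeholder episode 4",
    },
)

# Keyword filter standing in for the LLM's pruning decision.
memories = graph.recall(["Nate", "tournament", "win"], lambda s: "wins" in s)
print(len(memories))  # 3: the participation branch is never expanded
```

The saving comes from ordering: tag summaries are short, so discarding a branch at the tag layer means its full content blocks never enter the prompt.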

“The on‑demand pruning of irrelevant paths is what drives our cost advantage,” researchers noted.

Benchmarks on LoCoMo and LongMemEval show MRAgent outperforming A-MEM, MemoryOS, LangMem and Mem0 across Gemini 2.5 Flash and Claude Sonnet 4.5, while cutting runtime roughly in half, from 1,122 seconds to 586 seconds.

Enterprises looking to adopt the system must set up an automated ingestion pipeline that distills raw interaction logs into the graph. The authors provide an open-source GitHub repo and recommend a background job that feeds data through LLM-driven prompts. For further industry perspective, see Reuters coverage of emerging LLM memory solutions.
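One way such an ingestion job might look is sketched below in Python. Everything in it is an assumption made for illustration: the prompt wording, the JSON reply shape, the `ingest` function and the dictionary layout are hypothetical and are not taken from the authors' repository. The `llm` argument is any callable that maps a prompt string to a reply string; a stub stands in for a real model here.

```python
import json

# Hypothetical extraction prompt; the real project's prompts may differ.
EXTRACTION_PROMPT = (
    "Extract memory from the interaction below. Reply with JSON of the form "
    '{{"cues": ["..."], "tag": "one-line summary", "content": "full detail"}}.\n\n'
    "Interaction:\n{log}"
)


def ingest(logs, llm, graph=None):
    """Distill raw logs into a cue -> tag -> content structure."""
    if graph is None:
        graph = {"cue_to_tags": {}, "tags": {}, "content": {}}
    for log in logs:
        reply = llm(EXTRACTION_PROMPT.format(log=log))
        try:
            record = json.loads(reply)
            cues, summary, detail = record["cues"], record["tag"], record["content"]
        except (json.JSONDecodeError, KeyError, TypeError):
            continue  # skip replies that are not the expected JSON
        n = len(graph["content"])
        tag_id, content_id = f"t{n}", f"m{n}"
        graph["content"][content_id] = detail
        graph["tags"][tag_id] = {"summary": summary, "content_ids": [content_id]}
        for cue in cues:
            graph["cue_to_tags"].setdefault(cue.lower(), []).append(tag_id)
    return graph


def fake_llm(prompt):
    """Stub model: returns valid JSON only for logs that mention a tournament."""
    if "tournament" in prompt:
        return json.dumps({
            "cues": ["Nate", "tournament"],
            "tag": "Nate wins a tournament",
            "content": "Placeholder episode text",
        })
    return "not json"


logs = ["Nate mentioned his tournament win.", "Small talk about the weather."]
graph = ingest(logs, fake_llm)
print(sorted(graph["cue_to_tags"]))  # ['nate', 'tournament']
```

In a deployment, a function like this would run on a schedule or behind a queue, so that graph construction stays off the query path.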

Words by Dr. Aris Thorne (Artificial Intelligence Researcher).

Senior Intel Analyst & Contributing Editor. Focused on deep-tier geopolitical and market strategies.