Systems Analysis · AI Agents · February 2026

Agent Design
Patterns

Context Engineering · Architecture · Emerging Standards · The Filesystem as Memory
Context is the Bottleneck
Based on Lance Martin (LangChain) · Jan 2026
01
The Core Problem
Why context management is the central challenge of agent design

2025 ended with Meta buying Manus for $2B+ and Claude Code reaching a $1B run rate. Agent task length doubles every 7 months. But there's a fundamental constraint: models degrade as context grows. Chroma calls it "context rot" — every token depletes an LLM's finite attention budget.

"Context engineering is the delicate art and science of filling the context window with just the right information for the next step." — Andrej Karpathy

Every design pattern in modern agents is ultimately a strategy for managing this scarce resource. The context window is small and expensive; the filesystem is large and cheap. The best agents constantly shuttle information between the two.

Exhibit A — The Context Degradation Problem
[Chart: model performance vs. token count] Performance declines as the context grows: focused context stays good; bloated tools + history start degrading around ~50–100K tokens; past ~200K the model is lost in noise (context rot).
Source: Chroma "Context Rot" research + Anthropic context engineering guidelines. Performance = task accuracy on multi-step agentic benchmarks.
Exhibit B — Where Context Budget Gets Consumed
Tool Defs (GitHub MCP alone): ~26K tokens
System Prompt (instructions + guardrails): ~8K
Chat History (grows linearly per turn): ~35K+
Tool Results (file reads, search results, API responses): ~45K+
Actual Task (what matters): ~8K
The irony: the actual task-relevant information is often <15% of the context window. Every pattern below exists to improve this ratio.
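The ratio is easy to check with quick arithmetic on the exhibit's rough figures (illustrative numbers from this article, not measurements):

```python
# Illustrative context-budget breakdown from Exhibit B.
budget = {
    "tool_defs": 26_000,      # e.g. a large MCP server's tool definitions
    "system_prompt": 8_000,   # instructions + guardrails
    "chat_history": 35_000,   # grows linearly per turn
    "tool_results": 45_000,   # file reads, search results, API responses
    "actual_task": 8_000,     # what actually matters
}

total = sum(budget.values())
task_share = budget["actual_task"] / total
print(f"total context: {total:,} tokens")
print(f"task-relevant share: {task_share:.1%}")  # well under 15%
```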
02
The Seven Patterns
A taxonomy of context management strategies across modern agents

Lance Martin identified seven recurring design patterns across Claude Code, Manus, Cursor, Amp Code, and others. Each targets a different aspect of the context management problem. Together, they form a coherent playbook.

Exhibit C — The Seven Patterns at a Glance
1 · 💻 Give Agents a Computer (Foundation)
Filesystem + shell = persistent context + unlimited action space. The agent "lives on your OS."

2 · 🎲 Multi-Layer Action Space (Token Savings)
Few atomic tools (~12–20); push everything else to code execution via shell/bash.

3 · 👁 Progressive Disclosure (Context Efficiency)
Show only essential info upfront. Reveal tool definitions, docs, and skills on demand.

4 · 📦 Offload Context (Memory Extension)
Write tool results and plans to files. Read back only when needed. Avoids lossy summarization.

5 · Cache Context (Cost Control)
Prompt caching is the "most important metric." A higher-cost model with caching beats a cheaper model without it.

6 · 🕵 Isolate Context (Scalability)
Sub-agents with their own context windows for parallel tasks. The "Ralph Wiggum" loop for long-running work.

7 · 🧠 Evolve Context (Continual Learning)
Learn from past sessions. Distill experiences into memories, update prompts, save reusable skills.
03
Pattern Architecture
How the seven patterns compose into a coherent agent system

These patterns aren't independent — they form layers of a system. The foundation is computer access. On top of that, you build action management, context management, and finally learning. Each layer addresses a different failure mode.

Exhibit D — The Agent Architecture Stack (Bottom = Foundation, Top = Intelligence)
Layer 6 · LEARNING: Evolve Context (memories · skills · diary entries → updated prompts)
Layer 5 · SCALE: Isolate Context (sub-agents · Ralph Wiggum loops · map-reduce)
Layer 4 · COST: Cache Context (prompt caching · cache hit rate optimization)
Layer 3 · CONTEXT MGMT: Offload + Progressive Disclosure (write results to files · reveal tools on demand · skills YAML)
Layer 2 · ACTIONS: Multi-Layer Action Space (~12 atomic tools → bash → CLIs → code execution, per CodeAct)
Layer 1 · FOUNDATION: Give Agents a Computer (filesystem + shell + persistent state)
Each layer depends on the one below it. You can't offload context without a filesystem. You can't isolate sub-agents without a shell. The foundation is computer access.
04
Who Uses What
Pattern adoption across leading agents as of early 2026

Different agents have adopted different subsets of these patterns. Claude Code and Manus are the most complete implementations. Cursor Agent pioneered progressive disclosure for MCP. No agent fully implements all seven yet.

Exhibit E — Pattern Adoption Matrix
Pattern Claude Code Manus Cursor Agent Amp Code Devin
Computer Access Local OS Virtual IDE + Shell Local OS Cloud VM
Multi-Layer Actions ~12 tools <20 tools ~30+ tools Curated few Medium set
Progressive Disclosure Skills CLI --help MCP folder sync
Offload Context Via filesystem Files + summary Trajectory files Workspace files
Cache Context Essential Top metric
Isolate Context Sub-agents Task agents Multi-agent
Evolve Context CLAUDE.md Rules + Skills Memory
Legend: full implementation · partial / emerging · not yet.
05
The Context Lifecycle
How context flows through a well-designed agent system

In a well-designed agent, context doesn't just accumulate linearly. It flows through a lifecycle: loaded on demand, used for the current step, offloaded to the filesystem, cached for reuse, and eventually distilled into persistent learnings.

Exhibit F — Context Flow in a Modern Agent
1. 📄 Task Arrives: the user request enters the system prompt.
2. 🔍 Disclose: load only relevant tools, skills, and files on demand.
3. Execute: use ~12 atomic tools; push actions to shell/code.
4. 📦 Offload: write results to files; keep the context window lean.
5. 🧠 Learn: distill into memories; update CLAUDE.md / skills.
For Sub-Tasks: Spawn sub-agents with isolated context. Each gets a focused slice. Communicate via git history or plan files. The "Ralph Wiggum" loop.
For Cost Control: Prompt caching allows resuming from a cached prefix. Manus says cache hit rate is their #1 metric. Append-only history preserves the cache; don't mutate earlier turns.
For Long-Running Tasks: Write a plan to a file. Re-read it periodically to reinforce objectives. Use stop hooks to verify work. Track progress via git.
The key insight: context is not static. It's a managed resource with a lifecycle — loaded, used, offloaded, cached, and evolved.
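The cache-preservation rule above can be made concrete. This is a minimal sketch (not any specific agent's implementation): history is append-only, so each request shares the longest possible byte-identical prefix with the previous one, which is what provider-side prompt caching keys on.

```python
import json

class AppendOnlyHistory:
    """Append-only conversation history: never edit or delete earlier
    turns, because mutating the prefix invalidates the cached portion
    on the next request. (Illustrative sketch only.)"""

    def __init__(self, system_prompt: str):
        self._messages = [{"role": "system", "content": system_prompt}]

    def append(self, role: str, content: str) -> None:
        self._messages.append({"role": role, "content": content})

    def serialized(self) -> str:
        return json.dumps(self._messages)

h = AppendOnlyHistory("You are a coding agent.")
prefix_before = h.serialized()
h.append("user", "Read plan.md and do task 1.")
# The previous serialization (minus its closing bracket) is a strict
# prefix of the new one, so the cache prefix survives the new turn.
assert h.serialized().startswith(prefix_before[:-1])
```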
06
Key Pattern Details
How the most impactful patterns work in practice
Exhibit G — Multi-Layer Action Space: The Tool Hierarchy
Layer 3: Code. Write & execute scripts; chain complex actions; process intermediate results locally. (∞ actions · 0 tokens)
↑ pushed down from
Layer 2: Shell. Built-in utilities; installed CLIs; package managers; git operations. (100s of commands · 0 tool tokens)
↑ pushed down from
Layer 1: Tools. Read file; write file; bash/shell; search; browser. (~12 tools · ~2K tokens in defs)
Claude Code uses ~12 tools. Manus uses <20. The key: a single bash tool can access hundreds of CLI commands without loading their definitions into context. CodeAct showed agents can chain complex actions through code execution, saving token budget.
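The hierarchy can be sketched in a few lines. This is a hypothetical tool set, not Claude Code's or Manus's actual tools: a handful of atomic functions, with everything else reached through a single bash tool whose one definition unlocks hundreds of commands.

```python
import subprocess

def read_file(path: str) -> str:
    with open(path) as f:
        return f.read()

def write_file(path: str, content: str) -> None:
    with open(path, "w") as f:
        f.write(content)

def bash(command: str) -> str:
    # One tool definition in context, hundreds of CLIs behind it:
    # git, grep, curl, package managers, --help for self-discovery.
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.stdout + result.stderr

# The whole registry costs a few hundred tokens of definitions, not ~26K.
TOOLS = {"read_file": read_file, "write_file": write_file, "bash": bash}

print(TOOLS["bash"]("echo hello from the shell").strip())  # hello from the shell
```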
Exhibit H — Progressive Disclosure: Three Strategies Compared
Strategy 1: Tool Indexing
How it works: Index tool definitions separately. Give the agent a search tool to find and load specific tools on demand.
Used by: LangGraph BigTool, Anthropic's advanced tool use pattern.
Saves: 10–30K tokens when MCP servers have 30+ tools. The agent loads only 1–3 tool defs per step instead of all 35.

Strategy 2: CLI Help Flags
How it works: List available utilities in the instructions. The agent calls --help on any CLI it needs to learn about.
Used by: Manus. Agent instructions mention available CLIs but don't load their full docs.
Saves: Unbounded. Hundreds of CLI tools available at zero upfront token cost.

Strategy 3: Skills / YAML Frontmatter
How it works: Each skill has a short summary in YAML frontmatter. The agent reads the full SKILL.md only when the task matches.
Used by: Anthropic's Skills standard, Cursor Agent's MCP folder sync.
Saves: Loads only ~50 bytes per skill upfront vs. the full knowledge base. The agent self-selects what to read.
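The frontmatter strategy is simple to sketch. The field names and skill content below are illustrative assumptions, not the exact Skills spec: only the short summary is loaded upfront; the body stays on disk until needed.

```python
import pathlib, tempfile

# A hypothetical skill file: short YAML frontmatter summary + full body.
SKILL_MD = """\
---
name: pdf-extraction
description: Extract text and tables from PDF files.
---
Full instructions the agent reads only on demand go here.
"""

def split_frontmatter(text: str) -> tuple[dict, str]:
    """Split '---'-delimited frontmatter from the body (naive parser,
    good enough for flat key: value pairs)."""
    _, header, body = text.split("---\n", 2)
    meta = dict(line.split(": ", 1) for line in header.strip().splitlines())
    return meta, body

skill_path = pathlib.Path(tempfile.mkdtemp()) / "SKILL.md"
skill_path.write_text(SKILL_MD)

meta, body = split_frontmatter(skill_path.read_text())
# Upfront cost: just the summary line, not the whole file.
print(f"{meta['name']}: {meta['description']}")
# `body` is read into context only when the task matches the summary.
```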

Exhibit I — The Ralph Wiggum Loop: Context Isolation for Long-Running Agents
Step 1: Initializer Agent. Creates the plan file, a tracking file, and the environment setup.
Step 2 (repeating loop): Sub-Agents A through N. Each gets a fresh context window, reads the plan file, and completes one task.
Step 3: Stop Hook Verification. Check the plan file's completion status; if incomplete, loop. Progress is tracked via git history.
Named by Geoffrey Huntley. Used by Claude Code with stop hooks. Each sub-agent gets a fresh context window — the filesystem is the shared memory. Anthropic calls this the "effective harness for long-running agents."
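The loop's skeleton can be sketched as follows. The plan format and helpers here are hypothetical; real implementations coordinate through git and spawn actual sub-agent processes, but the shape is the same: fresh context per iteration, filesystem as shared memory, stop hook as the exit condition.

```python
import pathlib, tempfile

workdir = pathlib.Path(tempfile.mkdtemp())
plan = workdir / "plan.md"
plan.write_text("[ ] task 1\n[ ] task 2\n[ ] task 3\n")

def run_subagent(plan_path: pathlib.Path) -> None:
    """Each sub-agent starts cold: its only input is the plan file on
    disk. It completes one unchecked task and marks it done."""
    lines = plan_path.read_text().splitlines()
    for i, line in enumerate(lines):
        if line.startswith("[ ]"):
            # ... the sub-agent's actual work would happen here ...
            lines[i] = line.replace("[ ]", "[x]", 1)
            break
    plan_path.write_text("\n".join(lines) + "\n")

def stop_hook(plan_path: pathlib.Path) -> bool:
    """Verification step: only stop once every task is checked off."""
    return "[ ]" not in plan_path.read_text()

iterations = 0
while not stop_hook(plan):
    run_subagent(plan)  # fresh context window per iteration
    iterations += 1

print(iterations)  # 3 sub-agent runs, one per task
```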
07
Pattern Maturity
Where each pattern stands today — from battle-tested to experimental
Exhibit J — Pattern Maturity Assessment
Computer Access (MATURE): Universal across top agents, local or virtual machine. Table stakes for 2026. Open question: security boundaries; how much OS access is safe?
Multi-Layer Actions (MATURE): CodeAct validated. Claude Code, Manus, and Amp all use ~12–20 tools + shell. Open question: optimal tool count? Still hand-tuned per agent.
Progressive Disclosure (EMERGING): Cursor leads with MCP folder sync. Anthropic's Skills standard is gaining traction. Open question: standards for tool discovery? Who decides what's relevant?
Offload Context (EMERGING): Manus writes to files. Cursor offloads trajectories. Better than summarization. Open question: when to offload vs. summarize vs. drop? No consensus.
Cache Context (MATURE): Prompt caching is universal. Manus optimizes cache hit rate as its primary metric. Open question: how to mutate context without breaking cache prefixes?
Isolate Context (EMERGING): The Ralph Wiggum loop works. Sub-agents for parallel review. Gas Town for swarms. Open question: coordination protocols for multi-agent conflict resolution?
Evolve Context (NASCENT): CLAUDE.md updates, diary entries, skill distillation. Mostly manual today. Open question: can agents automatically learn what to remember?
08
Open Challenges
Three frontiers that will define the next generation of agents

The seven patterns are the current state of the art. But three fundamental challenges remain unsolved. Each could reshape agent architecture over the next 12–18 months.

Exhibit K — The Three Open Frontiers
Frontier 1 · Learned Context Management
The Bitter Lesson prediction: compute scaling overtakes hand-crafted scaffolding. Models may learn to manage their own context — when to summarize, what to offload, which sub-agents to spawn.
Related: Recursive Language Models (RLM) · Prime Intellect research · Sleep-time compute · Self-reflecting memories
"Much of the prompting or scaffolding packed into agent harnesses might get absorbed by models."
Frontier 2 · Multi-Agent Coordination
Parallel agents make conflicting decisions with no shared visibility, and they can't engage in the proactive discourse human teams use.
Related: Gas Town (Yegge) · Mayor agent pattern · Git merge queues
Frontier 3 · Long-Running Agent Infra
No standards for agent observability, debugging interfaces, or human-in-the-loop monitoring. Current patterns are rough.
Related: Stop hooks · Git-based tracking · Agent observability
"Imagine if agents automatically reflect over their past sessions and use this to update their own memories or skills. Also, agents might directly reflect over their own memories accumulated over time to better consolidate them and prepare for future tasks, just as we do." — Lance Martin
The Meta-Pattern
Core Principle
Filesystem
= Memory

The context window is small and expensive. The filesystem is large and cheap. The best agents shuttle information between the two.

Pattern Count
7

Distinct strategies for managing context — from computer access to continual learning. Together they form the playbook.

Horizon
12–18mo

The estimated window before models may learn to manage their own context. The Bitter Lesson suggests hand-crafted scaffolding gets absorbed into the models themselves.

What's Working Now

Give agents a computer. Use ~12 tools + shell for unlimited actions. Offload context to files. Cache aggressively. Isolate sub-agents for parallel work. These are battle-tested and universal across top agents in early 2026.

What's Still Unsolved

Automatic context evolution (memories that update themselves). Multi-agent conflict resolution (parallel agents stepping on each other). Agent observability standards (we can't debug what we can't see). The next wave of agent infrastructure will target these.


Key People & References

Lance Martin — LangChain, original blog post
Andrej Karpathy — "Context engineering" framing
Barry Zhang & Erik Schluntz — Anthropic agent definition
Peak Ji — Manus action space hierarchy
Boris Cherny — Claude Code internals
Geoffrey Huntley — "Ralph Wiggum" loop
Steve Yegge — Gas Town multi-agent coordination

Key Papers & Posts

CodeAct — Agents chaining actions via code execution
Context Rot — Chroma research on degradation
RLM — Recursive Language Models (Prime Intellect)
Sleep-Time Compute — Offline agent reflection
GEPA — Evolving task-specific prompts
Anthropic Skills Standard — Progressive disclosure
Manus Context Engineering — Cache hit rate optimization

Deep dive based on Lance Martin's "Agent Design Patterns" blog post (January 2026), with synthesis from Anthropic, Manus, Cursor, and Chroma engineering blogs. Feb 2026.