Agent Architecture · Tool Design · February 2026

Seeing Like an Agent

A first-principles walkthrough of tool design, action spaces, and progressive disclosure — from the lessons of building Claude Code.

One of the hardest parts of building an agent harness is constructing its action space — the set of tools and capabilities the model can invoke. Too few tools and the agent is crippled. Too many and it's paralyzed by choice.

"You want to give it tools that are shaped to its own abilities. But how do you know what those abilities are? You pay attention, read its outputs, experiment."
— Claude Code Team, Anthropic
01
Foundation

The Tool Power Spectrum

[diagram: the spectrum from low to high power. Paper: manual only, no tools at all, plain-text output. Calculator: structured tools (JSON in, JSON out), powerful but specific. Computer: bash / code execution, unlimited composability, highest ceiling. Skill requirement increases with power.]
fig. 1 — power vs. skill tradeoff in tool design

Paper = No Tools

The model just outputs text. It can reason but can't act on the world. Like solving math with just pen and paper.

Calculator = Defined Tools

Structured tool calls (search, file_read, API). Powerful but each call round-trips through context — the "composition tax."
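To make the "calculator" rung concrete, here is a minimal sketch of a structured tool: a JSON schema the model sees, and a dispatcher that fulfills the call. The tool name and field layout are illustrative assumptions, not Claude's actual API schema.

```python
import json

# Hypothetical structured tool definition (names and fields are illustrative,
# not the real Claude tool-use schema).
file_read_tool = {
    "name": "file_read",
    "description": "Read a file and return its contents.",
    "input_schema": {
        "type": "object",
        "properties": {"path": {"type": "string"}},
        "required": ["path"],
    },
}

def handle_tool_call(call: dict) -> str:
    """Dispatch a structured tool call: JSON in, JSON out."""
    if call["name"] == "file_read":
        with open(call["input"]["path"]) as f:
            return json.dumps({"content": f.read()})
    return json.dumps({"error": f"unknown tool {call['name']}"})
```

The fixed schema is what makes the call reliable for the harness; it is also what forces every result back through the context window.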

Computer = Code Execution

Bash, Python, or programmatic tool calling (PTC). The agent writes code that orchestrates tools directly. Highest ceiling, but the model must know how to code.

02
The Core Problem

The Composition Tax

[diagram: traditional tool calling. Claude issues Tool Call 1 → full result enters context → reasoning → Tool Call 2 → full result enters context → reasoning → Tool Call 3; each call adds latency, tokens, and a reasoning step.]

Every tool call pays a tax

In traditional tool calling, each action round-trips through Claude's context window. The result gets serialized (even thousands of rows when you only need five), triggers a new reasoning step, and adds latency.

With 3 sequential tool calls, you're paying 3× the latency, 3× the context bloat, and 3 full reasoning steps.
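The tax is easy to see in a sketch of the traditional loop. Both `call_model` and `run_tool` here are stand-in stubs, not a real API; the point is the shape of the loop, where every full result lands in context and every call triggers a fresh reasoning pass.

```python
# Sketch of the traditional tool-calling loop. Every tool result is
# serialized back into the model's context, and each round trip triggers
# a fresh reasoning step. `call_model` and `run_tool` are stubs.

def run_tool(name: str, args: dict) -> list[dict]:
    # Stub: pretend every call returns 1,000 rows even when 5 are needed.
    return [{"row": i} for i in range(1000)]

def call_model(context: list) -> dict:
    # Stub for a model reasoning step; returns the next tool call.
    return {"tool": "search", "args": {"q": "..."}}

context: list = [{"role": "user", "content": "find five matching rows"}]
reasoning_steps = 0
for _ in range(3):                       # three sequential tool calls
    action = call_model(context)         # full reasoning pass over context
    reasoning_steps += 1
    result = run_tool(action["tool"], action["args"])
    context.append({"role": "tool", "content": result})  # full result into context

# 3 calls: 3 reasoning passes, and 3,000 serialized rows sitting in context.
```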

"The composition tax grows with the number of actions. This is the fundamental tension in tool design."
TAKEAWAY → This is why "just add more tools" doesn't scale. Each tool adds cognitive and computational overhead.
03
Programmatic Tool Calling

Compose in Code, Not in Context

[diagram: programmatic tool calling (PTC). Claude writes code that runs in a code execution container: await search(q1) → result returns to the code → filter(results) → await search(q2) → return summary. Only the final output reaches context; no round trips.]

Code as the orchestration layer

With PTC, Claude writes code that calls tools as functions inside a container. Intermediate results stay in the code — they never bloat the context window.

The container pauses when a tool is invoked, the call crosses the sandbox boundary, gets fulfilled externally, and the result returns to the running code.
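A minimal sketch of what that agent-written code can look like, assuming tools are exposed to the sandbox as async functions (the `search` stub stands in for a real tool crossing the boundary):

```python
import asyncio

# PTC-style orchestration sketch: tools are called as functions inside the
# container, intermediate results stay in local variables, and only the
# final return value reaches the model's context. `search` is a stub.

async def search(query: str) -> list[dict]:
    # Stub: in a real container this call crosses the sandbox boundary,
    # is fulfilled externally, and the result returns to the running code.
    return [{"title": f"{query} result {i}", "score": i} for i in range(50)]

async def agent_code() -> list[str]:
    results = await search("composition tax")           # 50 raw results stay here
    relevant = [r for r in results if r["score"] > 47]  # filter in code, not context
    more = await search(relevant[0]["title"])           # chain a second call freely
    return [r["title"] for r in relevant + more[:2]]    # only this summary returns

summary = asyncio.run(agent_code())  # the single thing Claude ever sees
```

The 50-result lists never leave the container; the composition happens in code, so adding a fourth or fifth tool call costs no extra context or reasoning passes.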

"Rather than pulling 50 raw search results into context, the code can parse, filter, and cross-reference results programmatically. This keeps what's relevant and discards the rest."
RESULT → +11% accuracy, −24% input tokens. Opus 4.6 with PTC is #1 on LMArena Search Arena.
04
Elicitation Design

Finding the Sweet Spot

[diagram: spectrum from no structure to too rigid. Modified Markdown output: model free but messy, hard to format ✗. AskUserQuestion tool: structured and composable, clear UI surface ✓ (sweet spot). ExitPlanTool parameter: plan already formed, questions come too late ✗. Key insight: even the best tool fails if the model doesn't like calling it.]
fig. 4 — three attempts at elicitation in claude code

The Claude Code team tried three approaches to get Claude to ask better clarifying questions. Modified markdown was too loose (Claude broke format). ExitPlanTool parameter was too rigid (questions came after the plan was already made). The AskUserQuestion tool hit the sweet spot — structured enough for reliable UI, flexible enough that Claude actually liked calling it.
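A sketch of why the structured approach wins: a fixed call shape gives the harness something it can always render. The field names below are assumptions for illustration, not the actual AskUserQuestion schema.

```python
# Illustrative shape of a structured clarifying-question tool call, in the
# spirit of AskUserQuestion. Field names are assumptions, not the real schema.

question_call = {
    "name": "AskUserQuestion",
    "input": {
        "question": "Which database should the migration target?",
        "options": [
            {"label": "PostgreSQL", "description": "current production DB"},
            {"label": "SQLite", "description": "local development only"},
        ],
        "allow_free_text": True,
    },
}

def render(call: dict) -> str:
    """The harness can render this reliably because the structure is fixed."""
    q = call["input"]
    lines = [q["question"]]
    lines += [f"  [{i + 1}] {o['label']}" for i, o in enumerate(q["options"])]
    return "\n".join(lines)
```

Free-form markdown gives the model the same expressive room but no guaranteed structure to render; a plan-tool parameter has the structure but arrives after the decisions are made.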

· · ·
05
Capability Drift

Tools That Helped Become Constraints

[diagram: utility vs. model strength. TodoWrite plus 5-turn reminders peaks in utility for weaker models, then falls into the constraint zone as models strengthen; at the crossover it is replaced by the Task Tool for inter-agent coordination.]

The TodoWrite → Task evolution

Early Claude needed a Todo list + reminders every 5 turns to stay on track. But as models improved, the reminders became a cage — Claude felt it had to stick to the list instead of adapting.

Opus 4.5 got better at subagents, but TodoWrite couldn't handle inter-agent coordination. The Task Tool replaced it with dependencies, shared updates, and deletable tasks.

"As model capabilities increase, the tools that your models once needed might now be constraining them. It's important to constantly revisit previous assumptions."
TAKEAWAY → Schedule regular "tool audits." What helped your weak model may be hurting your strong one.
06
Search Design

From RAG to Self-Built Context

[diagram: evolution over one year. RAG database: context given to Claude; fragile, needs indexing (passive). Grep tool: Claude searches itself and builds its own context (active search). Agent Skills: recursive file discovery, files reference files (progressive disclosure). Nested multi-layer: API docs, DB queries, skill → skill → file (full autonomy).]

Key shift: don't give the agent context; give it tools to find context. As Claude gets smarter, it becomes increasingly good at building its own context.
fig. 6 — context acquisition evolution in claude code
07
Key Pattern

Progressive Disclosure

[diagram: adding capability without adding a tool. The system prompt contains a link to SKILL.md, which references api_docs.md (endpoints), auth.md, and db_schema.md (tables, queries). The context window only loads what's needed, when needed.]

The tree, not the library

Instead of stuffing all knowledge into the system prompt or a RAG index, progressive disclosure lets the agent explore a tree of files. Each file references others. The agent only loads what's relevant.

This is how Claude Code added self-documentation without adding a tool. A "Guide" subagent follows links, searches docs, and returns just the answer.
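A guide-style traversal can be sketched in a few lines: start from a root skill file and load only the files it actually links to. The file names and the markdown-style link syntax are illustrative assumptions, not Claude Code's implementation.

```python
import re
from pathlib import Path

# Progressive-disclosure sketch: follow markdown-style links like
# (api_docs.md) from a root SKILL.md, loading each referenced file once.
# File layout and link syntax are illustrative assumptions.

def load_linked_docs(root: Path, max_files: int = 10) -> dict[str, str]:
    """Walk the reference tree breadth-first; unreferenced files stay on disk."""
    loaded: dict[str, str] = {}
    queue = [root]
    while queue and len(loaded) < max_files:
        doc = queue.pop(0)
        if doc.name in loaded or not doc.exists():
            continue
        text = doc.read_text()
        loaded[doc.name] = text
        for ref in re.findall(r"\(([\w/]+\.md)\)", text):
            queue.append(doc.parent / ref)
    return loaded
```

Only the followed path enters context; a file nobody links to is never loaded, which is exactly the "tree, not library" property.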

"Progressive disclosure is now a common technique we use to add new functionality without adding a tool."
RULE → Claude Code has ~20 tools. The bar to add a new one is high. Before adding tool #21, ask: can progressive disclosure handle this?
08
Decision Framework

When to Add a Tool vs. Not

Does the agent need a new capability?
→ NO: don't touch.
→ YES: does the call need to be caught by the harness?
    → NO: progressive disclosure (skills, subagents, linked docs).
    → YES: does it need guardrails or special UI rendering?
        → YES: dedicated tool (file_edit, AskUser).
        → NO: bash / PTC (code handles it).
fig. 8 — decision tree for adding tools to your agent
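The same decision tree, written out as a function. The three boolean flags paraphrase the figure's questions, and the return values name its four outcomes:

```python
# The decision tree from fig. 8 as a function. Flag names paraphrase the
# figure's questions; return values name its four outcomes.

def tool_decision(needs_capability: bool,
                  harness_must_catch: bool,
                  needs_guardrails_or_ui: bool) -> str:
    if not needs_capability:
        return "don't touch"
    if not harness_must_catch:
        return "progressive disclosure"  # skills, subagents, linked docs
    if needs_guardrails_or_ui:
        return "dedicated tool"          # e.g. file_edit, AskUser
    return "bash / PTC"                  # code handles it
```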
· · ·
09
Cognitive Load

The 20-Tool Ceiling

[chart: agent performance vs. number of tools. Performance peaks at a sweet spot around 20 tools; too few leaves the agent limited, too many causes choice paralysis.]

More tools ≠ more capability

Claude Code operates with ~20 tools. Each additional tool gives the model one more option to reason about — increasing decision complexity and the chance of misuse.

Before adding tool #21, the team asks: can this be handled by progressive disclosure (skills, docs, subagents) instead?

"The bar to add a new tool is high, because this gives the model one more option to think about."
ANALOGY → Think of portfolio diversification: the 21st holding rarely improves a well-constructed portfolio. Same with tools — quality of curation beats raw quantity.
10
Complete Mental Model

The Agent Action Space

[diagram: Claude as the reasoning core, surrounded by its action space: Bash and PTC (code execution, composability), File Edit (guardrailed), Grep (context building), AskUser (elicitation), Task (coordination), Skills and linked docs, Subagents (guide agent); grouped by composability, guardrails, context building, and user interaction.]
fig. 10 — claude code's complete action space architecture

Every tool in Claude Code's ~20-tool set serves one of four roles: composability (bash, PTC for code execution), guardrails (file_edit with staleness checks), context building (grep, skills, subagents), or user interaction (AskUser for elicitation). Progressive disclosure — skills, subagents, linked docs — extends the action space without adding tools.