type: rule
status: active
timestamp: 2026-06-28
tags: [sub-agents, token-reduction, delegation, agent-behavior, hard-rule]

Delegate to sub-agents by default — researcher for reads, Haiku for batch

ACTIVE every response. Use sub-agent before reading 3+ files. Isolated context; only summary returns. Cuts tokens 40-70%

Delegate to sub-agents by default

ACTIVE EVERY RESPONSE. Token-reduction discipline at the orchestration layer.

The rule

For any of these patterns, dispatch a sub-agent instead of doing it in the main thread:

PatternSub-agentWhy
Reading 3+ files to answer a questionresearcherPinned to Haiku — 5× cheaper. Returns paragraph summary.
grep / Glob across the repo for a symbolresearcherSame
”Where is X defined?” / “What calls Y?”researcher or ExploreRead-only, fast
Multi-step build (scaffold + commit + deploy)general-purposeTool calls don’t bloat main context
Architecture planning across 5+ filesPlanReturns plan only, not file dumps
Claude Code / Anthropic API Q&Aclaude-code-guideHas WebFetch + docs access

When to skip sub-agents (stay in main thread)

Output discipline

When delegating, the sub-agent prompt MUST specify:

  1. What success looks like (one-line goal)
  2. Return format (terse summary, paragraph not raw dump)
  3. Working dir (absolute path, never /tmp on Windows Git Bash — use C:/D/oriz/.staging/<task>/)
  4. Hard constraints (no branches, no emoji, ponytail/caveman active)

Anti-patterns

The cost math

Reading 10 files of 200 lines each in main thread: ~20K input tokens consumed.

Delegating to researcher: ~500 tokens out (the prompt) + ~300 tokens back (the summary). ~95% savings on that operation, plus the sub-agent runs on Haiku at 1/5 the price.

Cross-refs


Edit on GitHub · Back to index