Built 26/04/17 09:39, commit 8de3d61
Agent Harness Design
Summary
Agent harness design is the practice of adding just enough orchestration around a model to keep long-running work coherent, verifiable, and revisable, while continuously re-testing whether each piece of scaffolding is still necessary.
Core Patterns
- A planner expands short prompts into a richer product or task spec so the implementation agent does not under-scope the job.
- A generator performs the substantive build work, usually against explicit contracts or structured deliverables.
- An evaluator checks the output with independent criteria and tooling, producing actionable feedback instead of self-congratulatory review.
- Structured artifacts and handoff files preserve state across long runs, context resets, or agent boundaries; these can stay lightweight, such as session logs, PRDs, codemaps, or review notes.
- A minimal monolithic loop can be a legitimate harness too: keep one fresh-context agent focused on one important item per pass, and externalize memory into repo artifacts instead of adding premature orchestration.
- Repository legibility is part of the harness, not a separate concern: plans, docs, tools, and review loops all shape what the agent can reliably do.
- Subagents and session controls help only when the work decomposes cleanly; otherwise they add token and coordination overhead faster than they add leverage.
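The planner, generator, and evaluator roles above can be sketched as one minimal loop with a structured handoff artifact carrying state between passes. This is an illustrative skeleton, not any specific product's API; `call_model` is a stub standing in for a real model call, and the role names and prompt shapes are assumptions.

```python
from dataclasses import dataclass, field


def call_model(role: str, prompt: str) -> str:
    """Stub standing in for a real model call; wire this to your provider."""
    if role == "evaluator":
        return "PASS: meets the spec"  # a real evaluator applies independent criteria
    return f"[{role} output for: {prompt.splitlines()[0]}]"


@dataclass
class Artifact:
    """Structured handoff state that survives context resets or agent boundaries."""
    spec: str = ""
    build: str = ""
    feedback: list[str] = field(default_factory=list)


def run_harness(task: str, max_passes: int = 3) -> Artifact:
    art = Artifact()
    # Planner: expand the short prompt into a richer spec so the
    # implementation step does not under-scope the job.
    art.spec = call_model("planner", f"Expand this into a full task spec:\n{task}")
    for _ in range(max_passes):
        # Generator: build against the explicit spec plus prior feedback.
        art.build = call_model(
            "generator",
            f"Spec:\n{art.spec}\nPrior feedback:\n{art.feedback}",
        )
        # Evaluator: a separate role with its own criteria, not self-review.
        verdict = call_model("evaluator", f"Grade against the spec:\n{art.build}")
        if verdict.startswith("PASS"):
            break
        art.feedback.append(verdict)
    return art
```

The monolithic-loop variant mentioned above is the same loop with the role boundaries collapsed and `Artifact` written to repo files instead of held in memory.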
Evaluation Lessons
- Self-evaluation is systematically lenient; separate evaluators are easier to calibrate toward skepticism.
- Evaluation works better when subjective judgments are translated into concrete criteria.
- Interactive verification tools matter because screenshots and static inspection miss behavioral defects.
- Verification loops get stronger when cheap graders, transcript reading, pass-rate metrics, and human review are treated as distinct layers instead of one all-purpose check.
- When older benchmarks saturate, open-ended real-world tasks plus a final confirmation pass become a better capability readout than repeated replay exercises.
- High-throughput agent teams often converge on many small PRs, squash-heavy merge policies, and explicit review agents because the merge queue becomes the real bottleneck before raw implementation speed does.
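Translating subjective judgments into concrete criteria, and treating cheap graders and pass-rate metrics as distinct layers, can be sketched as below. The criteria, helper names, and sample outputs are all illustrative.

```python
# Layered verification sketch: cheap deterministic graders run per output,
# pass-rate metrics aggregate across runs; transcript reading and human
# review would sit above these as further, separate layers.

def grade_output(output: str, criteria: list[tuple[str, callable]]) -> dict:
    """Apply named, concrete criteria instead of one subjective judgment."""
    return {name: bool(check(output)) for name, check in criteria}


def pass_rate(results: list[dict]) -> float:
    """Fraction of runs where every criterion passed."""
    if not results:
        return 0.0
    return sum(all(r.values()) for r in results) / len(results)


# Illustrative criteria for a written deliverable.
criteria = [
    ("has_title", lambda o: o.lstrip().startswith("#")),
    ("mentions_tests", lambda o: "test" in o.lower()),
    ("under_length_cap", lambda o: len(o) < 2000),
]

runs = ["# Plan\nRun the tests.", "no heading here"]
results = [grade_output(r, criteria) for r in runs]
```

Keeping each criterion named makes the evaluator's feedback actionable: the generator sees which check failed rather than a holistic verdict.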
Simplification Heuristic
- Every harness component encodes an assumption about what the base model cannot yet do well.
- As models improve, previously essential constructs such as sprint decomposition or repeated QA passes may become unnecessary overhead.
- Simplification should be methodical: remove one component at a time and inspect what quality or reliability was lost.
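The one-component-at-a-time heuristic amounts to an ablation study over the harness. A minimal sketch, in which `evaluate` and the component names are stand-ins for a real benchmark run over a given harness configuration:

```python
def evaluate(components: frozenset) -> float:
    """Stand-in scorer; in practice, run the eval suite with this config."""
    # Hypothetical contributions per component, for illustration only.
    weights = {"planner": 0.3, "qa_pass": 0.05, "sprint_decomposition": 0.0}
    return 0.5 + sum(weights.get(c, 0.0) for c in components)


def ablation_report(components: set[str]) -> dict[str, float]:
    """Quality delta from removing each component alone, holding others fixed."""
    baseline = evaluate(frozenset(components))
    return {
        c: baseline - evaluate(frozenset(components - {c}))
        for c in components
    }


report = ablation_report({"planner", "qa_pass", "sprint_decomposition"})
# Components with near-zero delta are candidates for removal.
```

Removing components one at a time, rather than in batches, keeps each measured delta attributable to a single assumption about what the model cannot yet do.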
Environment Design Lessons
- Harness quality depends on how much of the product and runtime are directly legible to the agent.
- Short routing documents plus indexed deeper documentation scale better than one giant instruction file.
- Architecture boundaries, custom lints, and repo-local plans are part of the control system that keeps autonomous work coherent.
- Hooks, commands, and reusable skills are also part of the environment layer because they can enforce repeated checks without making every prompt longer.
- Model upgrades can require harness retuning too: stronger literal instruction following, better file-system memory, higher-resolution vision, and new effort or review controls all change how prompts, budgets, and verification loops should be set.
- Harnesses can also be shipped as reusable repository bundles with scripts, skills, and plugin metadata, not just reconstructed from prose guidance.
- Hosted meta-harnesses are a real design option too: some teams should own only the task contract and environment policy, while buying the loop, session durability, and tool runtime as managed primitives.
- Team-facing managed-agent platforms add another environment layer beyond repo-local scaffolding: issue boards, daemon-attached runtimes, runtime routing, and assignable agent identities can all become part of the harness surface that coordinates long-running work.
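Hooks that enforce repeated checks without lengthening every prompt can be as simple as a command table keyed by event. The event name and the lint/test commands below are assumptions about a hypothetical repo, not any specific platform's hook API:

```python
import subprocess

# Commands registered per lifecycle event; failures are fed back to the
# agent as text instead of being restated as instructions in each prompt.
HOOKS = {
    "post_edit": [
        ["ruff", "check", "."],          # assumes the repo lints with ruff
        ["pytest", "-q", "--maxfail=1"],  # assumes a pytest suite exists
    ],
}


def run_hooks(event: str) -> list[str]:
    """Run all commands registered for an event; return failure messages."""
    failures = []
    for cmd in HOOKS.get(event, []):
        proc = subprocess.run(cmd, capture_output=True, text=True)
        if proc.returncode != 0:
            failures.append(f"{' '.join(cmd)} failed:\n{proc.stdout}{proc.stderr}")
    return failures
```

Because the checks live in the environment layer, a model upgrade that changes instruction-following behavior does not require rewriting them, only retuning when they fire.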
Interface Design Lessons
- Stable interfaces can outlast one concrete harness implementation, just as operating-system abstractions outlast hardware generations.
- Session durability should be modeled separately from the model's active context window, so recovery and context management do not collapse into one irreversible mechanism.
- Decoupling the brain, the hands, and the session makes failure recovery, security boundaries, and scaling behavior easier to reason about than they are in a single all-in-one container.
- Treat credentials as structurally outside the sandbox where generated code runs; this is stronger than assuming the model will always respect narrower scopes.
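The brain/hands/session decoupling can be written down as explicit interfaces. The protocol names and methods below are illustrative, not a real platform API; the in-memory session shows durability modeled apart from the active context window, so recovery replays the log rather than depending on live model state.

```python
from typing import Protocol


class Brain(Protocol):
    """Model reasoning over a context; swappable across model upgrades."""
    def step(self, context: str) -> str: ...


class Hands(Protocol):
    """Sandboxed tool execution; credentials live outside this boundary,
    so generated code cannot reach them even if instructions are ignored."""
    def execute(self, action: str) -> str: ...


class Session(Protocol):
    """Durable record of the run, independent of the active context window."""
    def append(self, event: str) -> None: ...
    def replay(self) -> list[str]: ...


class InMemorySession:
    """Minimal session: recovery rebuilds context by replaying the log.
    A production version would persist events outside the agent process."""
    def __init__(self) -> None:
        self.events: list[str] = []

    def append(self, event: str) -> None:
        self.events.append(event)

    def replay(self) -> list[str]:
        return list(self.events)
```

Keeping these three interfaces separate is what lets one concrete harness implementation be replaced without invalidating the contract the rest of the system depends on.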
When This Matters
- Long-running coding tasks where coherence degrades over time.
- Subjective domains such as design, where quality must be made gradable.
- Product builds that need both ambitious planning and skeptical final verification.
- Repositories where agents are expected to open, review, and merge changes with limited human intervention.
Sources
- Anthropic Harness Design For Long-Running Application Development
- Affaan Mustafa Claude Code Shorthand Guide
- Affaan Mustafa Claude Code Longform Guide
- Claude Mythos Preview Cybersecurity Assessment
- Claude Mythos Preview System Card
- Claude Managed Agents Overview
- Scaling Managed Agents: Decoupling The Brain From The Hands
- OpenAI Harness Engineering In An Agent-First World
- Codex Best Practices
- Codex Subagents
- Ralph Wiggum Loop Technique
- Ralph Step-By-Step Guide
- Ralph GitHub Repository
- Karpathy Claude Coding Thread
- Forrestchang Andrej Karpathy Skills Repository
- gstack
- Claude Code Best Practice Tips Compendium
- Multica
- Introducing Claude Opus 4.7