Built 26/04/17 09:39, commit 8de3d61

Agent Harness Design

Summary

Agent harness design is the practice of adding just enough orchestration around a model to keep long-running work coherent, verifiable, and revisable, while continuously re-testing whether each piece of scaffolding is still necessary.

Core Patterns

  • A planner expands short prompts into a richer product or task spec so the implementation agent does not under-scope the job.
  • A generator performs the substantive build work, usually against explicit contracts or structured deliverables.
  • An evaluator checks the output with independent criteria and tooling, producing actionable feedback instead of self-congratulatory review.
  • Structured artifacts and handoff files preserve state across long runs, context resets, or agent boundaries; these can stay lightweight, such as session logs, PRDs, codemaps, or review notes.
  • A minimal monolithic loop can be a legitimate harness too: keep one fresh-context agent focused on one important item per pass, and externalize memory into repo artifacts instead of adding premature orchestration.
  • Repository legibility is part of the harness, not a separate concern: plans, docs, tools, and review loops all shape what the agent can reliably do.
  • Subagents and session controls help only when the work decomposes cleanly; otherwise they add token and coordination overhead faster than they add leverage.
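The planner/generator/evaluator split with externalized memory can be sketched as a small loop. This is a hypothetical illustration, not any specific framework's API: the `plan`, `build`, and `evaluate` functions and the `Artifact` handoff file stand in for real model calls and repo artifacts.

```python
from dataclasses import dataclass, field

@dataclass
class Artifact:
    """Externalized memory (spec, output, review notes) that survives
    context resets and agent boundaries."""
    spec: str = ""
    output: str = ""
    review_notes: list[str] = field(default_factory=list)

def plan(prompt: str) -> str:
    # Planner: expand a short prompt into a richer spec so the
    # implementation step does not under-scope the job.
    return f"SPEC: {prompt}; must include tests and docs"

def build(spec: str) -> str:
    # Generator: substantive work against the explicit spec (stub here).
    return f"BUILD({spec})"

def evaluate(output: str, spec: str) -> tuple[bool, str]:
    # Evaluator: independent criterion, not self-review. Here the
    # criterion is simply "the output addresses the original task".
    task = spec.split("SPEC: ")[1].split(";")[0]
    ok = task in output
    return ok, "ok" if ok else "output does not address the spec"

def run_harness(prompt: str, max_passes: int = 3) -> Artifact:
    art = Artifact(spec=plan(prompt))
    for _ in range(max_passes):
        art.output = build(art.spec)
        ok, note = evaluate(art.output, art.spec)
        art.review_notes.append(note)  # feedback persists across passes
        if ok:
            break
    return art

art = run_harness("add retry logic to the uploader")
```

Because state lives in the `Artifact` rather than in any one agent's context, a fresh-context agent can pick up the same loop mid-run, which is the minimal monolithic variant described above.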

Evaluation Lessons

  • Self-evaluation is systematically lenient; separate evaluators are easier to calibrate toward skepticism.
  • Evaluation works better when subjective judgments are translated into concrete criteria.
  • Interactive verification tools matter because screenshots and static inspection miss behavioral defects.
  • Verification loops get stronger when cheap graders, transcript reading, pass-rate metrics, and human review are treated as distinct layers instead of one all-purpose check.
  • When older benchmarks saturate, open-ended real-world tasks plus a final confirmation pass become a better capability readout than repeated replay exercises.
  • High-throughput agent teams often converge on many small PRs, squash-heavy merge policies, and explicit review agents because the merge queue becomes the real bottleneck before raw implementation speed does.
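The layering point can be made concrete with a toy verification stack, assuming three invented layers: a cheap objective grader, an aggregate pass-rate metric, and a human-escalation rule. All names and thresholds are illustrative.

```python
def cheap_grader(output: str) -> bool:
    # Layer 1: fast, objective gate with concrete criteria,
    # instead of a lenient "looks good" self-review.
    return bool(output) and "TODO" not in output

def pass_rate(outputs: list[str]) -> float:
    # Layer 2: aggregate metric across runs, not a single anecdote.
    results = [cheap_grader(o) for o in outputs]
    return sum(results) / len(results)

def needs_human_review(output: str, rate: float) -> bool:
    # Layer 3: escalate to transcript reading / human review only
    # when the cheaper layers are failing or inconclusive.
    return not cheap_grader(output) or rate < 0.8

runs = ["fix applied", "fix applied", "TODO: finish"]
rate = pass_rate(runs)
```

Keeping the layers distinct makes each one separately tunable: the cheap grader can be made stricter without touching the escalation threshold, and vice versa.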

Simplification Heuristic

  • Every harness component encodes an assumption about what the base model cannot yet do well.
  • As models improve, previously essential constructs such as sprint decomposition or repeated QA passes may become unnecessary overhead.
  • Simplification should be methodical: remove one component at a time and inspect what quality or reliability was lost.
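The one-at-a-time removal heuristic amounts to a simple ablation loop. In this sketch the component set and the `quality` scores are invented stand-ins for real (expensive) evaluation runs; the point is only the shape of the procedure.

```python
BASELINE = {"planner", "generator", "evaluator", "sprint_decomposition"}

def quality(components: set[str]) -> float:
    # Stand-in for an evaluation run; a fixed table for illustration.
    score = 0.5
    if "generator" in components:
        score += 0.30
    if "evaluator" in components:
        score += 0.15
    if "planner" in components:
        score += 0.05
    # "sprint_decomposition" adds nothing for this (stronger) model,
    # modeling a construct that has become unnecessary overhead.
    return score

def ablate(components: set[str], tolerance: float = 0.01) -> set[str]:
    # Remove one component at a time; keep the removal only if
    # measured quality drops by no more than the tolerance.
    kept = set(components)
    base = quality(kept)
    for c in sorted(components):
        trial = kept - {c}
        if base - quality(trial) <= tolerance:
            kept = trial
            base = quality(kept)
    return kept

minimal = ablate(BASELINE)
```

Running the loop drops only the component whose removal costs nothing, which is exactly the "inspect what was lost" discipline the heuristic calls for.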

Environment Design Lessons

  • Harness quality depends on how much of the product and runtime are directly legible to the agent.
  • Short routing documents plus indexed deeper documentation scale better than one giant instruction file.
  • Architecture boundaries, custom lints, and repo-local plans are part of the control system that keeps autonomous work coherent.
  • Hooks, commands, and reusable skills are also part of the environment layer because they can enforce repeated checks without making every prompt longer.
  • Model upgrades can require harness retuning too: stronger literal instruction following, better file-system memory, higher-resolution vision, and new effort or review controls all change how prompts, budgets, and verification loops should be set.
  • Harnesses can also be shipped as reusable repository bundles with scripts, skills, and plugin metadata, not just reconstructed from prose guidance.
  • Hosted meta-harnesses are a real design option too: some teams should own only the task contract and environment policy, while buying the loop, session durability, and tool runtime as managed primitives.
  • Team-facing managed-agent platforms add another environment layer beyond repo-local scaffolding: issue boards, daemon-attached runtimes, runtime routing, and assignable agent identities can all become part of the harness surface that coordinates long-running work.
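The routing-document idea can be sketched as a tiny topic index: a short always-loaded router maps topics to deeper documents the agent opens on demand. The file names and topics here are invented for illustration.

```python
# Short routing document: topic -> deeper doc, instead of one giant
# instruction file that is always in context.
ROUTER = {
    "testing": "docs/testing.md",
    "architecture": "docs/architecture.md",
    "release": "docs/release.md",
}

def route(task: str) -> list[str]:
    # The agent loads only the documents whose topics the task
    # touches, keeping the always-loaded instructions short.
    lowered = task.lower()
    return [path for topic, path in ROUTER.items() if topic in lowered]

docs = route("Update the architecture notes and testing guide")
```

Hooks and reusable skills fit the same layer: like the router, they move repeated guidance out of every prompt and into the environment.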

Interface Design Lessons

  • Stable interfaces can outlast one concrete harness implementation, just as operating-system abstractions outlast hardware generations.
  • Session durability should be modeled separately from the model's active context window, so recovery and context management do not collapse into one irreversible mechanism.
  • Decoupling the brain, hands, and session makes failure recovery, security boundaries, and scaling behavior easier to reason about than a single all-in-one container.
  • Treat credentials as structurally outside the sandbox where generated code runs; this is stronger than assuming the model will always respect narrower scopes.
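The credential point can be illustrated with a broker pattern, a hypothetical sketch rather than any particular platform's design: generated code running in the sandbox requests a capability by name, and the secret is resolved only on the broker's side, so there is no scope for the model to leak or misuse it.

```python
# Lives outside the sandbox; the value is a placeholder, not a real token.
SECRETS = {"github": "ghp_example_token"}

class Broker:
    """Proxies privileged calls so sandboxed code never holds raw secrets.
    This makes the boundary structural rather than a scope the model is
    trusted to respect."""

    def __init__(self, allowed: set[str]):
        self.allowed = allowed

    def call(self, capability: str, request: str) -> str:
        if capability not in self.allowed:
            raise PermissionError(f"{capability} not granted")
        _token = SECRETS[capability]  # resolved broker-side only
        return f"performed {request} (authenticated)"

broker = Broker(allowed={"github"})
result = broker.call("github", "open PR")
```

Because the broker sits outside the sandbox, swapping the model ("brain") or the execution container ("hands") leaves the security boundary untouched, which is the decoupling argument in the bullet above.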

When This Matters

  • Long-running coding tasks where coherence degrades over time.
  • Subjective domains such as design, where quality must be made gradable.
  • Product builds that need both ambitious planning and skeptical final verification.
  • Repositories where agents are expected to open, review, and merge changes with limited human intervention.
