Your Repo Needs an Agent Harness, Not More Prompt Paste

Outcome focus: Defined a repo documentation harness that separates human orientation, always-loaded agent rules, tool-specific compatibility files, on-demand skills, dynamic docs, and deterministic enforcement.

The repository is becoming the prompt.

Not the chat prompt. The durable prompt. The one that survives session resets, new agents, new teammates, context compaction, and the tenth time somebody asks an AI coding tool to "just fix the build."

That is why README.md, AGENTS.md, CLAUDE.md, SKILL.md, .agents/skills, .claude/skills, .claude/rules, MCP config, and llms.txt documentation indexes suddenly matter.

They are not random Markdown clutter.

They are the harness around agentic development.

But a harness can help or hurt. A clear one gives agents the right context at the right time. A messy one turns the repo into a pile of stale rules, duplicated instructions, over-triggered skills, and false confidence. The hot mistake right now is treating every new agent file as a magic control surface.

It is not magic. It is context engineering plus workflow design plus enforcement.

I read this repo against the current agent-doc ecosystem before writing this: AGENTS.md, OpenAI's Codex AGENTS.md guide, OpenAI Agent Skills, the Agent Skills spec, Claude Code's skills docs, Claude's CLAUDE.md guidance, Claude Code's overview and common workflows, Google's ADK notes on coding with AI and building ADK agents with skills, the openai/skills catalog, awesome-claude-skills, AgentSkills.io, the Claude Code cheat sheet, the Daily Dose of Data Science writeup on the .claude/ folder, community discussion around those resources, and Bibek Poudel's Medium post on the SKILL.md pattern.

The sources do not all agree in tone, and some community posts overstate what Markdown can enforce. That disagreement is useful. It makes the real architecture visible.

The Repo Audit

This site already has the right bones:

README.md explains the product, audience, site structure, content model, local workflow, analytics, release readiness, and repository pointers.
AGENTS.md gives repo-specific commands, npm-only policy, dev workflow, and PR workflow.
CLAUDE.md is a compatibility shim that points Claude Code at AGENTS.md.
.agents/skills/ contains reusable workflow skills: Netlify forms, property bootstrap, legal route audit, pre-launch verification, and feature-completeness verification.

That is better than most repos because it separates product documentation from agent workflow documentation from task playbooks.

The audit also found useful friction:

AGENTS.md is clean but thin. It names commands, but it does not yet encode editorial gates, content-post expectations, protected paths, or when to run npm run build.
README.md is strong for humans, but one command drifted: it lists npm run type-check, while package.json uses npm run typecheck.
CLAUDE.md says "update both together," but the file is a pointer, not a sync mechanism. Nothing enforces the update, so it is easy to forget.
.agents/skills/ is useful, but there is no root skill index explaining which skills exist, when to create new ones, and which skills are synced from a canonical workspace source.
The skills themselves are strong because they contain "when to use," "when not to use," procedure, verification, and relative references. That is the right shape.

The lesson is not "this repo is bad." The lesson is that agent docs become operational infrastructure. Small drift matters because the agent will execute the drift.
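
The type-check versus typecheck drift is the kind of thing a script can catch before an agent executes it. Here is a minimal sketch, assuming docs mention scripts in the form `npm run <name>` and that `package.json` has a standard `scripts` object. The function name `find_command_drift` and the sample inputs are hypothetical, not part of this repo.

```python
import json
import re

# Matches `npm run <script>`; assumes script names use letters, digits, ':', '_' or '-'.
NPM_RUN = re.compile(r"npm run ([A-Za-z0-9:_-]+)")


def find_command_drift(doc_text: str, package_json_text: str) -> list[str]:
    """Return script names a doc mentions that package.json does not define."""
    scripts = json.loads(package_json_text).get("scripts", {})
    # dict.fromkeys de-duplicates while keeping first-seen order.
    mentioned = dict.fromkeys(NPM_RUN.findall(doc_text))
    return [name for name in mentioned if name not in scripts]


if __name__ == "__main__":
    # Hypothetical inputs that mirror the drift found in the audit.
    readme = "Run `npm run lint` and `npm run type-check` before pushing."
    package_json = '{"scripts": {"lint": "eslint .", "typecheck": "tsc --noEmit"}}'
    print(find_command_drift(readme, package_json))  # prints ['type-check']
```

Run against README.md and AGENTS.md in CI, a non-empty result would fail the build. The regex is deliberately naive: it will miss commands written as `npm test` or invoked through another runner.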

The Mental Model

The source ecosystem points to one pattern:

The agent harness separates orientation, standing rules, tool compatibility, on-demand procedures, dynamic docs, and enforcement.

The file types should not all say the same thing.

Each layer has a job.

`README.md`: The Human-Agent Manifest

README.md is still for humans first.

The AGENTS.md site explicitly frames AGENTS.md as a complement to README, not a replacement. That is the right boundary. A README should help a human decide what the repo is, why it exists, how to run it, and where to learn more. It can also route agents toward deeper files.

A good README now has two audiences:

the human who needs product and setup context,
the agent that needs enough high-level context to choose the right docs.

For this repo, the README already does the human job well. It explains product intent, audiences, content model, privacy, local commands, quality checks, and release readiness.

The improvement I would make across repos is adding an "Agent Pointers" block:

README.md

## Agent Pointers
 
- Start with `AGENTS.md` for repo commands, guardrails, and verification.
- Use `.agents/skills/` for repeatable workflows.
- Do not treat README commands as authoritative if they conflict with `AGENTS.md`
  or `package.json`; verify scripts in `package.json`.
- For content changes, read `docs/editorial-style-guide.md` before editing posts.

That block keeps README human-readable while giving agents a routing map.

README should not carry every agent rule. If it does, humans stop reading it and agents get duplicate instructions.

`AGENTS.md`: The Always-Loaded Rule Layer

AGENTS.md is the repo's standing operating contract for coding agents.

The open AGENTS.md format is intentionally plain Markdown. It recommends setup commands, tests, code style, security considerations, and nested files for subprojects. OpenAI's Codex guide is more specific: Codex reads AGENTS.md before work, layers global and project instructions, walks from repo root to current directory, and lets closer files override earlier guidance.

That means root AGENTS.md should be short, factual, and high-confidence.

Do not put a full SOP in it. Put the durable rules that should apply to nearly every task.

For this repo, I would evolve the root file toward:

AGENTS.md

# AGENTS.md
 
## Purpose
 
Repo-specific operating guide for coding agents and contributors. Prefer this
file over README when deciding commands, checks, and repo-specific constraints.
 
## Canonical Commands
 
- Install: `npm ci`
- Dev server: `npm run dev`
- Lint: `npm run lint`
- Typecheck: `npm run typecheck`
- Content validation: `npm run content:check`
- Tests: `npm run test`
- Full gate: `npm run test-all`
- Production build: `npm run build`
 
## Content Work
 
- Posts live in `src/content/posts`.
- Before adding or editing posts, read `docs/editorial-style-guide.md`.
- Use `Mermaid` and `Callout` directly in MDX; no imports are needed.
- Run `npm run content:check` after content changes.
- Run `npm run test-all` and `npm run build` before calling a post ready.
 
## Scope Rules
 
- Make the smallest diff that satisfies the request.
- Do not edit generated files or `node_modules`.
- Do not change dependencies unless the task explicitly asks.
- Do not rewrite unrelated content while adding a post.
 
## Verification
 
- Report commands run and whether they passed.
- If a command cannot run, explain why and name the risk.

Nested AGENTS.md files belong only where the rules actually differ. A monorepo with apps/web, apps/mobile, packages/data, and infra may need nested files. A small repo probably does not.
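
To make the layering concrete, here is a rough sketch of a root-to-working-directory walk. It is not how Codex or any other tool is actually implemented; it only illustrates the ordering idea that closer files come later and can therefore override earlier guidance. The function names and the concatenation strategy are assumptions.

```python
from pathlib import Path


def collect_instruction_files(repo_root: Path, cwd: Path, name: str = "AGENTS.md") -> list[Path]:
    """List instruction files from the repo root down to cwd, root first."""
    repo_root = repo_root.resolve()
    # relative_to raises ValueError if cwd is outside the repo.
    relative = cwd.resolve().relative_to(repo_root)
    directories = [repo_root]
    for part in relative.parts:
        directories.append(directories[-1] / part)
    return [d / name for d in directories if (d / name).is_file()]


def merged_instructions(files: list[Path]) -> str:
    """Concatenate root-first so the closest guidance appears last (assumed merge rule)."""
    return "\n\n".join(f.read_text(encoding="utf-8").strip() for f in files)
```

For a monorepo, calling `collect_instruction_files(root, root / "apps" / "web" / "src")` would return the root file followed by `apps/web/AGENTS.md` if both exist. If the list always has one entry, the repo does not need nested files.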

The critical rule: AGENTS.md should be boring. If it feels like a motivational poster, it is probably not helping the agent.

`CLAUDE.md`: Compatibility Shim or Claude-Specific Surface

Claude Code has its own memory and instruction files.

Anthropic's CLAUDE.md post says the file provides project-specific context and can live at repo, parent, or home scope. It recommends keeping the file concise, human-readable, and free of secrets because it becomes part of Claude's context. Claude Code's own product page emphasizes that it runs in the terminal, works with existing CLI tools, and asks permission before file changes or commands.

There are two good ways to use CLAUDE.md in a repo that already has AGENTS.md.

Option one: shim.

CLAUDE.md

# Project Instructions
 
Read `./AGENTS.md` first. Treat it as the canonical repository instruction file.
 
Claude-specific notes:
- Use Plan Mode for multi-file refactors.
- Use `.claude/skills/` only for project-specific Claude skills.
- Do not add secrets or personal preferences to this committed file.

Option two: tool-specific extension.

Use it only for Claude Code behavior that AGENTS.md cannot express: Plan Mode expectations, Claude-specific skill paths, project MCP servers, permission notes, or local .claude/rules.

The HN thread on .claude/ files surfaced a healthy warning: Markdown instructions increase the likelihood of behavior; they do not guarantee it. One commenter put the skepticism bluntly: if a rule matters, enforce it with code, permissions, hooks, or review.

That is the right posture.

CLAUDE.md is not a policy engine. It is context.

`.claude/rules` and Tool-Specific Rules

Path-specific rules are useful when a directory has real local constraints.

Examples:

.claude/

.claude/
  rules/
    content-posts.md
    analytics-consent.md
    legal-routes.md
  settings.local.json

For this site:

.claude/rules/content-posts.md

---
paths:
  - "src/content/posts/**/*.mdx"
---
 
# Content Post Rules
 
- Read `docs/editorial-style-guide.md` before editing posts.
- Use frontmatter values defined in `src/types/content.ts`.
- Every `essay` needs scenario, tradeoff, failure/mistake, and artifact.
- Run `npm run content:check` after editing posts.

This is better than stuffing every content rule into root CLAUDE.md or root AGENTS.md. The rule loads where it is relevant.
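
A sketch of what "loads where it is relevant" could mean mechanically. The glob semantics here are an assumption (`**/` spans zero or more directories, `*` stays inside one path segment); the matcher a given tool actually uses may differ, so treat this as an illustration of path scoping rather than Claude Code's behavior. The `legal-routes.md` pattern is hypothetical.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a small glob subset into a regex (assumed semantics, see lead-in)."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append(r"(?:[^/]+/)*")  # zero or more whole directories
            i += 3
        elif pattern.startswith("**", i):
            out.append(r".*")  # trailing ** matches anything below
            i += 2
        elif pattern[i] == "*":
            out.append(r"[^/]*")  # single * never crosses a slash
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out))


def rules_for_path(path: str, rules: dict[str, list[str]]) -> list[str]:
    """Return the rule files whose path patterns match the file being edited."""
    return [
        rule_file
        for rule_file, patterns in rules.items()
        if any(glob_to_regex(p).fullmatch(path) for p in patterns)
    ]


RULES = {
    "content-posts.md": ["src/content/posts/**/*.mdx"],
    "legal-routes.md": ["src/pages/legal/**"],  # hypothetical pattern
}
```

With this table, editing `src/content/posts/2024/harness.mdx` selects only `content-posts.md`, and editing `README.md` selects nothing.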

GitHub Copilot, Cursor, Amazon Q, Gemini CLI, and other tools have their own rule surfaces. The cross-tool pattern is the same: keep the canonical repo contract in AGENTS.md, then use tool-specific files only for tool-specific loading behavior.

`SKILL.md`: The Action Layer

Skills are not general project memory.

Skills are reusable procedures.

The Agent Skills overview defines a skill as a folder with a required SKILL.md file and optional scripts/, references/, and assets/. The specification requires YAML frontmatter with name and description, recommends focused body instructions, and encourages references for longer material. OpenAI's Codex skills docs say skills package instructions, resources, and optional scripts; Codex starts with name, description, and file path, then loads the full SKILL.md only when needed.

Claude Code's skills docs make the same practical point: unlike always-loaded CLAUDE.md, the body of a skill loads only when the skill is used. Claude also adds product-specific frontmatter such as disable-model-invocation, context: fork, model/effort controls, and tool pre-approval.

The important architecture is progressive disclosure:

L1: skill metadata,
L2: SKILL.md instructions,
L3: referenced files, scripts, and assets.

Google's ADK skills guide makes the token argument clearly: don't cram every checklist, API reference, and procedure into one monolithic system prompt. Load domain expertise only when the task calls for it.

That is the core reason skills matter.
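
Progressive disclosure is easier to see in code. The sketch below builds a level-one index from frontmatter only and reads the body on demand. It is an illustration, not any tool's real loader. The frontmatter reader is a deliberately tiny stand-in for a YAML parser (it folds block scalars into one line), and the function names are hypothetical.

```python
from pathlib import Path


def split_frontmatter(text: str) -> tuple[str, str]:
    """Split '---' delimited frontmatter from the Markdown body."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return "\n".join(lines[1:i]), "\n".join(lines[i + 1:])
    return "", text


def read_top_level_fields(frontmatter: str) -> dict[str, str]:
    """Simplified reader: top-level scalars and indented continuation lines only."""
    fields: dict[str, str] = {}
    key = None
    for line in frontmatter.splitlines():
        if line and not line[0].isspace() and ":" in line:
            key, _, value = line.partition(":")
            key, value = key.strip(), value.strip()
            fields[key] = "" if value in ("|", ">") else value.strip('"')
        elif key is not None and line.strip():
            # Indented line: fold it into the current key's value.
            fields[key] = (fields[key] + " " + line.strip()).strip()
    return fields


def build_index(skills_dir: Path) -> list[dict[str, str]]:
    """Level 1: keep only name, description, and path for every skill."""
    index = []
    for skill_file in sorted(skills_dir.glob("*/SKILL.md")):
        frontmatter, _ = split_frontmatter(skill_file.read_text(encoding="utf-8"))
        fields = read_top_level_fields(frontmatter)
        index.append({
            "name": fields.get("name", skill_file.parent.name),
            "description": fields.get("description", ""),
            "path": str(skill_file),
        })
    return index


def load_body(entry: dict[str, str]) -> str:
    """Level 2: read the full instructions only once the skill is selected."""
    _, body = split_frontmatter(Path(entry["path"]).read_text(encoding="utf-8"))
    return body
```

The point of the split is cost: the index stays a few lines per skill no matter how long each body grows, and level-three files are never read unless the body points to them.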

The Description Is the Trigger

Bibek Poudel's SKILL.md pattern gets one thing very right: when skills fail to trigger, the problem is often the description, not the body.

The description is not marketing copy. It is the matching surface.

Bad:

description: Helps with data.

Better:

description: |
  Use this skill when adding or changing BigQuery-backed feature tables,
  Vertex AI Feature Store feature views, point-in-time training exports, or
  online feature serving contracts. Produces source contracts, SQL checks,
  freshness rules, and model-serving verification steps.

The current repo skills already follow this pattern well. For example, add-netlify-form names concrete triggers such as "add a contact form," "wire up a Netlify form," and "add a waitlist form." That is good agent design.

My criticism is of vague role skills:

name: senior-engineer
description: Think like a senior engineer.

That is not a skill. That is a costume.

A real skill should answer:

when does it trigger,
what input does it need,
what steps does it run,
what output does it produce,
how is the result verified,
what is out of scope.

A Skill Template Worth Reusing

This is the skeleton I would use across repos:

.agents/skills/example-task/SKILL.md

---
name: example-task
description: |
  Does one specific repeatable workflow. Use when the user asks to "trigger
  phrase one", "trigger phrase two", or "trigger phrase three". Produces a
  named artifact and verification report. Do not use for adjacent workflow X.
license: MIT
metadata:
  owner: platform
  version: "1.0.0"
---
 
# example-task
 
## When to use
 
- concrete fit 1
- concrete fit 2
 
## When NOT to use
 
- adjacent task that looks similar but has different risk
- production action that requires human approval
 
## Before you start
 
Read:
 
- [rules/contract.md](rules/contract.md)
- [references/examples.md](references/examples.md)
 
Verify:
 
- required file exists
- required command is available
- working tree state is understood
 
## Procedure
 
### 1. Inspect current state
 
Use file reads and search. Do not assume from chat context.
 
### 2. Make the smallest valid change
 
Follow the local pattern. Do not introduce a new abstraction unless needed.
 
### 3. Verify
 
Run the narrowest check first, then the required gate.
 
## Reporting format
 
Return:
 
- files changed,
- checks run,
- pass/fail result,
- residual risks.

Keep long criteria in references/. Put deterministic scripts in scripts/. Put templates in assets/.

The Agent Skills spec recommends keeping SKILL.md under 500 lines and loading resources only as needed. Treat that as a design constraint, not trivia.
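
A constraint is only real if something checks it. Below is a minimal lint sketch, assuming each skill is a folder with a `SKILL.md` and that references are written as relative Markdown links, as in the template above. The 500-line budget comes from the recommendation cited here; the function name and message formats are hypothetical.

```python
import re
from pathlib import Path

MAX_LINES = 500  # line budget cited above, treated here as a hard limit

# Captures the target of [label](target), ignoring an optional #anchor.
LINK = re.compile(r"\[[^\]]*\]\(([^)#\s]+)(?:#[^)\s]*)?\)")


def lint_skill(skill_dir: Path) -> list[str]:
    """Return human-readable problems for one skill folder; empty means clean."""
    skill_file = skill_dir / "SKILL.md"
    if not skill_file.is_file():
        return [f"{skill_dir.name}: missing SKILL.md"]

    problems = []
    text = skill_file.read_text(encoding="utf-8")
    line_count = len(text.splitlines())
    if line_count > MAX_LINES:
        problems.append(f"{skill_dir.name}: SKILL.md has {line_count} lines (limit {MAX_LINES})")

    for target in LINK.findall(text):
        if ":" in target or target.startswith("/"):
            continue  # external URLs and absolute paths are out of scope
        if not (skill_dir / target).exists():
            problems.append(f"{skill_dir.name}: broken reference {target}")
    return problems
```

Looping this over `.agents/skills/*` in CI turns "keep it short and keep references valid" from advice into a gate.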

`allowed-tools` Is Not a Sandbox

This is one of the places where a critical read matters.

The Agent Skills spec describes allowed-tools as experimental and related to pre-approved tools. OpenAI's docs describe optional metadata and tool dependencies for smoother invocation. Claude Code's docs are explicit: allowed-tools grants permission for listed tools while a skill is active, but it does not restrict every other tool by itself; deny rules and permission settings still matter.

Some community posts describe allowed-tools as if it hard-restricts tool access. Do not build a security model on that assumption.

If a workflow must be read-only, enforce it with:

permission settings,
sandbox mode,
hooks,
CI checks,
read-only credentials,
branch protection,
human approval,
deterministic scripts that fail unsafe output.

Markdown can guide. It cannot secure.
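
As one example of a deterministic script that fails unsafe output, here is a sketch of a post-run check for a workflow that is supposed to be read-only. It assumes the run starts from a clean checkout and relies on `git status --porcelain`, whose short-format lines begin with a two-character status followed by a space. The function names are hypothetical, and quoted or unusual paths are not handled.

```python
import subprocess
import sys


def changed_paths(porcelain: str) -> list[str]:
    """Extract paths from `git status --porcelain` (v1 short format) output."""
    paths = []
    for line in porcelain.splitlines():
        if len(line) < 4:
            continue
        entry = line[3:]  # skip the two status characters and the space
        if " -> " in entry:
            entry = entry.split(" -> ", 1)[1]  # renames: keep the new path
        paths.append(entry)
    return paths


def assert_read_only(repo_dir: str = ".") -> int:
    """Return 1 if the working tree changed, 0 if it is still clean."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    dirty = changed_paths(result.stdout)
    for path in dirty:
        print(f"unexpected change: {path}", file=sys.stderr)
    return 1 if dirty else 0
```

A CI step or hook could call `sys.exit(assert_read_only())` after the agent finishes. This detects writes after the fact; it does not prevent them, so it complements permissions and sandboxing rather than replacing them.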

MCP and `llms.txt`: Dynamic Context Belongs Outside the Prompt

Static repo files are not enough for fast-moving platforms.

Google's ADK coding with AI guide points developers to development skills, an ADK docs MCP server, and machine-readable docs following the llms.txt standard. The ADK blog shows SkillToolset patterns where skills load references and resources on demand.

This is the correct split:

AGENTS.md says which docs or tools are authoritative.
A skill says when to use them.
MCP or llms.txt exposes current, machine-readable documentation.
The agent pulls details only when the task requires them.

For Google Cloud, Vertex AI, Gemini, ADK, Android, or fast-moving frontend frameworks, this matters. A stale static note can be worse than no note because it gives the agent confidence with old facts.

Use static files for local rules. Use dynamic docs for changing external APIs.

Community Lists Are Discovery, Not Installation Policy

The openai/skills catalog and awesome-claude-skills are useful for studying patterns. They are not an excuse to install everything.

The community discussion around the Claude Code cheat sheet is a good cautionary example. A generated cheat sheet can be useful and still contain errors, drift, or missing flags. The value is in curation and verification, not in the fact that an agent produced a polished artifact.

The same applies to skills.

Before installing or copying a community skill:

skill-import-review.md

# Skill Import Review
 
- What problem does this skill solve in our repo?
- Does the description trigger only on the right tasks?
- Does it ask for tools or permissions we do not want?
- Are scripts readable and deterministic?
- Are references current?
- Does it mention secrets, credentials, or external services?
- Does it overlap with an existing repo skill?
- What prompt will we use to test it?
- Who owns updates?

Do not let an awesome-list become a supply chain for agent behavior.

Applied Architecture for This Repo Family

For the dev-sideprojects portfolio, I would standardize this shape:

repo-agent-harness

repo/
  README.md
  AGENTS.md
  CLAUDE.md
  docs/
    content-plan.md
    editorial-style-guide.md
    sops/
  .agents/
    skills/
      add-netlify-form/
        SKILL.md
        references/
        rules/
      pre-launch-verify/
        SKILL.md
        rules/
  .claude/
    rules/
      content-posts.md
      legal-routes.md
    settings.local.json

For other repo types:

Native iOS / on-device ML:

AGENTS.md should name Xcode build commands, no-sign validation, simulator limitations, SwiftUI state rules, MLX/native runtime boundaries, and files agents must not touch.
A skill should exist for "verify local inference behavior" only if it includes device/simulator constraints and exact validation commands.
Do not let a generic Python ML skill touch Swift/Metal/MLX boundaries.

Data engineering:

AGENTS.md should name warehouse commands, migration rules, data-contract locations, environment separation, and forbidden production actions.
Skills should exist for repeated workflows: data contract change, BigQuery policy tag review, Dataform release validation, public-data ingestion, feature table creation.
Reference files should hold source-specific schemas, boundary mappings, and metric definitions.

Frontend / public site:

AGENTS.md should name package manager, test/build gates, design constraints, content rules, analytics consent rules, and legal page rules.
Skills should cover Netlify forms, launch verification, legal route audit, content publishing, and design implementation only when those workflows repeat.

The mistake is mixing all of those in one mega-file. Domain isolation is how the agent stays useful.

The Enforcement Ladder

Agent Markdown sits low on the enforcement ladder. From weakest to strongest:

Vibe request
Prompt instruction
README guidance
AGENTS.md / CLAUDE.md standing context
SKILL.md task procedure
Scripted checks
Hooks and permissions
CI and branch protection
Runtime access controls
Human approval

Use the lowest layer that is sufficient.

For tone and workflow, Markdown is fine.

For "run npm run test-all before calling this done," Markdown plus agent discipline is often enough.

For "never commit secrets," you need secret scanning.

For "never deploy production without approval," you need IAM, CI approvals, environment protection, or platform controls.

For "never drop a production table," you need permissions and change management.

Markdown should describe the guardrail. It should not be the only guardrail.
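
To show what the "scripted checks" rung looks like next to a Markdown rule, here is a toy secret scan. It is a sketch under narrow assumptions: it knows only two patterns (AWS access key IDs, which start with `AKIA`, and PEM private key headers), and a real secret scanner covers far more. The names `PATTERNS` and `scan_text` are hypothetical.

```python
import re

# Two well-known shapes only; a production scanner needs many more rules.
PATTERNS = {
    "AWS access key ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private key block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def scan_text(text: str) -> list[tuple[int, str]]:
    """Return (line number, finding label) pairs for suspicious lines."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((number, label))
    return findings
```

The Markdown rule says "never commit secrets"; a check like this, run in a pre-commit hook or CI, is what makes a violation fail instead of merely being discouraged.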

What I Would Change Moving Forward

For every repo, I would add a small harness review to the bootstrap checklist:

agent-harness-review.md

# Agent Harness Review
 
## README.md
- Describes the product or system for humans.
- Points agents to `AGENTS.md`.
- Does not duplicate every workflow rule.
- Commands match package scripts or Make targets.
 
## AGENTS.md
- Names canonical commands.
- Names protected paths and forbidden side effects.
- Names verification gates.
- Stays under a small, readable size.
- Uses nested files only when rules differ by directory.
 
## Tool-Specific Files
- `CLAUDE.md` is either a shim or a Claude-specific extension.
- `.claude/rules` contains path-specific rules, not generic advice.
- Other tools map back to the canonical repo contract.
 
## Skills
- Each skill has one job.
- `description` contains real trigger phrases.
- Long material lives in `references/`.
- Deterministic work lives in `scripts/`.
- The skill has a verification section.
- Imported skills are reviewed before use.
 
## Enforcement
- Critical safety rules are backed by permissions, hooks, CI, or review.
- Markdown-only rules are reserved for guidance, not hard controls.

That checklist is boring on purpose. It catches the failures that cause agents to waste time: stale commands, vague triggers, overstuffed context, mismatched tool files, and rules with no enforcement.

The Sharpest Take

The repo is no longer just source code.

It is a machine-readable operating environment for humans and agents working together.

README.md tells a person where they are. AGENTS.md tells an agent how to behave. CLAUDE.md and tool-specific files adapt the contract to a client. SKILL.md turns repeated work into load-on-demand procedures. MCP and llms.txt bring live documentation into reach. Scripts, hooks, CI, permissions, and reviews enforce what Markdown cannot.

The hottest version of this trend is not "install 50 skills."

It is building a repo harness that makes good agent behavior cheaper than bad agent behavior.

Start with one root AGENTS.md. Keep it small. Add skills only for workflows you repeat. Put details behind progressive disclosure. Verify every imported skill. Back dangerous rules with real enforcement.

The agent will still make mistakes.

But the mistakes will be smaller, easier to detect, and less likely to become the architecture.
