Sandboxed Agents and the Production Automation Boundary

Outcome focus: Framed sandboxed agent execution as an architecture boundary for safer, stateful, long-running automation instead of another demo-layer SDK feature.

The important part of OpenAI's April 15, 2026 Agents SDK update is not that agents can run more tools.

That is the surface reading.

The deeper shift is that the SDK is starting to formalize the execution environment around the agent. Files, commands, packages, ports, snapshots, resumable state, mounted data, memory, and sandbox providers are not side concerns anymore. They are part of the application architecture.

That matters because serious agents do not live inside a prompt.

They inspect directories. They compare files. They run code. They modify artifacts. They install dependencies. They produce reports, patches, test results, screenshots, summaries, and machine-readable outputs that another system needs to inspect. They pause for human review. They resume after interruption. They carry work across multiple task steps. They operate near credentials, sensitive data, enterprise systems, compliance boundaries, and users who expect the work to survive failure.

That is not a chatbot problem. It is a systems problem.

For a while, teams could hide that complexity behind demos. The agent could receive enough context to look smart for one interaction. The tool call could be mocked. The workspace could be improvised. The state could be stored in a loose database row. The shell could be wired through a custom container. The recovery story could be "rerun it and hope."

That breaks down fast in production.

The updated Agents SDK is interesting because it gives developers a more explicit vocabulary for the boundary between the model, the harness, and the compute environment. It gives the agent somewhere controlled to work. It gives the application a way to describe what that workspace should contain. It gives long-running work a better chance of being resumed instead of abandoned. It also makes the security conversation more concrete, because model-directed code execution should not be sharing a trust boundary with the rest of the application.

That is the piece worth paying attention to.

Agents need a workspace

Many agent failures start with an unrealistic assumption: that the model can do the work from context alone.

Sometimes it can. A short answer, a classification, a rewrite, or a structured extraction may not need more than the request and a few tool calls. In those cases, a full sandbox may be extra machinery.

But long-horizon automation is different.

If the agent needs to inspect a repo, reconcile documents, generate files, run tests, open a preview, debug an error, or continue a previous task, it needs a workspace it can inspect and change. The workspace is not just storage. It is part of the reasoning surface.

This is obvious in coding agents, but it applies beyond code.

A contract review agent may need a directory of PDFs, a clause library, a risk rubric, and an output folder for redlines. A data operations agent may need a mounted export, a profiling script, a schema map, and a report template. A support intelligence agent may need CSV files, notebook dependencies, taxonomy instructions, and generated JSONL for a downstream triage system. A marketing operations agent may need CRM extracts, campaign metadata, and a dashboard spec.

Pasting all of that into context is the wrong abstraction.

The better abstraction is a controlled computer environment where the model can navigate files, run commands, and produce artifacts under constraints. OpenAI's sandbox agents docs describe this as an isolated Unix-like execution environment with a filesystem, shell, installed packages, mounted data, exposed ports, snapshots, and controlled access to external systems.

That changes the engineering posture. The agent is not just responding. It is operating.

The harness and compute boundary

The most important design idea in the release is the split between harness and compute.

The harness is the control plane around the agent. It owns the agent loop, model calls, tool routing, handoffs, approvals, tracing, recovery, and run state. Compute is the sandbox execution plane where the agent's work happens against files, commands, packages, storage mounts, exposed ports, and snapshots.

That boundary is not academic. It is where production safety starts.

If the harness and compute are blurred together, the application has a harder time controlling risk. Credentials can leak into the same environment where model-generated code executes. Audit trails can become tangled with temporary workspace state. Recovery can depend on whether one container survives. Scaling becomes awkward because every run carries too much responsibility in one place.

Separating them gives the application room to be deliberate.

The trusted side can keep authentication, billing, policy decisions, approvals, audit logs, and orchestration state. The sandbox side can run narrow, task-specific work with the least data and credentials needed. If the sandbox expires or fails, the run should not become unrecoverable just because the filesystem vanished.
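One way to picture the "least data and credentials needed" rule is a per-task allowlist that the harness applies before anything reaches the sandbox. This is a minimal sketch, not the Agents SDK API: `Credential`, `HARNESS_VAULT`, `TASK_ALLOWLIST`, `sandbox_env_for`, and every credential and task name are hypothetical, and the values are placeholders.

```python
from dataclasses import dataclass

# Hypothetical names throughout; this is not the Agents SDK API.


@dataclass(frozen=True)
class Credential:
    name: str
    value: str
    scope: str  # human-readable scope label, e.g. "read:packages"


# Secrets held on the trusted harness side. Values are placeholders.
HARNESS_VAULT = {
    "BILLING_API_KEY": Credential("BILLING_API_KEY", "placeholder-billing", "admin:billing"),
    "REGISTRY_TOKEN": Credential("REGISTRY_TOKEN", "placeholder-registry", "read:packages"),
    "EXPORT_BUCKET_TOKEN": Credential("EXPORT_BUCKET_TOKEN", "placeholder-bucket", "read:exports"),
}

# Per-task allowlist: the only credential names each task type may receive.
TASK_ALLOWLIST = {
    "dependency_audit": {"REGISTRY_TOKEN"},
    "export_profiling": {"EXPORT_BUCKET_TOKEN"},
}


def sandbox_env_for(task_type: str) -> dict[str, str]:
    """Return only the environment variables this task type is allowed to see."""
    allowed = TASK_ALLOWLIST.get(task_type, set())  # unknown task types get nothing
    return {name: HARNESS_VAULT[name].value for name in sorted(allowed)}


if __name__ == "__main__":
    print(sandbox_env_for("dependency_audit"))
```

The design choice is that the default is empty. A task type nobody has reviewed receives no credentials, and the billing key never has a path into the sandbox at all.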

This is also where enterprise architecture starts to look less like "let the model use a tool" and more like a real execution design.

Which credentials belong in the sandbox? Which data should be mounted read-only? Which outputs should be persisted? Which commands should be allowed? Which actions need approval? Which artifacts can another system trust? Which state belongs to the run, and which state belongs to reusable memory?

Those questions are more valuable than a bigger prompt.

Manifests make the workspace explicit

One of the cleaner ideas in the sandbox design is the manifest.

A manifest describes the starting contents and layout of a fresh sandbox workspace. It can define files, directories, repositories, mounted storage, output directories, environment variables, users, and groups. The key is that the manifest is a contract for the workspace, not a vague pile of setup logic hidden somewhere in orchestration code.

That is useful because agents are sensitive to environment shape.

If the input files are in different locations across runs, instructions get brittle. If output locations are unclear, the application has to search for artifacts. If mounted data is too broad, the agent sees more than it needs. If task instructions live only in prompt text, the work is harder to inspect, reproduce, or hand off.

A good manifest makes the workspace legible:

Put source repos, input artifacts, and output directories where the agent expects them.
Keep task specs and local instructions in workspace files when they are part of the work.
Use relative paths in instructions so the setup is portable.
Mount only the storage the agent should read or write.
Treat secrets as runtime configuration, not prompt content.

That last point deserves its own weight.

Secrets should not be taught to the model in instructions. They should not be committed into manifests. They should not appear in generated artifacts. The agent may need access to package registries, storage mounts, or provider APIs, but the application should inject those credentials through the runtime or sandbox provider with narrow scope.

This is one of those areas where the architecture expresses the security model. If the only way to make an agent useful is to paste sensitive information into its instructions, the design is already in trouble.
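The workspace contract and the secrets rule above can be checked before a sandbox ever starts. The sketch below is an illustration only: `WorkspaceManifest`, `Mount`, `validate`, and the `SECRET_HINTS` heuristic are hypothetical and do not reflect the SDK's actual manifest schema.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath

# Hypothetical structure for illustration; not the SDK's manifest schema.


@dataclass
class Mount:
    path: str               # location inside the workspace
    read_only: bool = True  # mounts default to read-only


@dataclass
class WorkspaceManifest:
    inputs: list[str]
    output_dir: str
    mounts: list[Mount] = field(default_factory=list)
    env: dict[str, str] = field(default_factory=dict)


# Crude heuristic: variable names that usually carry secrets.
SECRET_HINTS = ("KEY", "TOKEN", "SECRET", "PASSWORD")


def validate(manifest: WorkspaceManifest) -> list[str]:
    """Return a list of problems; an empty list means the manifest passes."""
    problems = []
    paths = manifest.inputs + [manifest.output_dir] + [m.path for m in manifest.mounts]
    for p in paths:
        pure = PurePosixPath(p)
        if pure.is_absolute() or ".." in pure.parts:
            problems.append(f"path must be relative and stay inside the workspace: {p}")
    for name in manifest.env:
        if any(hint in name.upper() for hint in SECRET_HINTS):
            problems.append(f"secret-looking variable belongs in runtime config: {name}")
    return problems


if __name__ == "__main__":
    manifest = WorkspaceManifest(
        inputs=["inputs/contracts", "specs/task.md"],
        output_dir="outputs/redlines",
        mounts=[Mount("data/clause_library")],
        env={"REPORT_LOCALE": "en-US"},
    )
    print(validate(manifest))
```

A name-based check like this will miss secrets with innocent names, so it is a guardrail rather than a guarantee. The point is that the manifest is data the application can lint, diff, and review.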

State is more than chat history

Long-running agents need state, but state is easy to underspecify.

There is conversation history. There is workspace state. There is run state. There is memory. There are generated artifacts. There are decisions made by human reviewers. There are traces and tool outputs. There may be snapshots of the environment at different points in time.

Those are not the same thing.

Conversation history helps the model understand what has already been said. Workspace state preserves the files, directories, and generated outputs that the agent was working with. Run state helps the harness resume the agent loop. Memory carries reusable guidance, preferences, corrections, or lessons across runs.

If a system mixes these together, it becomes hard to reason about recovery.

Imagine an agent building a report from a mounted data room. It writes intermediate notes, runs a script, generates a chart, pauses for review, then resumes after the reviewer asks for a section to be revised. The next run needs the prior conversation, but it also needs the workspace files and the current output artifact. It may need the sandbox session state or a snapshot. It probably does not need every token from every earlier model step. It should not rely on a fragile replay of the entire task from the beginning.

This is why resumability matters.

The docs describe several paths: reuse a live sandbox session, resume from stored run state, resume from serialized sandbox session state, or create a fresh session from a manifest. That gives developers a more durable way to continue work when a run pauses, fails, or moves to a new environment.
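Those four paths can be treated as an explicit decision rather than something that falls out of whichever object happens to still exist. The sketch below assumes a priority order (cheapest and most complete first) that is my own choice, not something the docs prescribe. `RunContext` and `choose_resume_path` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical types; the priority order below is an assumption, not SDK behavior.


@dataclass
class RunContext:
    live_session_id: Optional[str] = None       # sandbox still running
    stored_run_state: Optional[dict] = None     # harness-side run state
    serialized_session: Optional[bytes] = None  # saved sandbox session state
    manifest: Optional[dict] = None             # recipe for a fresh workspace


def choose_resume_path(ctx: RunContext) -> str:
    """Pick the first available way to continue a run, in a fixed order."""
    if ctx.live_session_id is not None:
        return "reuse_live_session"
    if ctx.stored_run_state is not None:
        return "resume_from_run_state"
    if ctx.serialized_session is not None:
        return "resume_from_serialized_session"
    if ctx.manifest is not None:
        return "fresh_session_from_manifest"
    raise RuntimeError("nothing to resume from: no session, state, or manifest")


if __name__ == "__main__":
    # The container is gone, but the harness kept run state and the manifest.
    ctx = RunContext(stored_run_state={"step": 4}, manifest={"inputs": ["data"]})
    print(choose_resume_path(ctx))
```

Writing the fallback chain down also gives a test target: delete the live session in a test and confirm the run still lands on a working path.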

The product implication is simple. An agent that cannot resume safely is not ready for work that matters.

Memory should be scoped

Memory is powerful, but it is also a place where careless systems accumulate confusion.

The updated sandbox docs distinguish sandbox memory from SDK-managed conversational session memory. That distinction is useful. Sessions preserve message history. Sandbox memory can distill reusable lessons from prior workspace runs into files the agent can read later.

That sounds small, but it affects product behavior.

An agent may need to remember that a repo uses a specific test command, that a team prefers a certain report format, that a data export has known column quirks, or that a previous run found a recurring failure pattern. It does not need to carry forward every detail from every run. It does not need to turn temporary mistakes into permanent assumptions.

Good memory design should answer a few questions:

What should become reusable guidance?
Who can inspect or edit that memory?
When should memory expire?
How does the agent distinguish a durable preference from a one-time instruction?
What happens when memory conflicts with the current task?

Stateful agents feel impressive when they remember. They become risky when they remember the wrong things without a correction path.
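The questions above map onto a small data model: a scope, a durability flag, an expiry, and a rule for conflicts. This is a sketch under my own assumptions, not the SDK's memory format. `MemoryEntry`, `resolve`, and the rule that the current task always wins are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical model; not the SDK's memory format.


@dataclass
class MemoryEntry:
    key: str
    value: str
    scope: str                        # e.g. "repo:billing-service"
    durable: bool                     # False for one-time instructions
    created_at: datetime
    ttl: Optional[timedelta] = None   # None means the entry does not expire

    def is_live(self, now: datetime) -> bool:
        return self.ttl is None or now < self.created_at + self.ttl


def resolve(
    key: str,
    scope: str,
    task_instructions: dict[str, str],
    memory: list[MemoryEntry],
    now: datetime,
) -> Optional[str]:
    """Return the value to use for key, or None if nothing applies."""
    # Assumed conflict rule: the current task overrides remembered guidance.
    if key in task_instructions:
        return task_instructions[key]
    for entry in memory:
        if entry.key == key and entry.scope == scope and entry.durable and entry.is_live(now):
            return entry.value
    return None


if __name__ == "__main__":
    memory = [
        MemoryEntry("test_command", "make test", "repo:billing-service", True, datetime(2026, 4, 1)),
    ]
    print(resolve("test_command", "repo:billing-service", {}, memory, datetime(2026, 4, 20)))
```

Because entries are plain records, a person can list them, edit them, or delete them, which is the correction path that remembered mistakes need.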

Sandboxes do not remove evaluation

The release makes agents more capable. It does not make them automatically reliable.

That distinction matters.

A sandbox gives the agent a controlled place to work. It can reduce custom infrastructure. It can make runs more portable across local and hosted providers. It can help isolate execution. It can make state and recovery more explicit.

But the agent can still misunderstand a task. It can still edit the wrong file. It can still overuse a tool. It can still produce an artifact that looks plausible and fails a downstream contract. It can still ignore an instruction, mishandle a conflict, or optimize for completing the task instead of preserving system safety.

Sandboxing is an execution boundary, not a quality guarantee.

Production agent systems still need evaluation. They need representative task suites, trace review, failure taxonomies, artifact checks, permission tests, red-team prompts, latency budgets, cost controls, and release gates. For workflows that act on real data or write back to systems, they need human review at the right points.

The main benefit is that evaluation can now observe more of the real work.

Instead of testing only model text, teams can test filesystem changes, generated reports, command outputs, snapshots, retries, permissions, and recovery behavior. That is closer to the actual system. It also exposes failure modes that are invisible in prompt-only evals.

If an agent is supposed to produce a CSV, run a validation script, and summarize the result, the evaluation should check the CSV, the script output, and the summary. If an agent is supposed to edit a repo, the evaluation should inspect the diff and run tests. If an agent is supposed to work with a mounted data room, the evaluation should verify citations, file access boundaries, and output location.

The sandbox makes that kind of evaluation more natural.
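For the CSV case, an artifact check can be a few lines of deterministic code that runs on the file the agent wrote, independent of what the agent said about it. The sketch below uses only the Python standard library. The column names and allowed priority values are hypothetical stand-ins for whatever the downstream contract requires.

```python
import csv
import io

# Hypothetical downstream contract for a generated triage file.
REQUIRED_COLUMNS = ["ticket_id", "category", "priority"]
ALLOWED_PRIORITIES = {"low", "medium", "high"}


def check_csv_artifact(text: str) -> list[str]:
    """Return contract violations found in the CSV text; empty means it passes."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    missing = [c for c in REQUIRED_COLUMNS if c not in header]
    if missing:
        # Without the right columns, row checks are meaningless.
        return [f"missing columns: {', '.join(missing)}"]
    failures = []
    rows = 0
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        rows += 1
        if not row["ticket_id"]:
            failures.append(f"line {line_no}: empty ticket_id")
        if row["priority"] not in ALLOWED_PRIORITIES:
            failures.append(f"line {line_no}: unexpected priority {row['priority']!r}")
    if rows == 0:
        failures.append("no data rows")
    return failures


if __name__ == "__main__":
    artifact = "ticket_id,category,priority\nT-1,billing,high\n,login,urgent\n"
    for failure in check_csv_artifact(artifact):
        print(failure)
```

A check like this belongs in the release gate as well as the eval suite, since it is the same question in both places: can another system trust this file?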

The failure modes move

When agents become more capable, the failure modes move from answer quality into system design.

The first failure mode is overbroad access. The agent receives more files, mounts, credentials, or commands than the task requires. That creates unnecessary risk and makes the trace harder to reason about.

The second is unclear workspace contracts. The agent does not know where inputs live, where outputs belong, or which files are authoritative. The result is wandering behavior and brittle prompts.

The third is fake resumability. The product claims long-running support, but the run can only resume if the same container stays alive and no intermediate state is lost.

The fourth is memory pollution. Lessons from one task leak into another, or temporary user preferences become persistent behavior without review.

The fifth is orchestration sprawl. Teams split work across many agents too early, adding handoffs, prompts, traces, and approval surfaces without improving reliability. OpenAI's orchestration docs make a useful point here: start with one agent when possible, and add specialists only when a different contract is truly needed.

The sixth is tool optimism. Developers assume that because the agent can run commands, it should. A shell is powerful, but it is not a substitute for clear instructions, narrow tools, deterministic validation, or review.

The seventh is artifact trust. A downstream system accepts generated files without validating schema, provenance, completeness, or safety.

These are not reasons to avoid sandboxed agents. They are reasons to design them seriously.

What I would build differently now

This update changes how I would frame enterprise agent architecture.

I would start by classifying the workflow.

If the task is short, stateless, and does not need a workspace, I would keep it simple. Use a direct API path or a basic agent runtime. Do not introduce a sandbox just to make the diagram look mature.

If the task needs files, commands, generated artifacts, mounted data, previews, or resumability, I would treat the sandbox as a product boundary from the beginning.

That means defining the workspace contract before polishing prompts. What inputs are mounted? What outputs are expected? What commands are available? What dependencies are installed? What credentials are injected? What should be snapshotted? What state is recoverable? What memory is allowed to persist?

Then I would define the review contract.

Which actions can proceed automatically? Which actions need human approval? Which generated artifacts are machine-checked? Which failures should stop the run? Which failures can be retried? Which traces matter for audit?

Then I would define the evaluation contract.

What does a successful run produce? What should never happen? What are the known hard cases? What does recovery look like? What happens if the sandbox expires? What happens if a mounted file is missing? What happens if the agent writes output to the wrong path?

The implementation details can vary. The architectural questions should not.
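The classification step at the start of this section is simple enough to write down as code, which makes the decision reviewable instead of implicit. This is a sketch of my own framing, not an SDK feature. The `Workflow` fields and the `execution_path` function are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical checklist; the field names mirror the criteria in the text.


@dataclass
class Workflow:
    needs_files: bool = False
    needs_commands: bool = False
    produces_artifacts: bool = False
    needs_mounted_data: bool = False
    needs_preview: bool = False
    must_resume: bool = False


def execution_path(wf: Workflow) -> tuple[str, list[str]]:
    """Return ("sandbox", reasons) if any criterion holds, else ("direct", [])."""
    reasons = [name for name, value in vars(wf).items() if value]
    if reasons:
        return "sandbox", reasons
    return "direct", []


if __name__ == "__main__":
    classify_ticket = Workflow()  # short, stateless task
    audit_repo = Workflow(needs_files=True, needs_commands=True, must_resume=True)
    print(execution_path(classify_ticket))
    print(execution_path(audit_repo))
```

Returning the reasons alongside the decision matters more than the decision itself, because the reasons are the start of the workspace contract.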

Why this matters for enterprise automation

Enterprise automation fails when teams underestimate the environment around the model.

The model is important, but the work also depends on data access, permissions, files, tools, task state, logging, review, evaluation, and recovery. Those pieces decide whether the system is a useful automation layer or a clever prototype with a fragile runtime.

The April 2026 Agents SDK update is meaningful because it moves more of that environment into first-class SDK concepts.

Sandbox agents give developers a controlled execution plane. Manifests make workspace setup explicit. Snapshots and resumable state make long-running work less brittle. Memory gives repeated workflows a place to carry durable lessons when scoped correctly. Harness and compute separation gives security architecture a clearer boundary. Orchestration guidance helps teams avoid splitting agents before the workflow contract requires it.

None of this removes the hard parts.

Developers still need to design permissions, validation, observability, failure handling, and human review. They still need to evaluate workflows against real tasks. They still need to decide where automation should stop. They still need to keep business outcomes visible, not just agent traces.

But the direction is right.

The agent is no longer only a model with tools. It is becoming a runtime pattern with a workspace, a control plane, state, recovery, and explicit execution boundaries.

That is what production agent work has needed.

The April 2026 follow-up on ADK 2.0 takes this control-plane and compute-plane split and applies it to a single SDK's primitive set, with a working capability matrix and a known production gap.
