Tag: enterprise ai
4 entries tagged "enterprise ai" — 4 posts, 0 links.
Posts
A practical evaluation loop for multi-agent workflows that catches, before release, the failures that look fine in a demo: task handoff, tool use, permissions, latency, and completion criteria.
Outcome: Established a repeatable evaluation workflow that gates multi-agent releases on task completion, handoff quality, tool correctness, latency, and recoverability instead of demo impressions.
Capability is no longer the hard question about AI agents. The hard questions are what the agent is permitted to do and whether it will do it successfully. Here is why that distinction matters architecturally.
Outcome: Reframed agent deployment decisions around permission scope and blast radius rather than capability, reducing the risk of production failures from over-permissioned agentic systems.
OpenAI's April 2026 Agents SDK update matters because sandboxed execution, manifests, resumable state, and memory move agents closer to real production automation.
Outcome: Framed sandboxed agent execution as an architecture boundary for safer, stateful, long-running automation instead of another demo-layer SDK feature.
A practical map of NVIDIA NeMo for teams that want to curate data, fine-tune open-source LLMs, evaluate them, and move from research checkpoints to production inference.
Outcome: Separated data curation, fine-tuning, alignment, evaluation, export, and serving concerns so open-source LLM customization could move from experiments to governed production workflows.