Thought Leadership January 14, 2026 9 min read

Approval Fatigue Is an Enterprise Risk in Agent Sandboxes

If approvals and sandboxes live in personal settings, policy becomes a suggestion. Fatigue turns security decisions into muscle memory at exactly the wrong time.


Executive Summary

Vendor guardrails are improving, but the critical controls are still user-driven: approvals, allowlists, and sandbox toggles. In enterprises, that turns security into a human-factor problem: uneven behavior across teams, inconsistent training, and approvals that get clicked through when the pressure to ship is highest.

Enterprise Policy Gap: Controls Default to the User

Enterprise policy does not exist if controls live in user settings. Codex CLI uses slash commands like /approvals and /status to manage approval modes and sandbox scope. Claude Code uses /sandbox to enable OS-level isolation and domain allowlists via a proxy. Cursor includes allowlists and "Run Everything" options, but explicitly says those are not security controls. These are powerful features, but by default they live in per-user settings and ad-hoc decisions, not centrally enforced company policy.

YOLO Mode: One Switch, No Guardrails

That policy gap turns into real risk the moment a single toggle disables every safeguard. Codex exposes a dangerous convenience flag: --yolo (alias for --dangerously-bypass-approvals-and-sandbox). It disables both approvals and the sandbox entirely. This is fine inside a locked-down container, but in an enterprise it is effectively “turn the keys over to the agent.” Community threads show users reaching for --yolo when approvals feel too frequent.
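
One way to close this gap is a launcher wrapper that inspects the arguments before the agent starts. The sketch below is a minimal illustration, not a Codex feature: the two flag names come from the paragraph above, while `find_bypass_flags`, `check_invocation`, and the `in_locked_down_container` parameter are hypothetical names introduced here. How a real deployment would detect a locked-down container is left out.

```python
# Flags named in the text above. The wrapper around them is a hypothetical sketch.
BYPASS_FLAGS = {"--yolo", "--dangerously-bypass-approvals-and-sandbox"}


def find_bypass_flags(argv):
    """Return the bypass flags present in an argument list, in order of appearance."""
    return [arg for arg in argv if arg in BYPASS_FLAGS]


def check_invocation(argv, in_locked_down_container=False):
    """Return (allowed, reason) for a proposed agent invocation.

    in_locked_down_container is an assumption: the caller is expected to
    establish it by some trusted means, not by asking the user.
    """
    found = find_bypass_flags(argv)
    if not found:
        return True, "no bypass flags"
    if in_locked_down_container:
        return True, "bypass permitted inside locked-down container: " + ", ".join(found)
    return False, "bypass flags blocked by policy: " + ", ".join(found)


if __name__ == "__main__":
    print(check_invocation(["codex", "--yolo", "fix the failing tests"]))
```

The point of the design is that the decision lives in the wrapper, which the organization controls, and not in the developer's shell history.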

Approval Fatigue Scales Faster Than Security Training

Anthropic reports that sandboxing in Claude Code reduces permission prompts by 84%. That is good for productivity, but it also reveals the core human issue: if the system relies on repeated approvals, users will click through. The more prompts you show, the less scrutiny each one gets.

Concrete Failure Scenarios

  • Accidental secrets upload: a debug bundle includes ~/.ssh or .env, then gets pushed.
  • PII leakage: an agent extracts a customer export for “analysis” and uploads it to a gist or ticket.
  • Destructive remote action: a routine cleanup step runs kubectl delete or terraform destroy against prod.
  • Wrong-repo mutation: the agent commits or force-pushes to the wrong remote or branch.
  • Shadow network calls: a tool fetches from unapproved domains because the user “just allowed it once.”
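
Some of these scenarios can be caught before a prompt ever reaches a tired human. As a rough sketch, a pre-execution check could route obviously destructive commands to escalation and skip the one-click approval. The patterns below cover only the commands named in the list above; `needs_escalation` and the pattern table are hypothetical, and a real denylist would need to be far more complete.

```python
import shlex

# Illustrative (tool, verb) pairs taken from the scenarios above; not a complete list.
DESTRUCTIVE_VERBS = [
    ("kubectl", "delete"),
    ("terraform", "destroy"),
]


def is_force_push(tokens):
    """True for `git push` invocations that carry --force or -f."""
    return tokens[:2] == ["git", "push"] and any(t in ("--force", "-f") for t in tokens[2:])


def needs_escalation(command):
    """Return True if a shell command string should be escalated, not approved inline.

    The match is deliberately loose: a verb anywhere after the tool name
    counts, so flags placed before the verb do not hide it. This over-flags
    on purpose.
    """
    tokens = shlex.split(command)
    if not tokens:
        return False
    for tool, verb in DESTRUCTIVE_VERBS:
        if tokens[0] == tool and verb in tokens[1:]:
            return True
    return is_force_push(tokens)


if __name__ == "__main__":
    for cmd in ("kubectl get pods", "terraform destroy -auto-approve"):
        print(cmd, "->", needs_escalation(cmd))
```

A check like this does not replace sandboxing. It only reduces how often a high-stakes decision arrives looking like a routine prompt.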

Why This Is an Enterprise Problem, Not a User Problem

Those human factors compound at scale; one person's shortcut becomes an enterprise failure mode. In enterprise environments, the security boundary is not the individual developer. It is the company policy. If the controls depend on a user remembering to keep sandboxing on, or to avoid YOLO mode, the organization does not actually have a control plane. That creates inconsistent outcomes across teams and shifts accountability from policy to personal judgment.

What Enterprise-Grade Controls Should Look Like

Minimum Viable Guardrails

  • Central policy enforcement: lock sandbox modes and approval policies at the org level.
  • Audit logs: record approvals, allowlist changes, and any escape hatches used.
  • Least privilege by default: scoped tool access, isolated environments, and domain allowlists.
  • Training + playbooks: clear guidelines for what to approve, when to pause, and when to escalate.
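
The first two guardrails can be sketched together: org-level settings that override whatever the user configured, plus a log entry whenever a user's request is overridden. This is a minimal illustration under stated assumptions. `OrgPolicy`, the setting names (`sandbox`, `approval_mode`), and their values are hypothetical and do not correspond to any vendor's configuration schema.

```python
from dataclasses import dataclass, field


@dataclass
class OrgPolicy:
    """Settings pinned by the organization; user values for these keys are ignored."""

    locked: dict
    audit_log: list = field(default_factory=list)

    def resolve(self, user, user_settings):
        """Merge user settings with the org policy and return the effective settings.

        Locked keys always take the org value. Every attempt to set a
        locked key to a different value is recorded in audit_log.
        """
        effective = dict(user_settings)
        for key, required in self.locked.items():
            requested = user_settings.get(key, required)
            if requested != required:
                self.audit_log.append(
                    {"user": user, "setting": key, "requested": requested, "enforced": required}
                )
            effective[key] = required
        return effective


if __name__ == "__main__":
    # Hypothetical setting names and values, for illustration only.
    policy = OrgPolicy(locked={"sandbox": "on", "approval_mode": "on-request"})
    print(policy.resolve("dev1", {"sandbox": "off", "theme": "dark"}))
    print(policy.audit_log)
```

Unlocked preferences such as `theme` pass through untouched, which keeps the policy narrow: the organization pins only the settings that define the security boundary.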

How AARSM Helps

AARSM lets security teams lock approvals and sandbox policies centrally and audit every override. That way safeguards do not depend on a tired developer making the right call under deadline pressure.


About This Analysis

This analysis draws on OpenAI Codex documentation, Anthropic’s Claude Code sandboxing write-up and docs, Cursor’s agent security guidance, and community discussions about approval fatigue and YOLO mode.
