Claude Code Now Decides Which Commands Are Safe. That Is the Whole Game.

Claude Code shipped auto mode last weekend. The agent decides which commands are safe to run without asking.

Who decides what an agent may do without asking: this is the question I keep coming back to.

Until now there were two answers. OpenClaw says the user configures it. Anthropic said sandbox everything and confirm before acting. Both break at scale.

Auto mode is a third answer. The model itself draws the line. Not the user. Not a static config.
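To make the three answers concrete, here is a minimal Python sketch. Everything in it is hypothetical: the function names, the allowlist, and `toy_risk`, which is a stand-in for the model's judgment. It is not how Claude Code or OpenClaw actually work, only the shape of each choice.

```python
import shlex
from typing import Callable, Literal

Decision = Literal["allow", "ask"]


def first_token(command: str) -> str:
    """Return the executable name, or an empty string for an empty command."""
    parts = shlex.split(command)
    return parts[0] if parts else ""


def user_configured(command: str, allowlist: set[str]) -> Decision:
    # Answer 1: a static config. Run without asking only if the user
    # has put the executable on an allowlist (hypothetical format).
    return "allow" if first_token(command) in allowlist else "ask"


def confirm_everything(command: str) -> Decision:
    # Answer 2: confirm before acting. Every command goes back to the human.
    return "ask"


def model_decides(
    command: str,
    risk: Callable[[str], float],
    threshold: float = 0.5,
) -> Decision:
    # Answer 3: the line is drawn per command by a risk estimate in [0, 1].
    # `risk` is an assumed interface, not a real API.
    return "allow" if risk(command) < threshold else "ask"


def toy_risk(command: str) -> float:
    # Placeholder for a model's judgment: a crude lookup, NOT a classifier.
    risky = {"", "rm", "sudo", "curl"}
    return 1.0 if first_token(command) in risky else 0.0


if __name__ == "__main__":
    for cmd in ["ls -la", "rm -rf build"]:
        print(
            cmd,
            user_configured(cmd, {"ls", "cat"}),
            confirm_everything(cmd),
            model_decides(cmd, toy_risk),
        )
```

The first two functions are easy to audit and hard to live with: the allowlist never covers enough, and asking every time trains people to click yes. The third moves the whole question into `risk`, which is exactly the part nobody can inspect from the outside.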

Same weekend, METR red-teamed Anthropic's agent monitoring and found novel vulnerabilities. The company building safety tools has gaps in its own safety tools.

(Nobody knows yet if auto mode draws the line in the right place. That is the honest answer.)

The permission model is the product. Everything else is a demo.

If you are deploying agents with any level of autonomy, I would like to hear how you draw the line.