NanoClaw V2: Human-in-the-loop approvals and Agent to Agent collaboration

April 23, 2026 · Gavriel Cohen

NanoClaw V2 is a rewrite of NanoClaw from the ground up. The new version adds human-in-the-loop approvals through rich cards in messaging apps, and persistent agent-to-agent communication. The combination of these two capabilities with NanoClaw’s existing isolation model can effectively solve the tradeoff between agent capability and safety.

The deal OpenClaw offered

OpenClaw made an implicit offer: set aside all concerns about security and safety, get tremendous value from agents. A lot of people took the deal. The value is real. But so is the risk. For most businesses, the trade wasn’t available to begin with.

The common story is that agent capabilities exist and enterprises are holding back. That’s wrong. The real story is that serious frameworks didn’t ship the capabilities that need access to sensitive data and sensitive actions, because they couldn’t ship them safely.

That matters because agents become more valuable the more sensitive the data they can reach and the more consequential the actions they can take. A research-only agent with no credentials is useful. An agent that can actually do your work is transformative. The safe-to-deploy use cases are the low-value ones by definition. Most of the value sits on the other side of this problem.

The trifecta is one piece of a bigger problem

Simon Willison’s lethal trifecta (private data, untrusted content, external communication) is the canonical frame for prompt-injection-driven exfiltration. It’s the right frame for that attack. But the same preconditions enable a broader class of harm. Consider an agent tricked into issuing a refund, updating a database, deleting emails, or sending messages on your behalf. Nothing is exfiltrated. Something sensitive is done anyway.

The generalization: private data + untrusted content + ability to take sensitive actions. Exfiltration is one such action. Destructive internal operations are others. Defenses scoped to egress don’t cover the rest. You have to think about action, not just communication.

Three problems, three techniques

NanoClaw makes a choice that shapes everything else: sandbox the agent itself, not individual tool calls. Two reasons.

Tool calls aren’t dangerous or safe in isolation. rm -rf project looks dangerous. But if the agent ran mkdir project moments before, it’s just cleanup. A call that looks safe can be destructive in context. You can’t enforce at the operation level. What matters isn’t every swing of the agent’s hammer, it’s what comes out of the black box.

The other reason: the agent needs room to operate freely, or it can’t get real work done. In NanoClaw each agent runs in its own Docker sandbox, with isolation enforced at the OS level. Let it run wild inside. Apply controls at the boundary.

Now three problems.

Credentials. Agents need to use credentials to be useful. But they don’t need to see them. In NanoClaw, each agent has a OneCLI vault that sits next to the sandbox as a man-in-the-middle proxy. The agent calls a permissioned API; the vault enforces policy and injects the credential. The agent never touches the secret. Sensitive credential use can require human approval before the call goes through.
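To make the credential flow concrete, here is a minimal sketch of a vault acting as a proxy. It is an illustration under stated assumptions, not OneCLI's actual interface: the `Vault` and `Request` classes, the allow-list policy format, and the `Authorization` header scheme are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Request:
    """A request as the agent writes it: no secret anywhere in it."""
    service: str
    action: str
    headers: dict = field(default_factory=dict)


class Vault:
    """Hypothetical stand-in for a credential vault that proxies agent calls."""

    def __init__(self, secrets: dict, allowed: set):
        self._secrets = secrets  # service name -> secret, never exposed to the agent
        self._allowed = allowed  # set of (service, action) pairs the policy permits

    def forward(self, request: Request, send: Callable[[Request], str]) -> str:
        # Enforce policy before any credential is attached.
        if (request.service, request.action) not in self._allowed:
            raise PermissionError(f"{request.service}:{request.action} is not permitted")
        # Build a new outbound request so the agent's own copy stays secret-free.
        outbound = Request(
            request.service,
            request.action,
            {**request.headers, "Authorization": f"Bearer {self._secrets[request.service]}"},
        )
        return send(outbound)
```

The agent only ever constructs a `Request`. The secret is added on a copy that leaves through the vault, so nothing inside the sandbox can read it back.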

Accountability. Every sensitive action needs a human name on it. Approvals are a governance mechanism. The person who approves is accountable. This sidesteps the standard identity dilemma where you either give the agent your own identity or give it its own. Identity becomes per-action, attached to approval.

Approvals are what make vaulting safe against injection. The vault authorizes the credential; the approval authorizes the action. Coupled, not independent.
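One way to picture per-action identity is a gate that refuses to run a sensitive action until a named human says yes, and records that name next to the action. This is a sketch only: `Approval`, `AuditEntry`, and `run_gated` are hypothetical names, and in practice the `ask_human` callback would be whatever delivers the approval request to a person.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Approval:
    approver: str   # the human who is accountable for this action
    approved: bool


@dataclass(frozen=True)
class AuditEntry:
    action: str
    approver: str


def run_gated(
    action: str,
    ask_human: Callable[[str], Approval],
    perform: Callable[[], str],
    audit_log: list,
) -> Optional[str]:
    """Run `perform` only after a human approves, and log who approved it."""
    decision = ask_human(action)
    if not decision.approved:
        return None  # denied: the action never runs
    # Identity is attached to this one action, not to the agent as a whole.
    audit_log.append(AuditEntry(action, decision.approver))
    return perform()
```

In this sketch the audit log answers "who authorized this?" for every action that ran, without the agent ever holding a standing identity of its own.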

Blast radius. The agent you interact with doesn’t need to read the web. Agents should be split by sensitivity. A research agent has internet access and no sensitive data. An action agent has sensitive data and no internet. Each is a first-class, persistent agent, with its own environment, instructions, tools, memory, and permissions. Not ephemeral sub-processes. They can communicate through approval-gated calls. You approve the question going out. You approve the response coming back, sanitized: plain text, length-limited, stripped of the usual injection patterns.
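The sanitization step on the response can be sketched as a small filter. The specifics here are assumptions for illustration: the 500-character limit, the markup-stripping regex, and the two example injection patterns are hypothetical, and a short pattern list like this would not catch every attack on its own.

```python
import re

MAX_LEN = 500  # assumed limit; the right value depends on the use case

# Hypothetical examples of "usual injection patterns"; a real list would be longer.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"you are now [^.]*", re.IGNORECASE),
]

MARKUP = re.compile(r"<[^>]+>")  # HTML/XML-like tags


def sanitize(text: str, max_len: int = MAX_LEN) -> str:
    """Reduce an inter-agent response to short plain text."""
    # Plain text: drop markup and non-printable control characters.
    text = MARKUP.sub("", text)
    text = "".join(ch for ch in text if ch.isprintable() or ch in "\n\t")
    # Replace known injection phrasings with a visible marker.
    for pattern in INJECTION_PATTERNS:
        text = pattern.sub("[removed]", text)
    # Length limit: truncate, then trim surrounding whitespace.
    return text[:max_len].strip()
```

The filter is deliberately lossy. The human approving the response sees the same reduced text the action agent will receive, which is what makes the approval meaningful.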

Ergonomics is the unlock

None of these techniques is new on its own. Vaults exist. Approval frameworks exist. Multi-agent orchestration exists. The reason nothing has been put together for the highest-value real-world use cases is ergonomics, both for the agent and for the human.

Agent ergonomics first. An agent that’s constantly bumping up against the limitations of its environment can’t get work done. Agents want to run free. That’s why the sandbox goes around the agent, not around each tool call. Inside the box the agent operates without friction. Controls live at the boundary.

Human ergonomics is where approvals live or die.

Approvals that interrupt the user every few seconds kill productivity. Approvals that require a dashboard visit die from neglect. Approvals that arrive with no context get rubber-stamped, which is worse than no approval at all, because it manufactures consent.

The approach that works: approvals come to where the user already is. Vercel’s Chat SDK is the messaging layer, reaching Slack, Teams, WhatsApp, or wherever the team already works. Each request arrives with enough context to evaluate, and only at genuinely sensitive moments. There’s real engineering around it: policy analysis to reduce the number of requests, and surfacing the information that matters for each decision. It’s not a solved problem, but it’s an engineering problem, not a hard physics problem.

It’s a dial

The old framing said: safety or capability, pick one. Most individuals and companies picked safety and got no valuable agents. Some picked capability and exposed themselves.

The new framing is a dial. You tune each setting to your use case: approval frequency, sanitization strictness, compartment boundaries, credential scope. High-sensitivity workloads crank one way, low-sensitivity the other.
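As a rough illustration, the dial could be expressed as one settings object per agent. Everything in this sketch is hypothetical: the `Dial` class, its field names, the two presets, and the scope string are invented for the example and are not NanoClaw's configuration format.

```python
from dataclasses import dataclass
from enum import Enum


class ApprovalMode(Enum):
    SENSITIVE_ONLY = "sensitive_only"  # ask only at sensitive moments
    EVERY_ACTION = "every_action"      # ask before anything leaves the sandbox


@dataclass(frozen=True)
class Dial:
    approval_mode: ApprovalMode   # approval frequency
    max_reply_chars: int          # sanitization strictness
    internet_access: bool         # compartment boundary
    credential_scopes: frozenset  # credential scope


def needs_approval(dial: Dial, action_is_sensitive: bool) -> bool:
    """Decide whether a human must sign off on this action."""
    if dial.approval_mode is ApprovalMode.EVERY_ACTION:
        return True
    return action_is_sensitive


# Two assumed presets at opposite ends of the dial.
HIGH_SENSITIVITY = Dial(ApprovalMode.EVERY_ACTION, 280, False, frozenset({"payments:refund"}))
LOW_SENSITIVITY = Dial(ApprovalMode.SENSITIVE_ONLY, 4000, True, frozenset())
```

A high-sensitivity agent asks before every action and has no internet access. A low-sensitivity agent browses freely and asks only when an action is flagged as sensitive.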

Trade-offs haven’t disappeared. Latency and configuration work exist. But they’ve been massively mispriced, and most are a one-time architectural cost, not an ongoing tax.

OpenClaw’s deal: accept the risk, get the value. That trade wasn’t the only way. The pieces were there. They needed the right architecture and ergonomics. That’s NanoClaw V2. Agents that are totally locked down. And totally free.