§1Governance is not a board paper once the agent can act
Most AI governance still assumes the dangerous thing is the answer. That made sense when the system was a chatbot producing text for a human to accept, ignore or copy. It makes less sense when the system can call tools, inspect files, open tickets, query customer records, write code, book services or send messages into another operational system.
Microsoft’s Agent Governance Toolkit is notable because it treats agent governance as a runtime problem. The important word is not “governance”. It is “toolkit”. The repository is presented as runnable code covering policy enforcement, zero-trust identity, execution sandboxing and reliability engineering, mapped across the OWASP Agentic Top 10. That is a different posture from another taxonomy, maturity model or board slide.
For regulated buyers, this distinction matters. A local authority, bank, health infrastructure provider or education body cannot safely buy an agent system on the basis that the vendor has a responsible AI policy. The operational question is narrower and harder: what stops the agent doing the wrong thing at the point of action?
If an agent tries to fetch a resident record, escalate a complaint, call an external API or write into a case-management system, governance has to be in the execution path. It has to see the actor, the instruction, the tool, the data class, the proposed action and the policy decision before the action happens. Anything else is after-the-fact reporting: useful for incident review, poor as a control.
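A minimal sketch of what "in the execution path" could mean in code follows. It is illustrative only and is not taken from the Microsoft toolkit: the names `ActionRequest`, `Decision`, `gate` and `example_policy` are hypothetical, and the single rule in the policy is an assumption chosen to show the ordering. The point is that the decision is recorded before anything runs, and a denied request never reaches the tool.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class ActionRequest:
    actor: str        # the agent's own identity, not the human's account
    instruction: str  # the task the agent was given
    tool: str         # the tool it wants to call
    data_class: str   # e.g. "public", "personal" (hypothetical labels)
    action: str       # the proposed operation, e.g. "read" or "write"


@dataclass(frozen=True)
class Decision:
    request: ActionRequest
    allowed: bool
    reason: str
    decided_at: str


class DeniedAction(Exception):
    """Raised when the policy layer refuses a proposed action."""


AUDIT_LOG: List[Decision] = []


def gate(request: ActionRequest,
         policy: Callable[[ActionRequest], Tuple[bool, str]],
         execute: Callable[[ActionRequest], object]) -> object:
    """Decide, record the decision, and only then run the action."""
    allowed, reason = policy(request)
    stamp = datetime.now(timezone.utc).isoformat()
    AUDIT_LOG.append(Decision(request, allowed, reason, stamp))
    if not allowed:
        raise DeniedAction(reason)
    return execute(request)


def example_policy(request: ActionRequest) -> Tuple[bool, str]:
    # Assumed rule for illustration: agents may not write non-public data.
    if request.action == "write" and request.data_class != "public":
        return False, "writes to non-public data are not permitted"
    return True, "within declared scope"
```

Because the log entry is appended before the allow or deny branch, a denial leaves the same quality of evidence as an approval.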
The Microsoft toolkit is therefore interesting less as a Microsoft artefact and more as a signal. The substrate is moving. Agent governance is being pulled down from policy committees into identity, sandboxing, logs, permissions, tests, retries, timeouts and denial paths. That is where serious buyers should be looking.
§2The control plane moves to every tool call
The weak version of agent safety is prompt-level instruction. Tell the model not to access sensitive data. Tell it not to exfiltrate secrets. Tell it to ask for confirmation before it performs a high-impact task. This is better than nothing, but it confuses a behavioural request with an enforceable boundary.
Agents need a control plane around tool use. Before an action executes, the runtime should be able to ask: is this agent allowed to use this tool; is this user allowed to request this action; does this data class require a higher approval threshold; is this destination permitted; has the request been modified by untrusted content; is the action inside the declared purpose of the task?
That is why the toolkit’s emphasis on policy enforcement is more important than any one model benchmark. Prompt injection, excessive agency, insecure tool use, unauthorised access and data leakage are not solved by asking the model to behave. They are reduced by moving from instruction to mediation. The runtime must mediate the agent’s relationship with tools, files, identities and networks.
There is a simple test for this. If the agent produces a dangerous tool call, does the system rely on the model changing its mind, or can an external policy layer deny the call? The first is persuasion. The second is governance.
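The runtime questions listed earlier in this section can be written as an external check that returns reasons rather than relying on the model. The sketch below is an assumption-heavy illustration, not the toolkit's API: the policy tables, identifiers such as `casework-agent` and `officer-17`, and the `tainted` flag (standing in for whatever mechanism tracks untrusted content) are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ToolCall:
    agent_id: str
    user_id: str
    tool: str
    data_class: str
    destination: str
    purpose: str
    tainted: bool  # True if the arguments were shaped by untrusted content


# Hypothetical policy tables; a real deployment would load and version these.
AGENT_TOOLS = {"casework-agent": {"case_lookup", "send_email"}}
USER_ACTIONS = {"officer-17": {"case_lookup", "send_email"}}
NEEDS_APPROVAL = {"special_category"}
ALLOWED_DESTINATIONS = {
    "case_lookup": {"internal"},
    "send_email": {"internal", "resident"},
}
DECLARED_PURPOSES = {"complaint_handling"}


def evaluate(call: ToolCall, approved: bool = False) -> List[str]:
    """Return every reason to deny the call; an empty list means allow."""
    reasons = []
    if call.tool not in AGENT_TOOLS.get(call.agent_id, set()):
        reasons.append("agent not permitted to use tool")
    if call.tool not in USER_ACTIONS.get(call.user_id, set()):
        reasons.append("user not permitted to request action")
    if call.data_class in NEEDS_APPROVAL and not approved:
        reasons.append("data class requires approval")
    if call.destination not in ALLOWED_DESTINATIONS.get(call.tool, set()):
        reasons.append("destination not permitted")
    if call.tainted:
        reasons.append("request shaped by untrusted content")
    if call.purpose not in DECLARED_PURPOSES:
        reasons.append("outside declared purpose")
    return reasons
```

Returning all reasons, rather than stopping at the first, is a design choice that makes denials easier to explain in an incident review.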
OWASP mapping helps because it gives teams a shared threat vocabulary, but the value is not the mapping itself. The value is the translation from threat class to executable guardrail. A regulated buyer should not stop at the phrase “covers the OWASP Agentic Top 10”. They should ask which control blocks which failure mode, where the decision is logged, who can change the policy and whether the denial is testable in a pre-production environment.
§3Identity is the hard line between agent and operator
The identity problem is where many agent pilots become unsafe. If an agent acts through a broad human account, it inherits permissions that were never designed for autonomous use. The audit trail then says “Alfred accessed the record”, when in practice an agent operating under Alfred’s session did so after interpreting a task, a document and a tool schema.
Zero-trust identity for agents means treating the agent as its own actor. It should have scoped credentials, bounded authority, short-lived access, explicit purpose and per-action evidence. It should not inherit a user’s entire permission set simply because that user clicked start. The agent may be acting on behalf of a person, but it is not the person.
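One way to picture scoped, short-lived agent credentials is as an intersection: the agent receives only what it asked for, that the user also holds, and that the agent type is ever allowed to hold. The sketch below assumes this model; `AgentCredential`, `issue`, `permits` and the scope strings are hypothetical names, not a real identity provider's API.

```python
import secrets
import time
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class AgentCredential:
    token: str
    agent_id: str        # the agent is its own actor
    on_behalf_of: str    # the human requester, recorded separately
    scopes: FrozenSet[str]
    purpose: str
    expires_at: float


def issue(agent_id: str, on_behalf_of: str, requested: Iterable[str],
          user_scopes: Iterable[str], agent_ceiling: Iterable[str],
          purpose: str, ttl_seconds: float = 300,
          now: Optional[float] = None) -> AgentCredential:
    """Grant only scopes that are requested, held by the user, and within
    the ceiling set for this kind of agent."""
    now = time.time() if now is None else now
    granted = frozenset(requested) & frozenset(user_scopes) & frozenset(agent_ceiling)
    return AgentCredential(secrets.token_urlsafe(16), agent_id, on_behalf_of,
                           granted, purpose, now + ttl_seconds)


def permits(cred: AgentCredential, scope: str,
            now: Optional[float] = None) -> bool:
    """A scope is usable only while the credential is unexpired."""
    now = time.time() if now is None else now
    return now < cred.expires_at and scope in cred.scopes
```

Under these assumptions, a user holding `cases:write` does not hand that scope to the agent unless the agent's ceiling also includes it, and the grant lapses after five minutes by default.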
This matters for UK regulated environments because auditability is not decorative. Under UK GDPR security obligations, ICO AI guidance, FCA model risk expectations and public-sector accountability duties, organisations need to reconstruct who or what did what, under which authority and with which control decision. Agent identity has to be legible in those records.
The same point applies to approval. Human approval should be a capability with evidence, not a vague message in a collaboration channel. Which action was approved; by whom; for which data; for which time window; under which policy; was the agent allowed to alter the action after approval; what happened if the approval expired? These are substrate questions, not ethics questions.
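Approval as a capability with evidence can be sketched by binding the approval to a digest of the exact action, a policy version and a time window. This is an illustration under stated assumptions: the `Approval` record, the helper names and the use of a SHA-256 digest over canonical JSON are choices made here, not a description of any particular product.

```python
import hashlib
import json
from dataclasses import dataclass


def action_digest(action: dict) -> str:
    """Stable fingerprint of the proposed action, including its data class."""
    canonical = json.dumps(action, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class Approval:
    approver: str
    digest: str          # which action was approved
    policy_version: str  # under which policy
    not_before: float    # start of the time window
    not_after: float     # end of the time window


def approve(approver: str, action: dict, policy_version: str,
            now: float, window_seconds: float) -> Approval:
    return Approval(approver, action_digest(action), policy_version,
                    now, now + window_seconds)


def check(approval: Approval, action: dict,
          policy_version: str, now: float) -> str:
    """Re-validate the approval at the moment of execution."""
    if not (approval.not_before <= now <= approval.not_after):
        return "deny: approval expired or not yet valid"
    if approval.policy_version != policy_version:
        return "deny: policy changed since approval"
    if approval.digest != action_digest(action):
        return "deny: action altered after approval"
    return "allow"
```

Because the digest covers the whole action, an agent that edits the recipient or the data class after sign-off invalidates the approval rather than inheriting it.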
Microsoft putting zero-trust identity next to policy enforcement is therefore the right emphasis. Identity is not an access-management footnote. It is the thing that lets the organisation separate user intent, agent interpretation, tool authority and operational responsibility.
§4Sandboxing is not optional when agents can improvise
An agent that can plan, write code, browse, call APIs or manipulate files is a system that can improvise. That improvisation is the point of using it. It is also the reason sandboxing becomes a production requirement rather than a security add-on.
Execution sandboxing answers a blunt question: if the agent does something unexpected, how far can the blast travel? The sandbox should limit filesystem access, network egress, tool availability, credential exposure, process lifetime and interaction with production systems. For agents that run code, this is especially important. Generated code is still code. It can read, write, loop, call, leak and fail.
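The declared boundary of a sandbox can be expressed as data and checked before a path, URL or tool is used. The sketch below is a simplified illustration with hypothetical names (`SandboxPolicy`, `path_allowed`, `egress_allowed`, `tool_allowed`); real isolation has to be enforced by the operating system, container or VM, and these in-process checks only show what the boundary says.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import FrozenSet
from urllib.parse import urlsplit


@dataclass(frozen=True)
class SandboxPolicy:
    workdir: Path                # the only directory tree the agent may touch
    egress_hosts: FrozenSet[str]  # exact hostnames the agent may reach
    tools: FrozenSet[str]         # tools available inside the sandbox


def path_allowed(policy: SandboxPolicy, candidate: str) -> bool:
    """Resolve the path first so '..' segments and absolute paths are caught."""
    root = policy.workdir.resolve()
    target = (root / candidate).resolve()
    return target == root or root in target.parents


def egress_allowed(policy: SandboxPolicy, url: str) -> bool:
    """Allow only HTTPS requests to hosts on the explicit allowlist."""
    parts = urlsplit(url)
    return parts.scheme == "https" and parts.hostname in policy.egress_hosts


def tool_allowed(policy: SandboxPolicy, tool: str) -> bool:
    return tool in policy.tools
```

Matching on the exact hostname, rather than a substring, is deliberate: a lookalike such as `api.example.org.evil.test` should not pass because it contains an approved name.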
The reliability part of the toolkit should not be treated as separate from governance. Retries, timeouts, circuit breakers, fallback paths and state management are governance mechanisms when the system can act. A runaway loop is not just an engineering defect. In a regulated service, it can become repeated customer contact, duplicated case updates, excessive data access or uncontrolled spend.
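A small sketch shows how reliability controls double as governance. It combines three assumed mechanisms: bounded retries, a per-task budget of side-effecting calls, and an idempotency key so a repeated request does not repeat the effect. `ActionRunner` and `BudgetExceeded` are hypothetical names, and a production system would also pass the key to the downstream service, since a call that timed out may still have taken effect there.

```python
import time
from typing import Callable, Dict


class BudgetExceeded(Exception):
    """Raised when a task has used its allowance of side-effecting calls."""


class ActionRunner:
    def __init__(self, max_attempts: int = 3, max_calls: int = 5,
                 sleep: Callable[[float], None] = time.sleep):
        self.max_attempts = max_attempts
        self.max_calls = max_calls
        self.sleep = sleep
        self.calls_made = 0
        self._completed: Dict[str, object] = {}

    def run(self, key: str, action: Callable[[], object]) -> object:
        # Same idempotency key: return the stored result, no second effect.
        if key in self._completed:
            return self._completed[key]
        last_error = None
        for attempt in range(self.max_attempts):
            # The budget stops a runaway loop even if retries remain.
            if self.calls_made >= self.max_calls:
                raise BudgetExceeded(f"call budget of {self.max_calls} spent")
            self.calls_made += 1
            try:
                result = action()
            except TimeoutError as err:
                last_error = err
                if attempt + 1 < self.max_attempts:
                    self.sleep(0.1 * 2 ** attempt)  # exponential backoff
                continue
            self._completed[key] = result
            return result
        raise TimeoutError(
            f"{key} failed after {self.max_attempts} attempts") from last_error
```

The budget is counted per runner, so creating one runner per task gives each task its own ceiling on customer contact, case updates or spend.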
Buyers should also care about replay. When something goes wrong, it is not enough to retain the final answer. The useful record is the event stream: prompt, context, retrieved material, policy decision, identity, tool call, tool response, retry, denial, escalation and final output. Without that, the organisation is left arguing from screenshots.
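The event stream described above could be kept as an append-only log in which each entry commits to the one before it. The hash chain is an assumption added here to make tampering detectable; the article does not say the toolkit works this way, and `EventLog` and the event kind names are hypothetical labels derived from the list in the paragraph.

```python
import hashlib
import json
from typing import List

EVENT_KINDS = {
    "prompt", "context", "retrieval", "policy_decision", "identity",
    "tool_call", "tool_response", "retry", "denial", "escalation",
    "final_output",
}
GENESIS = "0" * 64  # placeholder "previous hash" for the first event


def _digest(body: dict) -> str:
    return hashlib.sha256(
        json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


class EventLog:
    def __init__(self) -> None:
        self.events: List[dict] = []

    def append(self, kind: str, payload: dict) -> dict:
        if kind not in EVENT_KINDS:
            raise ValueError(f"unknown event kind: {kind}")
        prev = self.events[-1]["hash"] if self.events else GENESIS
        body = {"seq": len(self.events), "kind": kind,
                "payload": payload, "prev": prev}
        event = dict(body, hash=_digest(body))
        self.events.append(event)
        return event

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered event breaks it."""
        prev = GENESIS
        for index, event in enumerate(self.events):
            body = {k: event[k] for k in ("seq", "kind", "payload", "prev")}
            if event["seq"] != index or event["prev"] != prev:
                return False
            if _digest(body) != event["hash"]:
                return False
            prev = event["hash"]
        return True
```

Replaying an incident then means reading the events in sequence order, with `verify()` giving some assurance that the record has not been edited since it was written.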
This is where agent infrastructure starts to look less like model deployment and more like transaction processing with probabilistic components. The model can be non-deterministic. The runtime cannot be casual. It needs stable control points, durable logs and clear failure behaviour.
§5What regulated buyers should ask next
The practical lesson from the toolkit is procurement discipline. If a supplier says their agent is governed, ask to see the enforcement path. Not the architecture diagram. Not the AI principles. The path.
- Show a tool call that is allowed, and the policy record that allowed it.
- Show the same tool call denied because the data class, user, destination or purpose changed.
- Show the agent identity separately from the human requester.
- Show the sandbox boundary, including network and credential restrictions.
- Show the logs needed for incident review and regulatory evidence.
- Show how a policy change is approved, versioned, tested and rolled back.
These questions are not hostile. They are the minimum needed before moving an agent from a controlled pilot into a live service. For FCA-regulated firms, they connect to model risk management, operational resilience and third-party oversight. For local authorities, they connect to democratic accountability, records management, FOIA exposure and service fairness. For health and education infrastructure, they connect to confidentiality, safeguarding and continuity of service.
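The final request in the list above, a policy change that is approved, versioned, tested and rolled back, can be made concrete with a small sketch. Everything here is hypothetical: `PolicyStore`, the shape of the rules (tool name mapped to permitted data classes) and the idea of expressing tests as named predicates are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Set


@dataclass(frozen=True)
class PolicyVersion:
    version: int
    rules: Dict[str, Set[str]]  # tool -> data classes it may handle
    approved_by: str


class PolicyStore:
    def __init__(self) -> None:
        self._history: List[PolicyVersion] = []
        self._active: Optional[PolicyVersion] = None

    @property
    def active(self) -> Optional[PolicyVersion]:
        return self._active

    def publish(self, rules: Dict[str, Set[str]], approved_by: str,
                tests: Dict[str, Callable[[PolicyVersion], bool]]) -> PolicyVersion:
        """Activate a new version only if every named test passes."""
        candidate = PolicyVersion(len(self._history) + 1, rules, approved_by)
        failures = [name for name, test in tests.items() if not test(candidate)]
        if failures:
            raise ValueError(f"policy tests failed: {failures}")
        self._history.append(candidate)
        self._active = candidate
        return candidate

    def rollback(self, version: int) -> PolicyVersion:
        """Reactivate an earlier version; history is kept, not rewritten."""
        if not 1 <= version <= len(self._history):
            raise KeyError(f"no such policy version: {version}")
        self._active = self._history[version - 1]
        return self._active
```

A supplier able to demonstrate something equivalent, with real approvers and real tests, is showing the enforcement path rather than describing it.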
The deeper shift is that agent assurance cannot be outsourced entirely to a model provider. A capable model can still be embedded in a weak runtime. Conversely, a less glamorous model in a well-governed runtime may be the safer production choice. The buyer’s attention needs to move from model claims to control evidence.
§6What the toolkit does not solve
Microsoft’s Agent Governance Toolkit should not be read as certification. OWASP coverage is useful, but it is not proof that a deployment is safe, lawful or suitable for a specific public service, financial workflow or clinical-adjacent process. Mapping threats is not the same as validating controls under local conditions.
The toolkit also cannot decide an organisation’s risk appetite, lawful basis, data-retention rules, human-review thresholds or service-design obligations. It does not remove the need for DPIAs, model evaluations, accessibility testing, incident rehearsals, supplier due diligence or clear accountability between business owner, technology owner and risk owner.
Nor does it make policy writing easy. A bad policy enforced perfectly is still a bad policy. Over-broad denial can make an agent useless. Under-specified permission can make it dangerous. The hard work is converting operational rules into machine-enforceable decisions without hiding judgement inside vague categories.
That said, the toolkit points in the correct direction. The agent era will not be governed by better disclaimers. It will be governed, if it is governed at all, by identity boundaries, policy checks, sandbox limits, reliability controls and evidence trails built into the runtime. This is the substrate buyers should ask for before they let agents near production work.
