Meta's Support Bot Handed Out Password Resets to the Wrong People

§1 The conversation was treated as the credential

The reported incident is plain enough to state in one sentence. Meta's AI support bot for Instagram handed password-reset links to people who were not the account holders. The conversational request to recover an account was treated as sufficient authorisation to dispatch the link that lets someone recover that account.

Strip away the chatbot framing and you are left with a vending machine for account takeovers. The most consequential action in any consumer platform is the credential-recovery flow, because it is the one path designed to hand control to someone who has lost their password. Everything in that flow exists to answer a single question: are you the legitimate owner of this account? Meta apparently bolted a friendly conversational interface in front of that flow and forgot to wire the question back in.

This is not the AI hallucinating. It did not invent a link. It dispatched a real, working authentication artefact to a real session. The model behaved exactly as instructed: a user asked for help recovering an account, so it delivered the recovery mechanism. The failure is that nothing between the request and the dispatch verified that the requesting session matched the target identity.

§2 Principal binding is the thing that was missing

In agent design there is a distinction that buyers in regulated sectors learn fast and consumer product teams keep relearning. The principal is the entity on whose authority an action is taken. Binding is the act of proving, at the moment of a consequential action, that the requester is that principal. A support agent that can trigger an outbound authentication channel is, whether anyone labelled it this way or not, a privileged dispatcher. It needs principal binding before it fires.

What Meta shipped, on the reported facts, was an agent that conflated two different things: the request and the entitlement. A user saying "I need to reset the password for @victim" is a request. Whether that user is entitled to reset @victim's password is a separate fact that must be established by something other than the user asserting it. The bot collapsed those two into one. The conversational turn became the authorisation.
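The request/entitlement split can be made concrete in a few lines. The sketch below is illustrative only and is not Meta's code: `ResetRequest`, `Entitlement`, `PROVEN_SESSIONS` and `establish_entitlement` are hypothetical names, and the in-memory dictionary stands in for whatever auth service would actually hold proof of ownership. The point it shows is that an entitlement is derived from server-side evidence and never from what the request says.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ResetRequest:
    """What the user typed: a claim about which account to recover."""
    session_id: str
    target_account: str


@dataclass(frozen=True)
class Entitlement:
    """A fact established by evidence, never by the request text."""
    account: str
    evidence: str


# Hypothetical server-side record of which sessions have proven control
# of which accounts. A real system would query an auth service instead.
PROVEN_SESSIONS = {"sess-owner": "@victim"}


def establish_entitlement(request: ResetRequest) -> Optional[Entitlement]:
    """Return an Entitlement only if server-side state backs the claim."""
    proven_account = PROVEN_SESSIONS.get(request.session_id)
    if proven_account is not None and proven_account == request.target_account:
        return Entitlement(account=proven_account, evidence="authenticated_session")
    # The request named an account, but nothing proves this session owns it.
    return None
```

Under these assumptions, a stranger's session asking for `@victim` yields `None`, while the owner's session yields an `Entitlement`. Both sent the same words; only one had the evidence.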

This is the confused-deputy problem, one of the oldest in security, dressed in 2026 clothes. A confused deputy is a privileged component tricked into misusing its authority on behalf of a less privileged caller. The bot had the authority to send reset links. The attacker did not. The bot lent its authority to the attacker because it could not tell the difference between a legitimate owner and a stranger who phrased things confidently.

§3 The gate that should have existed

The fix is not a better model. It is a pre-send verifier sitting in front of the link-delivery action. Before the agent is permitted to dispatch a recovery link to an address or device, a deterministic check must confirm a match between the requesting session and the target account identity. That proof could be confirmed email or device control, an existing authenticated session, or a recovery factor the legitimate owner registered earlier. It has to be something the attacker cannot supply by typing.

The architectural shape matters here. The verifier must sit outside the model. If the check lives inside the prompt, as an instruction telling the model to "only send links to verified owners," then the model is the gate, and the model can be talked round. Models are persuadable. That is their entire value and their entire danger. A persuadable thing must never be the last line of defence in front of a privileged action. The gate has to be code that the conversation cannot argue with.

Concretely, the dispatch action should have been unreachable until a non-conversational attestation returned true. The agent proposes; the verifier disposes. The model can be as helpful and as fooled as it likes, because the helpful and fooled output simply does not clear the gate without an out-of-band proof of ownership.
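A minimal sketch of "the agent proposes; the verifier disposes" follows. Everything in it is an assumption made for illustration: `ProposedAction`, `GateRefused`, `make_gated_dispatcher`, the `VERIFIED` set and the `sent` list are hypothetical stand-ins for an out-of-band attestation service and a real mailer. What it tries to show is the shape: the send function is only reachable through a wrapper that checks identifiers, and no conversational text is passed to the check.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class ProposedAction:
    """What the model asks for. Treated as untrusted input."""
    name: str
    session_id: str
    target_account: str


class GateRefused(Exception):
    """Raised when the verifier does not attest ownership."""


def make_gated_dispatcher(
    attest: Callable[[str, str], bool],
    send_link: Callable[[str], None],
) -> Callable[[ProposedAction], None]:
    """Wrap send_link so it is only reachable through the verifier."""

    def dispatch(action: ProposedAction) -> None:
        # The check reads only session and account identifiers. No chat
        # text reaches it, so persuasion has nothing to act on.
        if not attest(action.session_id, action.target_account):
            raise GateRefused(f"no ownership proof for {action.target_account}")
        send_link(action.target_account)

    return dispatch


# Hypothetical stand-ins for an attestation service and a mailer.
VERIFIED = {("sess-owner", "@victim")}
sent: List[str] = []

dispatch = make_gated_dispatcher(
    attest=lambda session_id, account: (session_id, account) in VERIFIED,
    send_link=sent.append,
)
```

Calling `dispatch(ProposedAction("send_reset_link", "sess-stranger", "@victim"))` raises `GateRefused` and leaves `sent` empty, however the model was talked into proposing it. The same proposal from `sess-owner` goes through.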

§4 Why this keeps happening

It keeps happening because teams reason about agents as conversation engines and not as dispatchers of consequence. The instinct is to ask "is the bot polite, accurate, on-brand?" The question that actually matters is "what real actions can this thing trigger, and what binds each one to an authorised principal?" The first question is product. The second is substrate. When you skip the second, you ship a charming front end to your most dangerous internal capability.

There is also a temptation, in big organisations, to treat the support surface as low stakes because it is labelled support. Support is where the recovery flows live. Support is the softest path to the hardest target. A consumer platform's help channel is, by function, the place where ownership is contested and reassigned. That is the last place you want an agent that trusts the requester's framing.

§5 What this means for builders, and what they get wrong

If your agent can trigger an outbound action that changes who controls an asset, that action needs a principal binding, and the binding has to be deterministic code outside the model. Inventory every action your fleet can take, sort them by blast radius, and put the credential-recovery and access-granting actions at the top. For each one, name the principal and name the gate. If you cannot point to the line of code that proves ownership before the action fires, you do not have a gate. You have a hope.
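One way to run that inventory is as a small audit over a list of actions. The sketch below is a hedged illustration, not a prescribed tool: `AgentAction`, `audit`, the 1 to 5 blast-radius scale, the action names and the gate path are all invented for the example. It lists every action that lacks a named principal or a named gate, with the most dangerous first.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class AgentAction:
    name: str
    blast_radius: int         # illustrative scale: 5 = changes who controls an asset
    principal: Optional[str]  # on whose authority the action runs
    gate: Optional[str]       # pointer to the deterministic check, if any


def audit(actions: List[AgentAction]) -> List[AgentAction]:
    """Return actions missing a named principal or gate, worst first."""
    ungated = [a for a in actions if not a.principal or not a.gate]
    return sorted(ungated, key=lambda a: a.blast_radius, reverse=True)


# Hypothetical inventory; names and the gate path are made up.
INVENTORY = [
    AgentAction("answer_faq", 1, None, None),
    AgentAction("send_reset_link", 5, "account owner", None),
    AgentAction("grant_page_admin", 5, "page owner", "gates/ownership.py:verify_owner"),
    AgentAction("issue_refund", 3, "paying customer", None),
]

for action in audit(INVENTORY):
    print(f"{action.blast_radius} {action.name}: no gate or principal named")
```

With this made-up inventory, `send_reset_link` comes out on top: it has a principal on paper but no gate anyone can point to, which is the situation the paragraph above calls a hope.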

The thing builders get wrong is believing the model's good behaviour is a control. It is not. The model is the thing being defended, not the thing doing the defending. A polished agent with no pre-send verifier is not safer than a crude one. It is more convincing while it hands out the keys. Meta's bot was probably very pleasant about it. That is exactly the problem.

Methodology note. We picked this because it is the cleanest public example of an agent trusting the wrong principal at the worst possible moment. Workloft runs an eight-agent fleet selling into regulated UK buyers, where every consequential action needs a binding we can point to in code. Meta's incident is the failure mode we design against daily: the conversational request mistaken for the entitlement. We are not claiming we built it or fixed it. We are reading a named, public incident through the one lens that matters, which is what the agent was allowed to dispatch and what verified it first.