Using AI to find authorization bugs — and to prove the ones that aren't real

Draft flagship post. Safe to publish now (no undisclosed vulnerabilities). The production case study referenced at the end is withheld pending coordinated disclosure.
In 2026, bug bounty programs started closing their doors. Nextcloud suspended paid rewards, citing a flood of AI-generated, low-quality reports. Mattermost ended its program. The Internet Bug Bounty cut payouts by roughly 80%. The common thread isn't that AI can't find bugs — it's that most AI-assisted "findings" are plausible but wrong, and triage teams are drowning in them.
That reframes the problem. The scarce skill in 2026 isn't generating candidate vulnerabilities — a language model will hand you fifty before lunch. It's refuting the forty-nine that don't hold. The differentiator is a method whose primary output is correct negatives.
Here's the method I use for source-available targets, and a worked example where the honest result was "there's no bug here."
The method: fan out to find, converge to refute
Two stages, two different cost tiers:
Fan-out (cheap models). Split the target's authorization surface into subsystems and read each in parallel. Each reader's only job is to surface candidate broken invariants — places where an object is loaded by ID without an owner check, where a protected action might skip a re-auth gate, where two code paths authorize the same thing differently. Optimize for recall. Expect mostly false positives.
Adversarial verification (an expensive, high-reasoning model). Take each candidate and try to kill it. Default to REFUTED. A candidate survives only if you can cite the specific source lines proving all three of the following: the guard is absent, the dangerous path is reachable, and nothing upstream already blocks it. Frame every survivor as a broken invariant (a one-sentence statement of the rule the system must never violate) and classify it as core or config-dependent.
The output that matters most is the pile of refutations, each with a reason. That pile is what separates a report a triager trusts from a report that gets a program suspended.
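To make the "default to REFUTED" rule concrete, here is a minimal sketch of how a candidate and its verdict could be recorded. The names (Candidate, Verdict, verify) and the citation strings are hypothetical illustrations, not part of any tool described in this post; the only logic encoded is the three-part evidence requirement stated above.

```python
from dataclasses import dataclass, field
from enum import Enum


class Verdict(Enum):
    REFUTED = "refuted"
    CONFIRMED = "confirmed"


@dataclass
class Candidate:
    """Hypothetical record for one fan-out candidate."""
    invariant: str   # one-sentence rule the system must never violate
    subsystem: str
    # Source citations ("file:line") collected by the verifier.
    # An empty list means the claim has not been proven from source.
    guard_absent: list[str] = field(default_factory=list)
    path_reachable: list[str] = field(default_factory=list)
    no_upstream_block: list[str] = field(default_factory=list)
    config_dependent: bool = False


def verify(c: Candidate) -> tuple[Verdict, str]:
    """Default to REFUTED; confirm only when all three claims carry citations."""
    required = {
        "the guard is absent": c.guard_absent,
        "the dangerous path is reachable": c.path_reachable,
        "nothing upstream blocks it": c.no_upstream_block,
    }
    missing = [claim for claim, cites in required.items() if not cites]
    if missing:
        return Verdict.REFUTED, "no source citation that " + "; ".join(missing)
    scope = "config-dependent" if c.config_dependent else "core"
    return Verdict.CONFIRMED, f"broken invariant ({scope}): {c.invariant}"


# Usage: a candidate with only one of the three claims proven is refuted,
# and the recorded reason says which claims are still unproven.
half_proven = Candidate(
    invariant="An object may only be read by its owner.",
    subsystem="example-subsystem",
    guard_absent=["example/handler.py:42"],  # hypothetical citation
)
verdict, reason = verify(half_proven)
print(verdict.value, "-", reason)
```

The design choice worth noting is that a refutation always carries a reason string, so the pile of negatives stays auditable rather than being silently discarded.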
Worked example: Ory Kratos settings & OIDC (verdict: defended)
Ory Kratos is an open-source identity server. Its settings flow concentrates several privileged actions in one place — password change, email/recovery-address change, OIDC provider link/unlink — which is exactly the kind of surface where one inconsistent check becomes account takeover. A good place to look.
The fan-out surfaced a genuinely tempting candidate. In the OIDC strategy, the continuity container that survives the identity-provider round-trip is not identity-bound: at both pause and resume, the code omits the WithIdentity option, so the owner check short-circuits on a nil identity. It is the one pause/continue pair in the codebase that skips this binding. If you were pattern-matching for "missing check," you would write this up as a high-severity flow-hijack and hit submit.
That would be slop. Here's why it's defended.
The missing binding is only exploitable if the resume path derives its write target from the unbound container. It doesn't. On the settings callback, the target identity comes from the live session cookie, the privilege gate is re-checked at the callback (not just at initiation), and the container's traits are never applied. On the login/registration callback, the target identity is derived from the cryptographically signed OIDC subject in the validated token, again not from the container. So the worst an attacker can do by replaying someone else's paused flow is link a provider to their own account, or trip a collision check. The absent WithIdentity is defense-in-depth that isn't load-bearing anywhere reachable.
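The argument can be modeled in a few lines. This is a toy sketch in Python, not Kratos code (Kratos is written in Go), and every name in it (Container, owner_check, settings_callback) is hypothetical. It assumes only what the paragraph above states: the owner check passes when no identity is supplied, the privilege gate is re-checked on resume, and the write target comes from the session.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Container:
    """Toy stand-in for a paused-flow container; not the real Kratos type."""
    flow_id: str
    bound_identity: Optional[str] = None  # None models pausing without a binding
    traits: Optional[dict] = None


def owner_check(container: Container, expected: Optional[str]) -> bool:
    # With no expected identity supplied there is nothing to compare,
    # so the check short-circuits and passes.
    if expected is None:
        return True
    return container.bound_identity == expected


def settings_callback(container: Container, session_identity: str,
                      session_is_privileged: bool, provider: str) -> dict:
    if not owner_check(container, expected=None):  # unbound: always passes
        raise PermissionError("container belongs to another identity")
    if not session_is_privileged:                  # gate re-checked at resume
        raise PermissionError("re-authentication required")
    # The write target is the live session's identity.
    # container.traits is deliberately never read.
    return {"identity": session_identity, "linked_provider": provider}


# Replay: the attacker resumes a flow the victim paused.
victim_flow = Container(flow_id="f1", traits={"email": "victim@example.com"})
result = settings_callback(victim_flow, session_identity="attacker",
                           session_is_privileged=True, provider="example-idp")
print(result)
```

In this model the replayed container passes the owner check, yet the link still lands on the attacker's own identity, which is why the missing binding is not load-bearing here.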
I checked the neighbors too. The privileged-session re-auth gate is enforced uniformly across every settings method. The "silent account merge" branch that would let an unverified email link onto an existing account is dead code in the open-source build — it's gated behind a policy hook that ships as nil. Each of these looked like a lead; each closed cleanly against the source.
Verdict: the surface is well-built. Writing it up would have burned credibility on a report that rates "informational" at best. The value delivered was the confidence to not file, plus a precise map of why the design holds.
The class that does pay off
The same method, pointed at a different target, surfaced a real bug: an authorization guard that the main code path enforces but an alternate ingress fails to mirror. A system revokes a user's access on the primary path, but a secondary, credential-authenticated entry point re-implements the authorization check and forgets one condition, so the "revoked" user still gets in through the side door. This class is duplicate-resistant (it requires understanding the specific system's trust model, not spraying requests), scanner-invisible (the semantics matter, not the syntax), and high-impact. Details after coordinated disclosure completes.
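As a generic illustration of this bug class (not the withheld case, whose details are not reproduced here), the sketch below shows two hypothetical authorization functions that should agree and a small differential check that enumerates user states to find where they diverge. The User fields and function names are assumptions made up for the example.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class User:
    active: bool
    revoked: bool
    has_role: bool


def authorize_primary(u: User) -> bool:
    # Main code path: enforces every condition.
    return u.active and u.has_role and not u.revoked


def authorize_token_ingress(u: User) -> bool:
    # Secondary entry point: a re-implementation that forgets revocation.
    return u.active and u.has_role


def divergences() -> list[User]:
    """Return every user state that the two paths decide differently."""
    found = []
    for active, revoked, has_role in product([True, False], repeat=3):
        u = User(active, revoked, has_role)
        if authorize_primary(u) != authorize_token_ingress(u):
            found.append(u)
    return found


for u in divergences():
    print("paths disagree for:", u)
```

The broken invariant in this toy is "every ingress must reach the same authorization decision for the same user state." In a real system the state space is too large to enumerate, so the comparison has to be done by reading both implementations against each other.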
Why this is the durable skill in 2026
The gatekeepers are raising walls specifically against volume. The researchers who thrive on the other side of those walls will be the ones whose AI-assisted work is rigorous: findings framed as broken invariants, grounded in source, and — above all — filtered by a verification step that throws most of them away. Use AI to read more of the code than you ever could by hand. Then use it, adversarially, to prove yourself wrong before a triager has to.
Signal, not volume. That's the whole game now.


