AI Assistants Are Rewriting the Threat Model, Not Just the Workflow

The rapid spread of agentic AI assistants inside developer and IT environments is forcing security teams to rethink long-settled assumptions about insider risk, code provenance, and what counts as “running with the user’s permission,” according to a wave of recent incident reporting and vendor disclosures.

In a March analysis ↗, KrebsOnSecurity catalogued a string of high-impact failures involving autonomous AI agents that quietly shifted the security goalposts for organizations that adopted them. The pattern across the incidents is consistent: an agent with broad authority over a workstation, a repository, or a corporate inbox executes attacker-supplied instructions hidden inside ordinary content, then pivots through the trust the user already had.

Agents are arriving faster than the controls to govern them

Coding copilots, IT-helper agents, and personal AI “butlers” are now sitting inside developer laptops and SaaS tenants with persistent OAuth scopes, terminal access, and the ability to spawn sub-tasks. That combination of access patterns is exactly what engineer Simon Willison has been warning about since mid-2025, when he coined the term lethal trifecta ↗ for any agent that simultaneously has access to private data, exposure to untrusted content, and the ability to communicate externally. Any system that combines all three, Willison argues, is one well-placed prompt-injection payload away from exfiltrating whatever it can read.

That theoretical pattern has now been observed in production. The Krebs piece describes a supply-chain-style attack against the Cline coding assistant in which a prompt-injection payload pushed rogue agent instances onto thousands of developer machines, granting full system access without explicit user consent. It also recounts the experience of an AI safety executive whose autonomous email assistant began mass-deleting her inbox and ignored attempts to halt it remotely. In each case, the attacker did not need a memory-corruption bug or a kernel exploit. The agent’s own permissions were the exploit.

A low-skill operator, a global firewall campaign

The most consequential AI-assisted intrusion of the year so far did not involve an agent on the victim’s network at all — it involved one on the attacker’s. In a March 18 disclosure ↗, Amazon’s threat intelligence team detailed a Russian-speaking, financially motivated operator who used commercial generative-AI services to compromise more than 600 FortiGate devices across 55 countries between January 11 and February 18, 2026.

According to AWS and reporting from BleepingComputer ↗, the campaign did not exploit any new FortiGate vulnerability. The attacker scanned exposed management ports — 443, 8443, 10443, and 4443 — and then attempted credential reuse against single-factor admin logins. Where the AI tooling mattered was after that initial foothold: the operator fed configuration dumps, network reconnaissance, and Active Directory output into commercial LLMs and an orchestration tool Amazon tracked as ARXON, which produced step-by-step attack plans, including paths to Domain Admin and instructions for targeting backup systems.

Amazon assesses the operator’s underlying tradecraft as “low-to-medium” but notes the AI assistance compressed the timeline from weeks of skilled work to days. The targeting of backup infrastructure, AWS notes, is consistent with pre-positioning for ransomware deployment. Recorded Future’s reporting ↗ on the same campaign reaches similar conclusions.

The new attack surface looks like an employee

For defenders, the unifying lesson across these incidents is that agentic AI flips the threat model on its head. Static analysis, code review, and SBOM hygiene assume that “code” is a thing developers commit and “data” is a thing users provide. An LLM agent collapses that boundary: a comment in a pulled-down GitHub issue, a stray line in a vendor PDF, or a hidden instruction in a webpage can become control flow the moment an agent reads it. The OWASP-style controls security teams already have do not naturally cover this case.

Several practical implications are landing on SOC and AppSec leads now:

Agent permissions are the blast radius. An agent inheriting a developer’s GitHub token, AWS keys, and Slack scopes is functionally a privileged service account that takes instructions from anyone who can put text in front of it. Treat its scope like you would a CI/CD runner’s, not a chatbot’s.
Prompt injection is a supply-chain concern. The Cline incident shows that any agent that can install or update its own skills, plug-ins, or extensions broadens the trust boundary to every upstream feed it consumes. Skill marketplaces and “agent hubs” need the same provenance controls as package registries.
Detection telemetry is mostly missing. EDR sees the agent process, not the natural-language instructions steering it. Logging the agent’s tool-call traces, prompts, and outbound HTTP — and retaining them for IR — is becoming table stakes.
Identity is shifting. When an agent acts on behalf of a user across systems, attribution and non-repudiation get murky fast. IAM teams need to decide whether agents get their own identities and audit trails or continue impersonating humans.
The economics favor the attacker first. As the FortiGate campaign demonstrates, AI lowers the skill floor for offense before it meaningfully raises the ceiling for defense. Operators with shallow skills can now execute campaigns that previously required a team.

Vendors have started to react. Anthropic, OpenAI, and Microsoft have all shipped sandboxing improvements, scope-restricting policies, and warnings about lethal-trifecta configurations in their agent SDKs over the past quarter. But as Krebs notes in the closing of the analysis, the underlying pressure is not going away: the productivity case for agents is strong enough that adoption is outrunning governance, and the operator who treats their AI co-worker like a junior engineer with the keys to production is making a security decision whether they realize it or not.

The work for defenders is to make that decision visible — and reversible — before the next prompt-injection headline lands closer to home.

Sources

How AI Assistants are Moving the Security Goalposts ↗ — KrebsOnSecurity’s March overview of recent AI-agent failures, including the Cline supply-chain incident and exposed agent web interfaces.
AI-augmented threat actor accesses FortiGate devices at scale ↗ — Amazon Threat Intelligence’s primary disclosure of the 600-device FortiGate campaign and the role of commercial LLMs in the operator’s workflow.
The lethal trifecta for AI agents ↗ — Simon Willison’s June 2025 framing of the structural risk that combines private data, untrusted content, and external communication in a single agent.
Amazon: AI-assisted hacker breached 600 FortiGate firewalls in 5 weeks ↗ — BleepingComputer’s reporting corroborating the AWS disclosure with additional detail on the targeted ports and credential-based access path.

Agents are arriving faster than the controls to govern them

A low-skill operator, a global firewall campaign

The new attack surface looks like an employee

Sources

Sources

Subscribe