AI Agents Can Be Weaponized
In 2024, researchers at the University of Illinois Urbana-Champaign demonstrated that a GPT-4-based agent could autonomously exploit 87% of a test set of real-world one-day vulnerabilities when given nothing more than the CVE description. Other models — including GPT-3.5 and open-source alternatives — scored 0%. The research showed that frontier AI models don't just assist with security — they can autonomously attack systems with a success rate that would make most penetration testing firms uncomfortable.
This isn't a theoretical concern for healthcare. When AI agents are deployed in clinical workflows — processing patient data, connecting to EHR APIs, managing billing pipelines — every one of those agents is a potential attack surface. And unlike traditional software vulnerabilities, agent-based attacks can be dynamic, context-aware, and difficult to detect with conventional security tooling.
The healthcare industry is walking into this reality largely unprepared.
Where PHI Leaks in AI-Powered Workflows
Most healthcare teams think about HIPAA compliance at the infrastructure layer — encrypted databases, BAA-covered hosting, access controls on the EHR. But when AI agents enter the workflow, PHI can leak through channels that traditional compliance frameworks never anticipated:
Prompt logs and LLM context windows. Every time an agent processes a patient record, that data enters the model's context window. If the LLM provider retains prompt data for training or debugging — and many do by default — patient data is now sitting in a third-party system without a BAA. This is a HIPAA violation, and most teams don't even know it's happening.
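As one illustration, here is a minimal pre-flight scrub that strips identifier-shaped strings before a prompt ever reaches a third-party context window. The patterns and the `scrub_phi` helper are hypothetical and illustrative only; real de-identification has to cover all eighteen Safe Harbor identifier categories (names included, which regexes alone won't catch) or rely on a dedicated PHI-detection service.

```python
import re

# Illustrative patterns only: real de-identification must cover all 18
# Safe Harbor identifier categories and typically uses a dedicated
# PHI-detection service, not a handful of regexes.
PHI_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "MRN": re.compile(r"\bMRN[:\s]*\d{6,10}\b", re.IGNORECASE),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def scrub_phi(text: str) -> str:
    """Replace identifier-shaped strings before the text reaches a
    third-party LLM context window."""
    for label, pattern in PHI_PATTERNS.items():
        text = pattern.sub(f"[{label} REDACTED]", text)
    return text

prompt = "Summarize today's visit for John Doe, MRN 48291034, SSN 123-45-6789."
print(scrub_phi(prompt))
# -> "Summarize today's visit for John Doe, [MRN REDACTED], SSN [SSN REDACTED]."
# Note that the patient's name slips through, which is exactly why regexes
# alone are not a de-identification strategy.
```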
Agent-to-agent communication in multi-step workflows. When a clinical intake agent passes data to an eligibility verification agent, which then passes results to a billing agent, PHI is flowing through multiple handoff points. If any of those handoffs lack encryption, access controls, or audit logging, the data flow is non-compliant.
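One way to make those handoff points auditable is to wrap every transfer in a small envelope that records who sent what to whom, and why. The sketch below is hypothetical (the agent names, fields, and logger setup are made up); it logs a content hash rather than the payload itself, so the audit trail doesn't become yet another copy of the PHI.

```python
import hashlib
import json
import logging
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical handoff envelope: every hop carries identity, purpose, and
# an audit record, not raw, unattributed PHI.
audit_log = logging.getLogger("phi.handoffs")
logging.basicConfig(level=logging.INFO)

@dataclass
class HandoffEnvelope:
    sender: str       # e.g. "intake-agent"
    receiver: str     # e.g. "eligibility-agent"
    purpose: str      # task-scoped justification for the disclosure
    payload: dict     # the minimum-necessary fields, nothing more

def hand_off(envelope: HandoffEnvelope) -> None:
    # Log a content hash rather than the PHI itself, so the audit trail
    # proves what moved without storing the record again.
    digest = hashlib.sha256(
        json.dumps(envelope.payload, sort_keys=True).encode()
    ).hexdigest()
    audit_log.info(
        "handoff %s -> %s purpose=%s sha256=%s at=%s",
        envelope.sender, envelope.receiver, envelope.purpose,
        digest, datetime.now(timezone.utc).isoformat(),
    )
    # ...transmit over an encrypted channel (TLS/mTLS) here...

hand_off(HandoffEnvelope(
    sender="intake-agent",
    receiver="eligibility-agent",
    purpose="insurance eligibility check",
    payload={"member_id": "A1B2C3", "dob": "1980-04-02"},
))
```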
Vector stores and retrieval pipelines. RAG systems that pull from clinical guidelines, formularies, or patient history can inadvertently embed PHI in vector embeddings. Those embeddings may not be covered under BAAs, and the original text can potentially be reconstructed through embedding inversion techniques.
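A simple mitigation is to gate the ingestion side: scan each chunk before it is embedded and quarantine anything that looks like patient-level data. The sketch below uses toy patterns and a placeholder embedding function purely for illustration.

```python
import re

# Minimal ingestion gate. The patterns are illustrative; anything flagged
# is kept out of the embedding pipeline rather than scrubbed after the fact.
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
MRN = re.compile(r"\bMRN[:\s]*\d{6,10}\b", re.IGNORECASE)

def looks_like_phi(chunk: str) -> bool:
    return bool(SSN.search(chunk) or MRN.search(chunk))

def index_chunks(chunks: list[str], embed, store: list) -> None:
    """Embed and store only the chunks that pass the PHI gate."""
    for chunk in chunks:
        if looks_like_phi(chunk):
            # Quarantine for human review instead of embedding it.
            print(f"REJECTED (possible PHI): {chunk[:40]}...")
            continue
        store.append(embed(chunk))

# Toy stand-ins for a real embedding model and vector store.
index_chunks(
    ["Metformin dosing guidance for CKD stage 3.",
     "Discharge note for MRN 48291034, follow up in 2 weeks."],
    embed=lambda text: [float(len(text))],   # placeholder "embedding"
    store=[],
)
```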
Output caching and response storage. Agents that cache responses for performance optimization may be storing PHI in memory or disk caches that aren't encrypted, aren't access-controlled, and aren't included in audit trail systems.
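A cache doesn't have to hold plaintext. The sketch below, which assumes the `cryptography` package is available, encrypts every cached value with Fernet before it is stored; key management, rotation, and expiry are deliberately left out of scope.

```python
from cryptography.fernet import Fernet  # assumes the `cryptography` package

# Sketch of a response cache that never holds plaintext PHI at rest.
class EncryptedCache:
    def __init__(self, key: bytes):
        self._fernet = Fernet(key)
        self._store: dict[str, bytes] = {}

    def put(self, cache_key: str, response: str) -> None:
        # Ciphertext only; the plaintext response is never stored.
        self._store[cache_key] = self._fernet.encrypt(response.encode())

    def get(self, cache_key: str) -> str | None:
        token = self._store.get(cache_key)
        return self._fernet.decrypt(token).decode() if token else None

cache = EncryptedCache(Fernet.generate_key())
cache.put("eligibility:A1B2C3", "Member eligible, copay $25")
print(cache.get("eligibility:A1B2C3"))
```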
Error logs and debugging output. When an agent fails — and agents fail regularly — the error context often includes the input data that caused the failure. If that input contained PHI, the error log is now a compliance liability.
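One low-effort control is a logging filter that scrubs identifier-shaped strings from every record before it is written. The patterns below are illustrative; they won't catch free-text PHI, but they keep the most common structured identifiers out of error logs.

```python
import logging
import re

# A logging.Filter that scrubs obvious identifiers before any record is
# written. Production systems need broader detection and should also avoid
# putting raw inputs into exception messages in the first place.
class PHIRedactingFilter(logging.Filter):
    PATTERNS = [
        re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),        # SSN-shaped
        re.compile(r"\bMRN[:\s]*\d{6,10}\b", re.I),  # MRN-shaped
    ]

    def filter(self, record: logging.LogRecord) -> bool:
        msg = record.getMessage()
        for pattern in self.PATTERNS:
            msg = pattern.sub("[REDACTED]", msg)
        record.msg, record.args = msg, None
        return True

logger = logging.getLogger("agent.errors")
logging.basicConfig(level=logging.ERROR)
logger.addFilter(PHIRedactingFilter())
logger.error("Agent failed parsing record for MRN 48291034: bad FHIR bundle")
# Logged as: "Agent failed parsing record for [REDACTED]: bad FHIR bundle"
```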
The Numbers Tell the Story
The scale of healthcare data vulnerability is staggering:
- 605 healthcare breaches were reported to HHS in 2025, affecting 44.3 million Americans
- Breach volume more than doubled — total breach incidents in 2025 surpassed 2024's by 112%, according to Fortified Health Security
- 97% of organizations with AI-related security incidents lacked proper AI access controls (IBM, 2025)
- 50% of healthcare organizations lack confidence in their ability to detect and manage data breaches
- 42% of healthcare organizations have no policies for preventing unauthorized data access
- 279 days — the average time to detect and contain a healthcare data breach, five weeks longer than in any other industry
And these numbers predate the widespread adoption of AI agents in clinical workflows. As agents become more autonomous — sending messages, modifying records, triggering billing actions — the attack surface expands exponentially.
The OWASP LLM Top 10 Meets Healthcare
OWASP released its Top 10 for LLM Applications (2025 edition) specifically to address the security risks that emerge when large language models are deployed in production. The list includes:
- Prompt Injection — the most fundamental LLM vulnerability, where crafted inputs cause models to ignore original instructions
- Sensitive Information Disclosure — surged to the second most critical risk, with PHI exposure being the healthcare-specific nightmare scenario
- Supply Chain Vulnerabilities — compromised dependencies in AI pipelines, like the PoisonGPT attack that bypassed Hugging Face safety features
- Excessive Agency — agents with permissions beyond their task scope, the risk that grows with every tool an agent can access
- System Prompt Leakage — new for 2025, focusing on exposure of internal instructions that may contain credentials or operational logic
- Vector and Embedding Weaknesses — also new for 2025, targeting the RAG pipelines that healthcare agents increasingly rely on
For healthcare specifically, OWASP's Agentic Top 10 (released December 2025) goes further. It addresses threats like multi-agent collusion, identity spoofing between agents, and the cascade effects when one compromised agent in a swarm passes tainted context to downstream agents. Microsoft, NVIDIA, and AWS have already begun referencing these agentic threat models in their own security frameworks.
What Compliance Actually Requires for AI Agents
HIPAA doesn't have a section titled "AI Agent Requirements." But the existing rules — properly interpreted — impose clear obligations on any system processing PHI, including AI agents:
The Security Rule requires:
- Access controls on every system component that touches PHI — including agent processes
- Audit controls that log every instance of PHI access — including when an agent reads, transforms, or transmits patient data (see the sketch after this list)
- Integrity controls that protect PHI from improper alteration — including when an agent generates outputs based on patient records
- Transmission security for PHI in transit — including agent-to-agent communication and API calls to LLM providers
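A minimal sketch of the audit-control requirement above, with made-up field names: every agent read of PHI emits a structured, append-only event naming the agent, the patient, the fields touched, and the reason for access.

```python
import logging
from datetime import datetime, timezone

# Illustrative audit event for agent PHI access; the schema is invented,
# not a HIPAA-mandated format.
audit = logging.getLogger("hipaa.audit")
logging.basicConfig(level=logging.INFO)

def audited_phi_read(agent_id: str, patient_id: str,
                     fields: list[str], reason: str) -> None:
    audit.info(
        "phi_access agent=%s patient=%s fields=%s reason=%s at=%s",
        agent_id, patient_id, ",".join(fields), reason,
        datetime.now(timezone.utc).isoformat(),
    )
    # ...the actual EHR/FHIR read happens here, returning only `fields`...

audited_phi_read(
    agent_id="billing-agent-07",
    patient_id="pt-123",
    fields=["coverage", "encounter.date"],
    reason="claim generation",
)
```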
The Privacy Rule requires:
- Minimum necessary standards — agents should access only the PHI required for their specific task, not entire patient records (see the sketch after this list)
- Consent and authorization tracking — agents automating patient communication must respect consent preferences
- Accounting of disclosures — if an agent shares PHI with a third-party service, that disclosure must be logged
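The minimum-necessary standard translates naturally into a field-level allowlist per agent task: strip everything the task doesn't need before the record ever reaches the agent. The task names and fields below are invented for illustration.

```python
# Minimum-necessary filtering, sketched. Each agent task declares the
# fields it is allowed to see; everything else is dropped up front.
ALLOWED_FIELDS = {
    "eligibility_check": {"member_id", "plan_id", "dob"},
    "appointment_reminder": {"first_name", "phone", "appointment_time"},
}

def minimum_necessary(record: dict, task: str) -> dict:
    allowed = ALLOWED_FIELDS.get(task, set())
    return {k: v for k, v in record.items() if k in allowed}

full_record = {
    "member_id": "A1B2C3", "plan_id": "GOLD-2", "dob": "1980-04-02",
    "diagnosis_codes": ["E11.9"], "ssn": "123-45-6789",
}
print(minimum_necessary(full_record, "eligibility_check"))
# -> {'member_id': 'A1B2C3', 'plan_id': 'GOLD-2', 'dob': '1980-04-02'}
```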
The Breach Notification Rule requires:
- Detection capabilities for unauthorized PHI access — including detecting when an agent behaves anomalously (see the sketch after this list)
- 60-day notification timelines — which means agent monitoring needs to be real-time, not retrospective
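Real-time detection can start as simply as a per-agent rate check: flag any agent that touches far more distinct patients in an hour than its task should require. The threshold and window below are arbitrary placeholders; production monitoring belongs in whatever SIEM or behavioral-analytics tooling the organization already runs.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta, timezone

# Crude real-time anomaly check, for illustration only.
WINDOW = timedelta(hours=1)
MAX_PATIENTS_PER_HOUR = 50   # illustrative threshold per agent

access_log: dict[str, deque] = defaultdict(deque)

def record_access(agent_id: str, patient_id: str,
                  now: datetime | None = None) -> bool:
    """Return True if this access looks anomalous and should page a human."""
    now = now or datetime.now(timezone.utc)
    log = access_log[agent_id]
    log.append((now, patient_id))
    # Drop entries that have aged out of the window.
    while log and now - log[0][0] > WINDOW:
        log.popleft()
    distinct_patients = {pid for _, pid in log}
    return len(distinct_patients) > MAX_PATIENTS_PER_HOUR

if record_access("intake-agent-02", "pt-987"):
    print("ALERT: unusual access volume for intake-agent-02")
```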
The Path Forward
Healthcare organizations deploying AI agents need a security architecture that goes beyond traditional infrastructure compliance:
Layer 1: Code and infrastructure scanning. Run DAST tools like Invicti against your deployed endpoints and SAST tools against your codebase. Scan for the traditional OWASP Top 10. This is table stakes.
Layer 2: Agent output validation. Before any agent response enters a dataset or downstream system, validate it against deterministic checks — clinical code set verification, FHIR schema conformance, business rule compliance. This catches hallucinated data before it becomes a patient safety issue.
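As a sketch of what a deterministic check can look like, the validator below verifies that a generated claim carries a minimal set of FHIR Claim fields and that its diagnosis codes are at least plausibly formatted ICD-10. The regex and field list are simplifications; real validation should use the published code sets and a full FHIR validator.

```python
import re

# Simplified ICD-10 shape check and required-field check; stand-ins for
# real code-set lookups and FHIR schema validation.
ICD10 = re.compile(r"^[A-TV-Z][0-9][0-9AB](\.[0-9A-Z]{1,4})?$")
REQUIRED_CLAIM_FIELDS = {"resourceType", "status", "patient", "diagnosis"}

def validate_claim(claim: dict) -> list[str]:
    errors = []
    missing = REQUIRED_CLAIM_FIELDS - claim.keys()
    if missing:
        errors.append(f"missing fields: {sorted(missing)}")
    if claim.get("resourceType") != "Claim":
        errors.append("resourceType must be 'Claim'")
    for dx in claim.get("diagnosis", []):
        code = dx.get("code", "")
        if not ICD10.match(code):
            errors.append(f"not a valid ICD-10 format: {code!r}")
    return errors

agent_output = {
    "resourceType": "Claim", "status": "active",
    "patient": "Patient/pt-123",
    "diagnosis": [{"code": "E11.9"}, {"code": "ZZZ-99"}],  # second is malformed
}
print(validate_claim(agent_output))  # flags the hallucinated code
```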
Layer 3: Inter-agent trust boundaries. Every agent-to-agent handoff should include schema validation, provenance verification, and scope checking. No agent should blindly trust input from another agent.
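A trust boundary at the receiving end might look like the sketch below. The agent names, scopes, and required keys are hypothetical, but the shape is the point: verify sender identity, scope, and payload schema before anything downstream acts on the message.

```python
from dataclasses import dataclass

# Illustrative trust-boundary check at an agent handoff.
TRUSTED_SENDERS = {"intake-agent", "eligibility-agent"}
SCOPES = {"eligibility-agent": {"eligibility:read"}}

@dataclass
class AgentMessage:
    sender: str
    scope: str
    payload: dict

def accept(message: AgentMessage, required_keys: set[str]) -> dict:
    if message.sender not in TRUSTED_SENDERS:
        raise PermissionError(f"unknown sender: {message.sender}")
    if message.scope not in SCOPES.get(message.sender, set()):
        raise PermissionError(f"{message.sender} lacks scope {message.scope!r}")
    missing = required_keys - message.payload.keys()
    if missing:
        raise ValueError(f"payload failed schema check, missing {sorted(missing)}")
    return message.payload

payload = accept(
    AgentMessage(sender="eligibility-agent", scope="eligibility:read",
                 payload={"member_id": "A1B2C3", "eligible": True}),
    required_keys={"member_id", "eligible"},
)
print(payload)
```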
Layer 4: Retrieval integrity. Pin retrieval sources to known-good authorities. Hash-verify documents before indexing. Version your vector stores. Never let agent outputs write back into the retrieval corpus without human review.
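Hash verification before indexing can be as small as a manifest of approved digests, as in this sketch; the document names and contents are placeholders, and the manifest itself would live under the same change control as the rest of the pipeline.

```python
import hashlib

# Hash-pinned ingestion: only documents whose SHA-256 digest matches a
# reviewed manifest get indexed into the retrieval corpus.
APPROVED_DIGESTS = {
    # "document name": digest recorded at review time
    "diabetes-guideline-v3.txt":
        hashlib.sha256(b"Approved guideline text v3").hexdigest(),
}

def verify_before_indexing(name: str, content: bytes) -> bool:
    digest = hashlib.sha256(content).hexdigest()
    if APPROVED_DIGESTS.get(name) != digest:
        print(f"SKIPPED {name}: digest mismatch or unknown document")
        return False
    return True

verify_before_indexing("diabetes-guideline-v3.txt", b"Approved guideline text v3")  # True
verify_before_indexing("diabetes-guideline-v3.txt", b"Tampered text")               # False
```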
The teams that implement these layers now — before the first AI-related healthcare breach makes national headlines — will be the ones that earn and keep patient trust.
The teams that don't will be the case studies the rest of us learn from.