HIPAA and AI: A Practical Engineering Guide
Deploying AI in healthcare means navigating HIPAA at every layer of the stack. This is the engineering guide we wish existed when we started building healthcare AI systems — covering BAAs, PHI data flows, model training constraints, and audit logging.

HIPAA was written in 1996, before cloud computing, modern software architecture, or anything resembling today's AI existed. Yet it governs every AI system that touches protected health information. The challenge is not that HIPAA prohibits AI — it does not. The challenge is that HIPAA's technical safeguard requirements were written for a world of on-premises databases, and applying them to AI pipelines that span cloud services, model APIs, and edge inference requires careful architectural translation.
The PHI Data Flow Problem
The first step in HIPAA-compliant AI engineering is mapping every path that PHI travels through your system. This includes obvious paths (patient records in your database) and non-obvious paths (LLM prompts containing patient names, model training data derived from clinical notes, log files that capture request payloads containing PHI, error messages that include patient identifiers).
Most HIPAA violations in AI systems happen in the non-obvious paths. A debugging log that captures the full LLM prompt — including the patient history that was injected as context — is a PHI exposure if that log is stored without encryption or transmitted without access controls.
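One pragmatic mitigation is to scrub PHI-shaped tokens at the logging layer itself, so a forgotten debug statement cannot leak a prompt payload. A minimal sketch using Python's `logging.Filter`; the regexes are illustrative only and are no substitute for a dedicated PHI detection tool:

```python
import logging
import re

# Illustrative patterns only -- a real deployment would use a dedicated
# PHI detector (e.g. Presidio) rather than hand-rolled regexes.
REDACTIONS = [
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
    (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"), "[EMAIL]"),
    (re.compile(r"\b\d{3}[ .-]\d{3}[ .-]\d{4}\b"), "[PHONE]"),
]

class PHIRedactionFilter(logging.Filter):
    """Scrub PHI-looking tokens from log records before they are emitted."""

    def filter(self, record: logging.LogRecord) -> bool:
        msg = record.getMessage()
        for pattern, placeholder in REDACTIONS:
            msg = pattern.sub(placeholder, msg)
        # Replace the formatted message so handlers never see the raw PHI.
        record.msg, record.args = msg, ()
        return True  # always emit the (scrubbed) record
```

Attach the filter to every handler that might receive request context, including the ones used by your error-tracking integration.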
PHI Flow Mapping for AI Systems
- Trace PHI from ingestion through processing to output. Document every service, API, database, and queue that PHI touches.
- Where does PHI go when something fails? Error logs, dead letter queues, retry stores, exception tracking services — all potential PHI exposure points.
- If you fine-tune models on clinical data, the training pipeline is a PHI processing system. The model weights themselves may constitute PHI if the training data can be extracted.
- Distributed traces, application logs, and metrics that include request context can capture PHI. Ensure your observability stack either excludes PHI or meets HIPAA safeguards.
- Every third-party service that PHI touches requires a BAA. Cloud provider, LLM API, logging service, error tracking — all of them.
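The last point (every third-party service that touches PHI needs a BAA) can be checked mechanically once you maintain a service inventory. A sketch over a hypothetical inventory; the service names and schema are ours, not a standard:

```python
# Hypothetical PHI-flow inventory: every service PHI touches must have an
# executed BAA. Maintain this as part of your architecture documentation
# and fail CI if a gap appears.
PHI_SERVICES = {
    "aws-s3":        {"touches_phi": True,  "baa_executed": True},
    "llm-api":       {"touches_phi": True,  "baa_executed": True},
    "error-tracker": {"touches_phi": True,  "baa_executed": False},  # gap!
    "marketing-crm": {"touches_phi": False, "baa_executed": False},  # fine: no PHI
}

def baa_gaps(inventory: dict) -> list[str]:
    """Return services that touch PHI without an executed BAA."""
    return [name for name, svc in inventory.items()
            if svc["touches_phi"] and not svc["baa_executed"]]
```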
LLM API Considerations
Using third-party LLM APIs (OpenAI, Anthropic, Google) with PHI requires specific configurations. Most providers offer HIPAA-eligible endpoints with BAAs, but these endpoints often have restrictions: no data retention for training, specific API versions, and sometimes higher pricing. Azure OpenAI Service and AWS Bedrock provide HIPAA-eligible LLM access within their respective cloud compliance frameworks.
The critical architectural decision is whether to use API-based inference or self-hosted models. API-based inference is simpler to operate but requires trusting the provider's PHI handling. Self-hosted models (running open-source models on your own HIPAA-compliant infrastructure) give you complete control over PHI but dramatically increase operational complexity.
| Approach | PHI Control | Operational Burden | BAA Required | Cost |
|---|---|---|---|---|
| OpenAI API (HIPAA-eligible) | Provider-managed | Low | Yes, enterprise plan | $$ |
| Azure OpenAI | Azure-managed | Low-medium | Yes, Azure BAA | $$ |
| AWS Bedrock | AWS-managed | Low-medium | Yes, AWS BAA | $$ |
| Self-hosted (open-source) | Full control | Very high | No (you are the host) | $$$+ |
Technical Safeguards Checklist
- Encryption at rest for all datastores containing PHI (AES-256 or equivalent)
- Encryption in transit for all PHI transmission (TLS 1.2+ minimum)
- Unique user identification for every human and service account accessing PHI
- Automatic session timeout for interactive PHI access
- Audit logging for every PHI access, modification, and deletion — with tamper-evident storage
- Emergency access procedures for break-glass PHI access scenarios
- PHI backup and recovery with encryption and access controls matching production
- Integrity controls to detect unauthorized PHI modification
Audit Logging for AI
HIPAA requires audit logs for PHI access. In an AI system, this means logging every time PHI is used as input to a model, every time model output contains PHI, and every time a human reviews AI-generated clinical content. The logs themselves must be protected — stored with encryption, access-controlled, and retained for six years.
The practical challenge is volume. An AI system processing thousands of clinical documents daily generates massive audit logs. Design your audit infrastructure for scale from day one — append-only log stores, efficient compression, and automated retention management.
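Tamper evidence can be approximated in application code by hash-chaining entries, so that altering any stored record invalidates every later link. A minimal sketch, which a production system would pair with write-once storage such as S3 Object Lock:

```python
import hashlib
import json

def append_audit_event(chain: list[dict], event: dict) -> list[dict]:
    """Append an audit event linked to the previous entry's hash.

    Because each record's hash covers the previous hash, modifying any
    stored entry breaks every subsequent link.
    """
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    record = {"event": event, "prev_hash": prev_hash}
    payload = json.dumps(record, sort_keys=True).encode()
    record["hash"] = hashlib.sha256(payload).hexdigest()
    chain.append(record)
    return chain

def verify_chain(chain: list[dict]) -> bool:
    """Recompute every link; returns False if any entry was altered."""
    prev_hash = "0" * 64
    for record in chain:
        payload = json.dumps(
            {"event": record["event"], "prev_hash": record["prev_hash"]},
            sort_keys=True).encode()
        expected = hashlib.sha256(payload).hexdigest()
        if record["prev_hash"] != prev_hash or record["hash"] != expected:
            return False
        prev_hash = record["hash"]
    return True
```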
“HIPAA compliance in AI is not a feature you add — it is an architectural constraint that shapes every decision from model selection to logging infrastructure. Retrofitting HIPAA compliance onto an existing AI system is an order of magnitude harder than building it in from the start.”
Business Associate Agreements: What They Actually Require
A Business Associate Agreement (BAA) is a contract required by HIPAA when a covered entity (hospital, insurer, healthcare provider) shares Protected Health Information with a vendor (your software, your cloud provider, your analytics tool). The BAA does not make HIPAA compliance happen — it documents that both parties understand their obligations and assigns liability for breaches.
What a BAA must contain under 45 CFR § 164.504(e): permitted uses and disclosures of PHI, a requirement that the business associate will not use PHI outside those permitted uses, safeguard requirements (administrative, physical, technical), breach notification obligations to the covered entity within 60 days of discovery, and a provision requiring the business associate to pass these obligations down to subcontractors who also handle PHI.
The subcontractor chain is where most engineering teams get caught out. If you process PHI and you use AWS for compute, you need a BAA with AWS. But you also need BAAs with your database provider, your logging service, your error monitoring tool, your customer support platform — any service that might process or store PHI. AWS, GCP, and Azure all offer BAAs. Most major SaaS tools (Datadog, Twilio, Segment, Intercom) also offer BAAs, but you must specifically request them and enable HIPAA-eligible service tiers.
A common engineering mistake: using a service that does not offer a BAA (e.g., a standard Slack workspace or a free-tier analytics tool) in a workflow that touches PHI. Using a non-BAA service for PHI — even incidentally — is a HIPAA violation. Audit your full data flow, including error messages and logs that might contain PHI snippets. For supply chain visibility practices that apply equally here, see our AI dependency audit guide.
AWS HIPAA-Eligible Services and GCP Healthcare API
AWS maintains a list of HIPAA-eligible services under its Business Associate Addendum (BAA). As of 2025, the list includes core services that most healthcare applications use: EC2, S3, RDS, Aurora, DynamoDB, Lambda, ECS, EKS, API Gateway, CloudWatch, CloudTrail, KMS, Secrets Manager, Cognito, and others. Notably absent from the eligible list: some newer services and third-party integrations from the AWS Marketplace. Always verify against the current AWS HIPAA eligible services page before adding a new service to a PHI workflow.
| Cloud provider | BAA process | Key eligible services | Notable gaps |
|---|---|---|---|
| AWS | Accept online via console (Healthcare compliance section) | EC2, S3, RDS, Lambda, EKS, KMS, CloudWatch, CloudTrail | Not all Marketplace products; verify each service |
| GCP | Request via sales/support for Healthcare API; separate BAA | Healthcare API (FHIR, DICOM, HL7v2), Cloud Storage, BigQuery, GKE, Cloud Run | Standard GCP services require separate BAA coverage verification |
| Azure | Included in Microsoft Online Services BAA (accept in portal) | Azure Blob, SQL, Kubernetes, Functions, Key Vault, Monitor | Some preview services excluded; check Azure compliance docs |
GCP Healthcare API provides native FHIR R4, DICOM, and HL7v2 store/query capabilities with audit logging built in. For AI applications that process clinical notes, imaging, or lab results, the Healthcare API is the most compliant path — it handles consent, de-identification, and FHIR resource validation natively. The trade-off is cost and lock-in: Healthcare API is significantly more expensive than storing the same data in Cloud Storage with custom code.
PHI Handling Patterns: Encryption, Access Logging, and Minimum Necessary
- Access controls: unique user identification, automatic logoff, encryption/decryption capability
- Audit controls: hardware/software activity logging in systems containing PHI — must be retained 6 years
- Integrity controls: mechanisms to ensure PHI is not improperly altered or destroyed
- Transmission security: encryption of PHI in transit (TLS 1.2+ minimum; TLS 1.3 recommended)
- Encryption at rest: addressable standard — industry practice is AES-256; justify any deviation in writing
Encryption at rest is technically "addressable" under HIPAA, meaning you can document a reason for not implementing it. In practice, no reasonable security program skips encryption at rest in 2025. Use AWS KMS or GCP KMS with customer-managed keys (CMKs) for PHI data stores. This gives you key rotation control and the ability to effectively delete data by destroying the key.
Access logging is non-negotiable. Every access to PHI — reads, writes, deletes — must be logged with user identity, timestamp, and the resource accessed. In AWS, CloudTrail covers API-level access; enable S3 server access logging and RDS audit logging separately. Store audit logs in a separate account or bucket with write-once permissions (S3 Object Lock or similar) so that a compromised application cannot delete its own access trail.
AI-Specific HIPAA Concerns: LLM APIs and PHI
The growth of LLM-powered healthcare applications creates a compliance surface that the drafters of HIPAA's Privacy and Security Rules (finalised in 2003) did not anticipate. The core rule is simple: if you send PHI to an LLM API, you need a BAA with that provider and you need to know how they handle your data. In 2025, the vendor landscape is split:
| LLM provider | HIPAA BAA available | PHI handling policy | Recommended approach |
|---|---|---|---|
| Azure OpenAI (GPT-4/o) | Yes (covered under Azure BAA) | Data not used for training on enterprise tiers | Preferred for HIPAA-covered PHI workloads |
| AWS Bedrock (Claude, Titan, etc.) | Yes (covered under AWS BAA) | No data retention or training on AWS Bedrock | Strong option; verify specific model BAA coverage |
| Google Cloud Vertex AI | Yes (covered under GCP Healthcare BAA extension) | No training on customer data; CMEK support | Good option; confirm scope of BAA coverage |
| OpenAI API (direct) | Enterprise plan only; not standard API | Enterprise: zero data retention; standard: data may be retained 30 days | Only via Enterprise with executed BAA |
| Anthropic API (direct) | Not publicly available as of 2025 | Standard data handling policy applies | Do not send PHI without confirmed BAA |
Fine-tuning on medical data creates additional concerns. Under HIPAA, using PHI to fine-tune a model is a "use" of PHI that must fall within a permitted purpose. Research and public health are permitted purposes with proper authorisation; improving a commercial product is not self-evidently permitted. Before fine-tuning any model on PHI, consult healthcare legal counsel and document the permitted purpose in your policies.
The zero-trust principle is the safest engineering stance: treat every LLM API call as a potential PHI leak surface and design your data flows to minimise PHI exposure. De-identify or pseudonymise data before sending it to LLM APIs wherever the use case allows. For governance frameworks that formalise this approach, see our guide on AI governance and the NIST RMF.
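One way to pseudonymise before an LLM call is a keyed HMAC over each direct identifier, producing stable tokens that are meaningless outside your boundary. A sketch, assuming the key lives in your HIPAA-covered environment; the key value and token format here are illustrative:

```python
import hashlib
import hmac

# Hypothetical key -- in production, fetch from a KMS-backed secret store
# inside your HIPAA-covered environment; never hard-code it.
SECRET_KEY = b"example-key-from-kms"

def pseudonymize(identifier: str) -> str:
    """Derive a stable, keyed token for a direct identifier.

    The same identifier always maps to the same token (so downstream
    joins still work), but the token cannot be reversed without the key.
    """
    digest = hmac.new(SECRET_KEY, identifier.encode(), hashlib.sha256).hexdigest()
    return f"PT-{digest[:12]}"
```

Substitute the token into the prompt before the API call, and keep any token-to-identifier mapping inside your covered environment.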
De-Identification: Safe Harbor vs Expert Determination
De-identified data is not PHI under HIPAA, which means it falls outside the BAA requirement and handling restrictions. HIPAA provides two methods for de-identification. Safe Harbor requires removing 18 specific identifiers (names, dates finer than year, geographic data smaller than state, phone numbers, email addresses, SSNs, medical record numbers, IP addresses, device identifiers, URLs, full-face photographs, and any unique identifiers). Expert Determination requires a qualified statistician to certify that the risk of re-identification is very small.
Safe Harbor is mechanistic and auditor-friendly — either the identifiers are removed or they are not. Expert Determination offers more flexibility (you can keep more data attributes) but requires formal statistical analysis and documentation. For most engineering teams, Safe Harbor via an automated de-identification pipeline is the practical choice. NLP-based de-identification tools (AWS Comprehend Medical, Google Healthcare Natural Language API, Microsoft Presidio) can detect and redact PHI entities from unstructured clinical text with 90-98% recall.
- Names (patient, family members, employers)
- Geographic subdivisions smaller than state (street, city, county, zip except first 3 digits)
- Dates (except year) — including admission, discharge, birth, death dates
- Phone numbers, fax numbers
- Email addresses
- Social Security numbers
- Medical record, health plan, account numbers
- Certificate and license numbers
- VIN and vehicle serial numbers
- Device identifiers and serial numbers
- URLs and IP addresses
- Biometric identifiers (finger and voice prints)
- Full-face photos and comparable images
- Any unique identifying number, code, or characteristic
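A toy Safe Harbor pass over clinical text, covering only a few of the pattern-matchable identifier classes above. Real pipelines should use the NLP-based tools mentioned earlier, which also catch names and free-text identifiers that regexes miss:

```python
import re

# Covers only a handful of the 18 identifier classes -- illustrative, not
# a complete Safe Harbor implementation.
SAFE_HARBOR_PATTERNS = {
    "[SSN]":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "[DATE]":  re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
    "[PHONE]": re.compile(r"\b\d{3}[ .-]\d{3}[ .-]\d{4}\b"),
    "[EMAIL]": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def redact_safe_harbor(text: str) -> str:
    """Replace pattern-matchable identifiers with placeholder tokens."""
    for placeholder, pattern in SAFE_HARBOR_PATTERNS.items():
        text = pattern.sub(placeholder, text)
    return text
```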
Breach Notification Timelines
HIPAA breach notification has three layers: to affected individuals (within 60 days of discovery), to HHS (within 60 days for breaches affecting 500+ individuals; annual report for smaller breaches), and to prominent media outlets in affected states for breaches affecting 500+ residents of that state. Discovery is defined as when the organisation knew or reasonably should have known about the breach — not when it was reported to security or legal.
A breach is defined as the acquisition, access, use, or disclosure of PHI in a manner not permitted under HIPAA. Importantly, impermissible disclosure is presumed to be a breach unless the covered entity or business associate can demonstrate a low probability that PHI was compromised using a four-factor risk assessment: nature of PHI involved, who made the unauthorised use, whether PHI was actually acquired or viewed, and extent to which risk has been mitigated.
Engineer your incident response playbook to start the 60-day clock from the moment you detect any anomalous PHI access, not from when legal concludes their risk assessment. The risk assessment determines whether notification is required — the 60-day clock does not pause during the assessment. For the broader incident response and zero-trust architecture that supports HIPAA operations, see our zero-trust architecture guide.
Incident Response for PHI Breaches
HIPAA breach notification has specific timelines that differ from general security incident response. If a breach of unsecured PHI is discovered, the covered entity must notify affected individuals within 60 calendar days of discovery — not 60 days from when the breach occurred, but from when it was discovered or reasonably should have been discovered. For breaches affecting 500 or more individuals, the entity must also notify HHS and prominent media outlets in the affected state within the same 60-day window. Breaches affecting fewer than 500 individuals are reported to HHS annually.
The definition of "unsecured PHI" is critical: PHI that has been encrypted with NIST-approved algorithms (AES-128, AES-256) or destroyed is considered "secured" and does not trigger notification requirements even if the encrypted data is exposed. This is the single strongest argument for encrypting PHI at rest and in transit — it converts a potential breach notification event into a security incident that can be handled internally. The cost difference is enormous: a notifiable breach involving 10,000 records typically costs $500K-2M in notification, legal, and remediation costs. An encryption-protected exposure costs the internal investigation time and nothing more.
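The 60-day clock is simple enough to encode directly in incident-response tooling so nobody recomputes it by hand under pressure. A sketch; the function name is ours:

```python
from datetime import date, timedelta

def notification_deadline(discovery: date) -> date:
    """Individual notification is due within 60 calendar days of discovery.

    The clock starts at discovery (or when the breach reasonably should
    have been discovered), not when the risk assessment concludes.
    """
    return discovery + timedelta(days=60)
```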
AI Model Training on Healthcare Data
Training AI models on healthcare data introduces HIPAA obligations that go beyond standard PHI handling. When PHI enters a training dataset, every copy of that dataset — including intermediate processing artifacts, model weights that memorise training data, and evaluation datasets derived from the training data — is subject to HIPAA protection. This means your ML training infrastructure must meet the same security and access control requirements as your production PHI storage.
The practical implication: you cannot train on PHI using consumer-grade cloud ML services that are not covered by a BAA. AWS SageMaker, Google Vertex AI, and Azure ML are all HIPAA-eligible when configured correctly and covered by a BAA. Databricks, Weights & Biases, and other ML platform tools require individual BAA evaluation. If your model training pipeline includes any service without a BAA, you have a HIPAA violation regardless of whether the PHI was de-identified at other points in the pipeline — the obligation follows the data through every processing step.
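Because the obligation follows the data through every processing step, it helps to track PHI classification through artifact lineage: anything derived from a PHI source inherits the classification. A sketch with a hypothetical lineage map (child to parent):

```python
# Hypothetical artifact lineage: child -> parent. A PHI classification on
# any ancestor propagates to every derived artifact, so model weights and
# eval splits carry the same safeguards as the raw clinical dataset.
LINEAGE = {
    "train-split":   "clinical-notes-raw",
    "eval-split":    "clinical-notes-raw",
    "model-weights": "train-split",
}
PHI_SOURCES = {"clinical-notes-raw"}

def is_phi(artifact: str) -> bool:
    """Walk up the lineage; an artifact is PHI if any ancestor is."""
    while artifact in LINEAGE:
        artifact = LINEAGE[artifact]
    return artifact in PHI_SOURCES
```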
Related articles
- Building for SOC 2 Compliance from Day One
- Zero-Trust Architecture for AI-Native Applications
- Before You Deploy an AI Agent: 12 Things Engineers Skip
- AI Governance Frameworks: NIST AI RMF vs EU AI Act
