On-Premises LLM Deployment for HIPAA & GDPR Compliance
For healthcare, fintech, and European companies, the LLM compliance question isn't primarily about cost — it's about what data can legally leave your infrastructure, and under what conditions.
Shubham Yadav
Machine Learning Researcher
Most of the conversation around LLM deployment is framed as a cost and performance question. For healthcare providers, health tech companies, financial institutions, and businesses operating in Europe, that framing misses the central issue. Before you can optimize cost per token, you have to determine which tokens you're legally permitted to send outside your infrastructure at all.
Quick answer: HIPAA requires a signed BAA with any LLM API provider that processes PHI. GDPR requires a DPA and appropriate cross-border transfer mechanism. Major providers (OpenAI Enterprise, Azure OpenAI, Anthropic, AWS Bedrock, GCP Vertex AI) offer both for enterprise customers. On-premises deployment is required when hard data residency laws apply, fine-tuning involves regulated data, or organizational risk tolerance doesn't permit third-party transmission of sensitive data.
What Does HIPAA Require for LLM Deployments?
HIPAA requires a signed Business Associate Agreement (BAA) with any vendor that creates, receives, maintains, or transmits protected health information (PHI) on your behalf — including LLM API providers. Sending PHI to an LLM API without a BAA in place is a HIPAA violation, regardless of what the provider's privacy policy says.
What counts as PHI in an LLM context?
PHI is broadly defined: any individually identifiable health information, including names, dates, geographic identifiers, contact information, or any data that could identify a specific patient in connection with their health status or care. Sending a patient conversation transcript to an LLM API constitutes PHI transmission if that transcript contains identifiable information. Using an LLM to analyze medical records, clinical notes, insurance claims, or billing data almost certainly involves PHI.
What does a BAA require from the vendor?
The BAA commits the vendor to protecting PHI according to HIPAA standards, restricting use of the data to the stated purpose, reporting breaches within required timeframes, and deleting or returning PHI upon contract termination.
What else does HIPAA require beyond the BAA?
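- Encryption of PHI at rest and in transit (an "addressable" specification under the Security Rule, which in practice means implementing it or documenting an equivalent safeguard)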
- Role-based access controls limiting PHI access to authorized individuals
- Audit logging of all access to, creation of, and modification of PHI systems
- Breach notification without unreasonable delay, and no later than 60 days after discovery
The audit logging requirement has direct implications for LLM deployments. You need records of what data was sent to the model, when, by whom, and for what purpose — in a format HIPAA auditors expect.
What Does GDPR Require for LLM API Use?
GDPR requires a Data Processing Agreement (DPA) with any vendor processing personal data of EU residents on your behalf, plus a valid cross-border transfer mechanism if the vendor is based outside the European Economic Area.
GDPR applies to any processing of personal data belonging to EU residents, regardless of where the processing organization is located. A US health tech company with EU customers, a global financial institution processing EU employee data, and a European SaaS company are all covered.
The key GDPR obligations for LLM deployments:
Data Processing Agreements (DPAs). Similar to HIPAA's BAA requirement, sending personal data to an LLM API without a DPA is a GDPR violation. The DPA commits the vendor to GDPR-compliant handling.
Cross-border transfer mechanisms. Personal data of EU residents cannot be transferred to non-EEA countries without an appropriate legal basis. For US-based providers, the current mechanisms are Standard Contractual Clauses (SCCs) — contractual terms approved by the European Commission — or the EU-US Data Privacy Framework (DPF) for certified US companies. SCCs require a transfer impact assessment evaluating whether destination country laws undermine the protections.
Data subject rights. GDPR gives individuals rights that are technically challenging for LLMs: right to access, right to erasure, and right to rectification. LLMs that process personal data raise complex questions about how these rights can be exercised.
Purpose limitation and data minimization. Personal data should be collected for specified, explicit purposes and not processed in incompatible ways. Using customer service conversation data to fine-tune a general-purpose model may not be consistent with the purpose for which that data was collected.
What Is the EU AI Act and How Does It Apply to LLM Deployments?
The EU AI Act categorizes AI systems by risk level and imposes additional requirements on high-risk systems — including AI used in healthcare, employment decisions, credit scoring, and certain educational applications.
High-risk systems face conformity assessments, technical documentation requirements, human oversight mechanisms, and registration requirements. Systems that interact with humans must disclose that they are AI. The Act applies to providers and deployers of AI systems in the EU market, regardless of where the company is based.
A company building an LLM-powered HR tool for EU-based employees is likely subject to high-risk provisions. The compliance requirements are non-trivial and require documentation that most teams building on general-purpose LLM APIs haven't yet created. The EU AI Act came into force in 2024, with phased enforcement timelines. Organizations in regulated industries should begin building compliance documentation now.
Do Major LLM API Providers Offer HIPAA BAAs and GDPR DPAs?
Yes — all major LLM API providers offer BAAs and DPAs for enterprise customers, but coverage depth varies.
| Provider | HIPAA BAA | GDPR DPA | EU Data Residency | Notes |
|---|---|---|---|---|
| OpenAI Enterprise API | Yes | Yes | Limited | BAA available; EU residency more limited vs. Azure |
| Azure OpenAI | Yes | Yes | Yes | Most complete coverage; uses Microsoft's enterprise compliance framework |
| Anthropic Enterprise | Yes | Yes | Limited | BAA and DPA available; fewer regional options |
| AWS Bedrock | Yes | Yes | Yes | Covered under AWS general BAA; EU regions available |
| GCP Vertex AI | Yes | Yes | Yes | HIPAA BAA; EU residency through Google Cloud EU regions |
What these agreements don't always cover:
- Model training on your data. Most enterprise agreements explicitly prohibit using your inputs for model training — verify before signing.
- Data retention at the provider. Understand how long the provider retains request data and under what circumstances provider personnel can access it.
- Sub-processors. GDPR DPAs require disclosure of sub-processors. LLM providers may use third-party infrastructure that affects your compliance posture.
- Jurisdictional complexity. A BAA covers the HIPAA dimension. It doesn't resolve data sovereignty requirements in countries that mandate data never leave national borders — France, Germany, and others for certain data types.
When Is On-Premises LLM Deployment Required for Compliance?
On-premises or private LLM deployment becomes necessary, or at least strongly preferable, in four scenarios where API provider agreements are insufficient:
Hard data residency laws. Some jurisdictions require data processing infrastructure to be physically located in-country and under direct organizational control. German public sector data protection law, French HDS certification (health data sovereignty), and sector-specific regulations in EU financial services can require this. "EU region" cloud deployment may not satisfy these requirements.
PHI that cannot be de-identified. Where the clinical workflow requires the LLM to process identifying information — a clinical decision support system reasoning about a specific patient's history — and the organization cannot sustain the BAA and audit infrastructure required for compliant third-party transmission, on-premises processing may be the only viable path.
Fine-tuning on regulated data. If LLM performance requires fine-tuning on data that is itself subject to regulatory restrictions — proprietary clinical protocols, confidential financial models, customer data under GDPR — that training data cannot leave your infrastructure. Fine-tuning must happen on-premises or in a dedicated private cloud environment.
Competitive and strategic data sensitivity. Organizations whose core IP is expressed in the data they would send to an LLM may prefer not to transmit it to a third-party provider regardless of contractual protections. This is common in financial services, where trading strategies, risk models, and client data represent direct competitive value.
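The four scenarios above can be turned into a routing gate that sits in front of every LLM call. The following is a minimal sketch, not a legal determination: `RequestContext`, `Backend`, `choose_backend`, and every flag name are hypothetical, and how each flag gets set is a question for counsel and your data classification process.

```python
from dataclasses import dataclass
from enum import Enum


class Backend(Enum):
    EXTERNAL_API = "external_api"
    ON_PREM = "on_prem"


@dataclass(frozen=True)
class RequestContext:
    # All flags are hypothetical inputs from an upstream classification step.
    contains_phi: bool = False
    contains_eu_personal_data: bool = False
    hard_residency_mandate: bool = False      # in-country, organization-controlled infra required
    is_fine_tuning_job: bool = False
    strategic_ip: bool = False                # organizational risk decision, not a legal one
    baa_signed: bool = False
    dpa_and_transfer_mechanism: bool = False  # DPA plus SCCs or DPF in place


def choose_backend(ctx: RequestContext) -> Backend:
    """Route to on-prem whenever any of the listed conditions applies."""
    regulated = ctx.contains_phi or ctx.contains_eu_personal_data
    if ctx.hard_residency_mandate or ctx.strategic_ip:
        return Backend.ON_PREM
    if ctx.is_fine_tuning_job and regulated:
        return Backend.ON_PREM
    if ctx.contains_phi and not ctx.baa_signed:
        return Backend.ON_PREM
    if ctx.contains_eu_personal_data and not ctx.dpa_and_transfer_mechanism:
        return Backend.ON_PREM
    return Backend.EXTERNAL_API
```

Keeping the decision in one pure function makes it easy to unit-test and to show an auditor exactly which conditions send traffic outside your infrastructure.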
What Does HIPAA-Compliant On-Premises LLM Infrastructure Require?
A compliant on-premises LLM deployment requires four controls layered on top of the serving infrastructure: audit logging, access controls, network isolation, and data minimization.
Audit Logging
Every LLM request involving regulated data needs to be logged with sufficient detail to satisfy an audit. Log the hash of the input — not the raw PHI — to avoid creating a secondary PHI store in the logging system:
import logging
import uuid  # e.g. request_id = str(uuid.uuid4())
from datetime import datetime, timezone

# Dedicated logger for audit events; attach a handler that writes to append-only storage.
audit_logger = logging.getLogger("audit")

def log_llm_request(user_id: str, request_id: str, model: str,
                    prompt_hash: str, input_tokens: int,
                    output_tokens: int, purpose: str) -> None:
    audit_logger.info({
        "event": "llm_request",
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "request_id": request_id,
        "user_id": user_id,            # who initiated the request
        "model": model,                # which model processed the data
        "prompt_hash": prompt_hash,    # hash of input, not raw PHI
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "purpose": purpose,            # documented business purpose
        "data_classification": "PHI",
    })
Audit logs must be immutable, retained for 6 years (HIPAA minimum), and access to them must itself be logged.
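How `prompt_hash` is computed matters. Short or highly structured prompts are low-entropy, so an unkeyed SHA-256 digest can sometimes be reversed by guessing candidate inputs. One option, sketched below, is a keyed HMAC. The helper name `compute_prompt_hash` and the `AUDIT_HMAC_KEY` environment variable are assumptions for illustration; in production the key would come from a secrets manager.

```python
import hashlib
import hmac
import os


def compute_prompt_hash(prompt: str, key: bytes) -> str:
    """Return a hex HMAC-SHA256 digest of the prompt under a secret key."""
    return hmac.new(key, prompt.encode("utf-8"), hashlib.sha256).hexdigest()


# AUDIT_HMAC_KEY is a hypothetical variable name; the fallback is for local testing only.
key = os.environ.get("AUDIT_HMAC_KEY", "dev-only-key").encode("utf-8")
digest = compute_prompt_hash("Patient reports chest pain since Tuesday.", key)
```

The digest still lets you confirm later that a given prompt was sent (recompute and compare), while anyone who obtains the logs without the key cannot test guesses against it.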
Access Controls
Not everyone with application access should have access to PHI-processing LLM endpoints:
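import logging
from datetime import datetime, timezone
from functools import wraps
from typing import Callable

audit_logger = logging.getLogger("audit")

def require_phi_access(f: Callable) -> Callable:
    """Reject calls from users who lack the PHI-processing permission."""
    @wraps(f)
    async def wrapper(*args, **kwargs):
        user = get_current_user()  # supplied by the application's auth layer
        if not user.has_permission("llm:phi_processing"):
            # Record the denied attempt before refusing the call.
            audit_logger.warning({
                "event": "unauthorized_phi_access_attempt",
                "user_id": user.id,
                "endpoint": f.__name__,
                "timestamp": datetime.now(timezone.utc).isoformat(),
            })
            raise PermissionError("PHI processing access required")
        return await f(*args, **kwargs)
    return wrapper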
Network Isolation
The model server should be network-isolated, with inbound traffic routed through an authenticated gateway and outbound connections blocked entirely:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: llm-serving-isolation
spec:
  podSelector:
    matchLabels:
      app: vllm-server
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: llm-gateway   # only the gateway can reach the model
  egress: []                     # no outbound connections from the model server
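A policy like this is only enforced if the cluster's network plugin supports NetworkPolicy, so it is worth verifying from inside the model-server pod. The helper below is a minimal sketch of such a smoke test; `egress_blocked` is a hypothetical name, and the target host and port you probe are up to you.

```python
import socket


def egress_blocked(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port cannot be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return False
    except OSError:
        # Covers DNS failures, timeouts, and refused connections alike.
        return True
```

Run from the model-server pod, a call such as `egress_blocked("example.com", 443)` should return `True` when the policy is in force. Because `egress: []` also blocks DNS, the lookup itself is expected to fail.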
Data Minimization
Strip fields from prompts that aren't necessary for the specific task — a GDPR requirement and good defense-in-depth:
def prepare_clinical_prompt(patient_record: dict, task: str) -> str:
    # Allow-list: only fields needed for clinical reasoning reach the prompt.
    relevant_fields = {
        "age": patient_record.get("age"),
        "diagnoses": patient_record.get("diagnoses"),
        "medications": patient_record.get("medications"),
        "lab_values": patient_record.get("lab_values"),
        # Excluded: name, DOB, MRN, SSN, address (not needed for clinical reasoning)
    }
    return build_prompt(task, relevant_fields)  # build_prompt: application-specific template helper
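Selecting fields by allow-list does not help when an allowed field is free text that happens to contain identifiers. As a second layer, a scrubbing pass can mask obvious patterns before the prompt is built. The sketch below is illustrative only: `scrub_free_text` and its three US-style patterns are assumptions, and regexes alone fall well short of HIPAA de-identification.

```python
import re

# Illustrative patterns only; real clinical text needs a dedicated de-identification tool.
_PATTERNS = [
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
    (re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"), "[EMAIL]"),
    (re.compile(r"\b\d{3}[-. ]\d{3}[-. ]\d{4}\b"), "[PHONE]"),
]


def scrub_free_text(text: str) -> str:
    """Replace SSN-, email-, and phone-shaped substrings with placeholders."""
    for pattern, placeholder in _PATTERNS:
        text = pattern.sub(placeholder, text)
    return text


note = "Contact jane.doe@example.org or 555-867-5309; SSN 123-45-6789."
print(scrub_free_text(note))
# prints: Contact [EMAIL] or [PHONE]; SSN [SSN].
```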
Frequently Asked Questions: LLM Compliance
Does HIPAA prohibit using LLM APIs like OpenAI or Anthropic?
No. HIPAA prohibits sending PHI to any vendor without a signed BAA. OpenAI, Anthropic, AWS Bedrock, and Google Vertex AI all offer BAAs for enterprise API customers. Using these services for PHI is permissible with a signed BAA and the required access controls and audit logging implemented on your end.
Can I use a cloud LLM API and still comply with GDPR?
Yes, if the provider offers a GDPR DPA and your cross-border transfer mechanism is in place (SCCs or EU-US DPF). Azure OpenAI, AWS Bedrock, and GCP Vertex AI offer EU data residency options that keep processing within the EEA, which simplifies the cross-border transfer analysis considerably.
Does de-identifying data before sending it to an LLM API satisfy HIPAA?
Yes. Data de-identified under HIPAA's Safe Harbor or Expert Determination method is no longer PHI. If your workflow doesn't require patient-specific reasoning, de-identification plus API is often the simplest compliant path. Under GDPR the bar is different: pseudonymization (reversible) reduces but doesn't eliminate obligations, while true anonymization (irreversible) removes GDPR scope entirely.
What's the difference between GDPR pseudonymization and anonymization?
Pseudonymization replaces direct identifiers with a pseudonym but retains a key that can re-identify individuals — it remains personal data under GDPR. Anonymization irreversibly removes the ability to re-identify individuals — the data falls outside GDPR scope. Most tokenization approaches in production are pseudonymization, not anonymization.
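The distinction is easier to see in code. The sketch below shows a typical tokenization scheme, with `PseudonymVault` as a hypothetical class name. Because the vault can map tokens back to identifiers, its output is pseudonymized, not anonymized.

```python
import secrets


class PseudonymVault:
    """Maps identifiers to random tokens and back; the mapping is the re-identification key."""

    def __init__(self) -> None:
        self._forward: dict[str, str] = {}
        self._reverse: dict[str, str] = {}

    def pseudonymize(self, identifier: str) -> str:
        """Return a stable random token for the identifier, creating one if needed."""
        if identifier not in self._forward:
            token = f"PT-{secrets.token_hex(8)}"
            self._forward[identifier] = token
            self._reverse[token] = identifier
        return self._forward[identifier]

    def reidentify(self, token: str) -> str:
        """Recover the original identifier; this is what keeps the data personal."""
        return self._reverse[token]


vault = PseudonymVault()
token = vault.pseudonymize("MRN-0042")
original = vault.reidentify(token)
```

As long as `reidentify` can succeed for anyone, the tokenized records remain personal data. Deleting the vault does not by itself make them anonymous either, since the remaining fields may still single out an individual.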
Is on-premises LLM deployment required for EU data sovereignty?
Not always. EU region cloud deployment satisfies most GDPR cross-border transfer requirements. Hard data residency mandates — applicable to certain French, German, and sector-specific EU financial data — may require infrastructure physically in-country and under direct organizational control, which cloud regional deployments may not satisfy. Engage legal counsel to map specific requirements before choosing an architecture.
When does the EU AI Act apply to my LLM application?
The EU AI Act applies if you are providing or deploying an AI system to users in the EU market, regardless of where your company is based. High-risk classification depends on the application domain — healthcare, employment, credit, education, and certain law enforcement applications are high-risk by default. General-purpose AI model providers face additional transparency obligations under the Act.