
Deploying Confidential VMs: Production-Ready Implementation Guide
TL;DR — Deploying Confidential VMs
Learn how to deploy production-grade Confidential VMs across [Phala Cloud](https://docs.phala.com/phala-cloud/getting-started/overview), Google Cloud, and Azure with best practices for security, performance, and compliance.
This guide covers:
- Production-ready architectures for confidential workloads
- Multi-platform deployment using Phala Cloud, GCP, and Azure
- Security hardening and automated [attestation](https://docs.phala.com/phala-cloud/attestation/overview) verification
- High availability and disaster recovery planning
- Monitoring, logging, and compliance alignment
- Cost optimization for large-scale confidential infrastructure
Best For: DevOps engineers, security architects, and platform teams deploying [Confidential Computing (TEE)](https://phala.com/gpu-tee) workloads in production.
Architecture Patterns for Confidential VMs
Pattern 1: Single-Tier Confidential Application
Use Case: Standalone confidential service (API, microservice, AI model)
Deployment Steps (Phala Cloud):
- Upload your Docker Compose file to the dashboard
- Add encrypted environment variables
- Select Intel TDX (general) or GPU TEE (AI workloads)
- Enable auto-restart and health monitoring
Pattern 2: Multi-Tier with Confidential Database
Use Case: Full-stack application with sensitive data persistence
Key Benefits:
- Queries and stored data are never exposed in plaintext outside the TEE
- Application logic protected from cloud provider
- End-to-end confidentiality from client to database
Pattern 3: High-Availability Multi-Region
Use Case: Production systems requiring 99.9%+ uptime
Implementation Considerations:
- [Attestation](https://phala.com/posts/remote-attestation-deep-dive): Each instance must be independently verified
- State Synchronization: Use encrypted replication between TEE instances
- Failover: Automated health checks with attestation verification
- Data Residency: Ensure TEE instances comply with regional regulations
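
The failover consideration above can be sketched in a few lines: traffic goes only to instances that pass both a health check and an attestation check, and each instance is verified on its own. This is a minimal sketch under stated assumptions, not a Phala Cloud API. `Instance`, `select_routable`, and the two check callables are hypothetical names; in a real deployment the callables would wrap an HTTP health probe and an attestation verifier.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Instance:
    """A TEE instance a load balancer could route to (hypothetical type)."""
    name: str
    region: str


def select_routable(
    instances: Iterable[Instance],
    is_healthy: Callable[[Instance], bool],
    is_attested: Callable[[Instance], bool],
) -> List[Instance]:
    """Return the instances that pass BOTH checks, verified one by one.

    A check that raises is treated as a failure (fail closed), so an
    unreachable verifier never lets an unverified instance take traffic.
    """
    routable = []
    for inst in instances:
        try:
            ok = is_healthy(inst) and is_attested(inst)
        except Exception:
            ok = False
        if ok:
            routable.append(inst)
    return routable


if __name__ == "__main__":
    fleet = [
        Instance("prod-api-1", "us-west"),
        Instance("prod-api-2", "us-east"),
        Instance("prod-api-3", "eu-central"),
    ]
    # Stub checks for illustration only: prod-api-2 fails attestation.
    routable = select_routable(
        fleet,
        is_healthy=lambda inst: True,
        is_attested=lambda inst: inst.name != "prod-api-2",
    )
    print([inst.name for inst in routable])
```

Treating a failing or unreachable verifier as "not attested" is a fail-closed choice: it gives up some availability in exchange for never routing traffic to an unverified instance.
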
Platform-Specific Deployment Guides
Deploying on Phala Cloud
#### Production Deployment Workflow
Step 1: Prepare Infrastructure-as-Code

    # phala-config.yaml
    app_name: production-api
    tee_type: intel-tdx
    resources:
      cpu: 8
      memory: 16GB
      storage: 100GB
      gpu: null
    environment:
      - name: API_KEY
        encrypted: true
        value: "{{ vault.api_key }}"
      - name: DATABASE_URL
        encrypted: true
        value: "{{ vault.db_url }}"
    networking:
      ports:
        - 8080:8080
        - 9090:9090
      custom_domain: api.example.com
    monitoring:
      health_check_path: /health
      health_check_interval: 30s
      metrics_port: 9090
    attestation:
      continuous_verification: true
      verification_interval: 300s

Step 2: Deploy with CLI

    # Install Phala CLI
    npm install -g @phala/cli

    # Authenticate
    phala login

    # Validate configuration
    phala validate phala-config.yaml

    # Deploy
    phala deploy \
      --config phala-config.yaml \
      --compose docker-compose.yaml \
      --region us-west \
      --environment production

Step 3: Verify Attestation

    # Automated attestation check
    phala verify a7f3e9c2d1b4

Step 4: Configure Custom Domain

    # Add custom domain with TLS
    phala domain add \
      --app a7f3e9c2d1b4 \
      --domain api.example.com \
      --tls-auto

Production Best Practices (Phala Cloud)
1. Use Encrypted Secrets Management
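
    # DB_URL and API_KEY are supplied as encrypted environment variables
    # at deploy time, never hard-coded in the compose file.
    services:
      app:
        environment:
          - DATABASE_URL=${DB_URL}
          - API_KEY=${API_KEY}

2. Enable Continuous Attestation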
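
    import time
    from datetime import datetime

    import requests


    def monitor_attestation(app_id, interval=300):
        """Poll the app's attestation endpoint every `interval` seconds."""
        url = f"https://{app_id}.dstack.phala.network/.well-known/attestation"
        while True:
            try:
                # A timeout keeps the monitor from hanging on a dead endpoint.
                response = requests.get(url, timeout=10)
                # HTTP 200 only shows that the endpoint answered; validate the
                # returned quote against expected measurements for a full check.
                if response.status_code == 200:
                    print(f"[{datetime.now()}] ✅ Attestation valid")
                else:
                    print(f"[{datetime.now()}] ❌ ALERT: Attestation failed!")
            except requests.RequestException as e:
                print(f"Error checking attestation: {e}")
            time.sleep(interval)


    monitor_attestation("a7f3e9c2d1b4")

3. Implement Health Checks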

    services:
      api:
        image: myapp:latest
        healthcheck:
          test: ["CMD-SHELL", "curl -f http://localhost:8080/health || exit 1"]
          interval: 30s
          timeout: 10s
          retries: 3
          start_period: 60s
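
The health check above probes `/health` on port 8080 with `curl -f`, which fails on any HTTP status of 400 or higher. The application side of that contract might look like the following minimal sketch using only the Python standard library. The `CHECKS` registry, `health_response`, and `serve` are hypothetical names introduced for illustration; a real service would more likely add the route to its existing web framework.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Callable, Dict, Tuple

# Hypothetical registry of dependency checks, e.g. {"db": ping_database}.
CHECKS: Dict[str, Callable[[], bool]] = {}


def health_response(checks: Dict[str, Callable[[], bool]]) -> Tuple[int, bytes]:
    """Return (status, JSON body): 200 if every check passes, else 503."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            results[name] = False  # a crashing check counts as a failure
    status = 200 if all(results.values()) else 503
    body = {"status": "ok" if status == 200 else "unhealthy", "checks": results}
    return status, json.dumps(body).encode("utf-8")


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/health":
            self.send_error(404)
            return
        status, body = health_response(CHECKS)
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    """Serve /health on the port the compose health check probes."""
    HTTPServer(("0.0.0.0", port), HealthHandler).serve_forever()
```

Returning 503 when a dependency check fails makes `curl -f` exit non-zero, so the container is marked unhealthy once the configured number of retries is exhausted.
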
4. Configure Logging & Metrics

    # Stream logs in real-time
    phala logs a7f3e9c2d1b4 --follow --timestamps

Deploying on Google Cloud
#### Production Deployment with Terraform

    resource "google_compute_instance" "confidential_vm" {
      name         = "prod-confidential-api"
      machine_type = "n2d-standard-8"
      zone         = "us-central1-a"

      confidential_instance_config {
        enable_confidential_compute = true
      }

      boot_disk {
        initialize_params {
          image = "ubuntu-os-cloud/ubuntu-2204-lts"
          size  = 100
        }
      }

      network_interface {
        network = "default"
        access_config {}
      }

      metadata_startup_script = file("${path.module}/startup.sh")

      scheduling {
        on_host_maintenance = "TERMINATE"
        automatic_restart   = true
      }

      tags = ["confidential", "production"]
    }

Deploying on Azure
#### Production ARM Template

    {
      "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
      "contentVersion": "1.0.0.0",
      "parameters": {
        "vmName": {
          "type": "string",
          "defaultValue": "confidential-prod-vm"
        },
        "sshPublicKey": {
          "type": "string"
        },
        "networkInterfaceId": {
          "type": "string",
          "metadata": { "description": "Resource ID of an existing network interface" }
        }
      },
      "resources": [
        {
          "type": "Microsoft.Compute/virtualMachines",
          "apiVersion": "2023-03-01",
          "name": "[parameters('vmName')]",
          "location": "eastus",
          "properties": {
            "hardwareProfile": {
              "vmSize": "Standard_DC8as_v5"
            },
            "securityProfile": {
              "securityType": "ConfidentialVM",
              "uefiSettings": {
                "secureBootEnabled": true,
                "vTpmEnabled": true
              }
            },
            "osProfile": {
              "computerName": "[parameters('vmName')]",
              "adminUsername": "azureuser",
              "linuxConfiguration": {
                "disablePasswordAuthentication": true,
                "ssh": {
                  "publicKeys": [
                    {
                      "path": "/home/azureuser/.ssh/authorized_keys",
                      "keyData": "[parameters('sshPublicKey')]"
                    }
                  ]
                }
              }
            },
            "storageProfile": {
              "imageReference": {
                "publisher": "Canonical",
                "offer": "0001-com-ubuntu-confidential-vm-jammy",
                "sku": "22_04-lts-cvm",
                "version": "latest"
              },
              "osDisk": {
                "createOption": "FromImage",
                "managedDisk": {
                  "storageAccountType": "Premium_LRS",
                  "securityProfile": {
                    "securityEncryptionType": "VMGuestStateOnly"
                  }
                }
              }
            },
            "networkProfile": {
              "networkInterfaces": [
                { "id": "[parameters('networkInterfaceId')]" }
              ]
            }
          }
        }
      ]
    }

Security Hardening
1. Network Security
Firewall Rules (Phala Cloud)

    # Publish only the ports the application actually needs.
    services:
      app:
        ports:
          - "8080:8080"

Network Isolation (GCP/Azure)

    # GCP: allow only inbound HTTPS to instances tagged "confidential"
    # (assumes the confidential-vpc network already exists)
    gcloud compute firewall-rules create allow-https \
      --direction=INGRESS \
      --priority=1000 \
      --network=confidential-vpc \
      --action=ALLOW \
      --rules=tcp:443 \
      --source-ranges=0.0.0.0/0 \
      --target-tags=confidential

2. Secure Boot and Image Verification
Phala Cloud: Signed Images

    # Pin the image by digest so the deployed content cannot change silently.
    services:
      app:
        image: mycompany/api@sha256:abc123def456...

Verify Image in Attestation:
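
    class SecurityError(Exception):
        """Raised when an attestation check fails."""


    def verify_image_integrity(app_id, expected_image_hash):
        # get_attestation_report() stands in for your attestation client;
        # it must return the parsed report as a dict.
        attestation = get_attestation_report(app_id)
        if attestation["image_hash"] != expected_image_hash:
            raise SecurityError(
                f"Image mismatch! Expected {expected_image_hash}, "
                f"got {attestation['image_hash']}"
            )
        print("✅ Image integrity verified")

3. Secrets Rotation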
Automated Secret Rotation (Phala Cloud)

    import os

    import hvac
    from phala_sdk import PhalaCloud


    def rotate_secrets(app_id):
        # Fetch the latest API key from Vault (KV v2 secrets engine).
        vault = hvac.Client(url=os.getenv("VAULT_URL"))
        new_api_key = vault.secrets.kv.v2.read_secret_version(
            path="api-keys/prod"
        )["data"]["data"]["key"]

        # Push the new value as an encrypted variable, then restart the app
        # so it picks up the rotated secret.
        phala = PhalaCloud(api_key=os.getenv("PHALA_API_KEY"))
        phala.update_env_var(
            app_id=app_id,
            key="API_KEY",
            value=new_api_key,
            encrypted=True
        )
        phala.restart_app(app_id)
        print(f"✅ Secrets rotated for {app_id}")

4. Audit Logging
Comprehensive Audit Trail

    import json
    import logging
    import os
    from datetime import datetime, timezone

    audit_log = logging.getLogger('confidential-audit')
    handler = logging.FileHandler('/var/log/confidential-audit.log')
    handler.setFormatter(logging.Formatter(
        '%(asctime)s | %(levelname)s | %(message)s'))
    audit_log.addHandler(handler)
    audit_log.setLevel(logging.INFO)


    def log_deployment(app_id, config):
        """Record a deployment event as one JSON line."""
        # Strip secret values from `config` before logging it.
        audit_log.info(json.dumps({
            "event": "deployment",
            "app_id": app_id,
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "config": config,
            "user": os.getenv("USER")
        }))

High Availability and Disaster Recovery
Multi-Instance Deployment (Phala Cloud)

    instances:
      - name: prod-api-1
        region: us-west
        tee_type: intel-tdx
        config: base-config.yaml
      - name: prod-api-2
        region: us-east
        tee_type: intel-tdx
        config: base-config.yaml
      - name: prod-api-3
        region: eu-central
        tee_type: amd-sev-snp
        config: base-config.yaml
    load_balancer:
      type: round-robin
      health_check:
        path: /health
        interval: 30s
        timeout: 5s
        attestation_required: true

Backup Strategy
Encrypted Backups

    # Backup encrypted volumes (Phala Cloud)
    phala backup create \
      --app a7f3e9c2d1b4 \
      --volume pgdata \
      --encryption aes-256 \
      --retention 30d

Disaster Recovery Testing

    # Test restore procedure
    phala restore \
      --app a7f3e9c2d1b4-dr \
      --backup backup-2025-11-03-02-00 \
      --verify-attestation

Monitoring and Observability
Metrics Collection
Prometheus Integration

    services:
      app:
        image: myapp:latest
        ports:
          - "8080:8080"
          - "9090:9090"
        environment:
          - PROMETHEUS_ENABLED=true
      prometheus:
        image: prom/prometheus:latest
        volumes:
          - ./prometheus.yml:/etc/prometheus/prometheus.yml
        ports:
          - "9091:9090"

Logging Pipeline
Centralized Logging (Phala Cloud)

    services:
      app:
        image: myapp:latest
        logging:
          driver: "json-file"
          options:
            max-size: "10m"
            max-file: "3"
      fluentd:
        image: fluent/fluentd:latest
        volumes:
          - ./fluent.conf:/fluentd/etc/fluent.conf
        ports:
          - "24224:24224"

Cost Optimization
Right-Sizing Instances
Performance vs. Cost Analysis
| Workload Type | Recommended Instance | Phala Cloud | GCP | Azure |
| --- | --- | --- | --- | --- |
| API (low traffic) | 2 vCPU, 4GB RAM | $0.15/hr | $0.18/hr | $0.20/hr |
| API (medium traffic) | 4 vCPU, 8GB RAM | $0.30/hr | $0.35/hr | $0.40/hr |
| Database | 8 vCPU, 16GB RAM | $0.60/hr | $0.70/hr | $0.80/hr |
| AI Inference (GPU) | 8 vCPU, 64GB, 1x H100 | $3.50/hr | $4.00/hr | N/A |
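
Hourly rates are easier to compare as monthly totals. The sketch below turns the table into a small estimator. It copies the rates from the table as-is and assumes 730 hours per month (8,760 hours per year divided by 12). The workload keys and the `monthly_cost` function are names chosen for illustration, and the optional `discount` models a committed-use reduction such as the 30% figure quoted for a 1-year commitment later in this guide.

```python
from typing import Dict, Optional

HOURS_PER_MONTH = 730  # assumption: 8760 hours per year / 12 months

# Hourly rates in USD copied from the table above; None means "N/A".
HOURLY_RATES: Dict[str, Dict[str, Optional[float]]] = {
    "api-low": {"phala": 0.15, "gcp": 0.18, "azure": 0.20},
    "api-medium": {"phala": 0.30, "gcp": 0.35, "azure": 0.40},
    "database": {"phala": 0.60, "gcp": 0.70, "azure": 0.80},
    "ai-inference-gpu": {"phala": 3.50, "gcp": 4.00, "azure": None},
}


def monthly_cost(
    workload: str,
    platform: str,
    instances: int = 1,
    discount: float = 0.0,
) -> float:
    """Estimate the monthly cost in USD for always-on instances."""
    rate = HOURLY_RATES[workload][platform]
    if rate is None:
        raise ValueError(f"{workload} has no listed rate on {platform}")
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must be in the range [0, 1)")
    total = rate * HOURS_PER_MONTH * instances * (1.0 - discount)
    return round(total, 2)


if __name__ == "__main__":
    # Five database-sized instances on Phala Cloud, on demand vs. committed.
    print(monthly_cost("database", "phala", instances=5))
    print(monthly_cost("database", "phala", instances=5, discount=0.30))
```

The estimate covers compute only; storage, egress, and backup charges are outside the table and would need to be added separately.
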
Auto-Scaling Strategy (Phala Cloud)

    scaling:
      min_instances: 2
      max_instances: 10
      metrics:
        - type: cpu
          target: 70
        - type: memory
          target: 80
        - type: requests_per_second
          target: 1000
      scale_up:
        cooldown: 60s
        step: 2
      scale_down:
        cooldown: 300s
        step: 1

Reserved Instances / Committed Use
Phala Cloud Committed Pricing

    # 1-year commitment for 30% discount
    phala commitment create \
      --duration 12-months \
      --resources "8vcpu,16gb-ram,intel-tdx" \
      --quantity 5

Compliance and Certifications
HIPAA Compliance Example
Deployment Checklist:
- [x] All data encrypted in use (TEE)
- [x] All data encrypted in transit (TLS 1.3)
- [x] All data encrypted at rest (volume encryption)
- [x] Attestation logs retained for 6 years
- [x] Access controls enforced cryptographically
- [x] Audit trail of all data access
- [x] Business Associate Agreement with cloud provider
- [x] Regular attestation verification (every 5 minutes)
- [x] Incident response plan documented
- [x] Annual security assessment completed

Troubleshooting Common Issues
Issue 1: Attestation Failures After Updates
Symptom: Attestation verification fails after deploying new version
Solution:

    # Update expected measurements in verification system
    NEW_IMAGE_HASH=$(docker inspect myapp:v2 --format='{{.Id}}')
    phala attestation update-policy \
      --app a7f3e9c2d1b4 \
      --expected-image-hash "$NEW_IMAGE_HASH"

    # Re-run verification against the updated policy
    phala verify a7f3e9c2d1b4

Issue 2: Performance Degradation
Symptom: Application slower than expected
Solutions:

    # 1. Upsize instance
    phala resize a7f3e9c2d1b4 --cpu 16 --memory 32GB

    # 2. Use latest CPU generation
    phala migrate a7f3e9c2d1b4 --tee-type intel-tdx-gen2

Issue 3: Connectivity Issues
Symptom: Cannot reach deployed application
Solution:

    # Confirm the application port is published in the compose file.
    services:
      app:
        ports:
          - "8080:8080"

Conclusion
Deploying production confidential VMs requires careful planning across:
- Architecture: Choose patterns that fit your security and availability needs
- Platform: Select based on workload type (Phala Cloud for AI and GPU workloads, GCP or Azure for general-purpose workloads)
- Security: Implement attestation, secrets management, and audit logging
- Operations: Set up monitoring, backups, and incident response
Start small with a single confidential VM, validate it works, then scale to production with high availability and compliance.