Deploying Confidential VMs: Production-Ready Implementation Guide

Reading Time: 18 minutes

TL;DR — Deploying Confidential VMs

Learn how to deploy production-grade Confidential VMs across [Phala Cloud](https://docs.phala.com/phala-cloud/getting-started/overview), Google Cloud, and Azure with best practices for security, performance, and compliance.

This guide covers:

- Production-ready architectures for confidential workloads
- Multi-platform deployment using Phala Cloud, GCP, and Azure
- Security hardening and automated [attestation](https://docs.phala.com/phala-cloud/attestation/overview) verification
- High availability and disaster recovery planning
- Monitoring, logging, and compliance alignment
- Cost optimization for large-scale confidential infrastructure

Best For: DevOps engineers, security architects, and platform teams deploying [Confidential Computing (TEE)](https://phala.com/gpu-tee) workloads in production.

Architecture Patterns for Confidential VMs

Pattern 1: Single-Tier Confidential Application

Use Case: Standalone confidential service (API, microservice, AI model)

Deployment Steps (Phala Cloud):

1. Upload your docker-compose.yaml to the Phala Cloud dashboard
2. Add encrypted environment variables
3. Select Intel TDX (general workloads) or GPU TEE (AI workloads)
4. Enable auto-restart and health monitoring

Pattern 2: Multi-Tier with Confidential Database

Use Case: Full-stack application with sensitive data persistence

Key Benefits:

- Queries and stored data are never exposed in plaintext outside the TEE
- Application logic is protected from the cloud provider
- Confidentiality holds end to end, from client to database

Pattern 3: High-Availability Multi-Region

Use Case: Production systems requiring 99.9%+ uptime

Implementation Considerations:

- [Attestation](https://phala.com/posts/remote-attestation-deep-dive): Each instance must be independently verified
- State Synchronization: Use encrypted replication between TEE instances
- Failover: Automated health checks with attestation verification
- Data Residency: Ensure TEE instances comply with regional regulations
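
The failover, attestation, and data residency considerations above can be combined into a single eligibility check. The following is a minimal sketch, not a Phala Cloud API: the `Instance` type and the `is_healthy` / `is_attested` callbacks are hypothetical names introduced for illustration, and a real deployment would back them with its own health probe and attestation verifier.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Set


@dataclass(frozen=True)
class Instance:
    # Hypothetical record describing one TEE instance
    name: str
    region: str


def eligible_instances(
    instances: Iterable[Instance],
    is_healthy: Callable[[Instance], bool],
    is_attested: Callable[[Instance], bool],
    allowed_regions: Set[str],
) -> List[Instance]:
    """Return the instances that may receive traffic, in input order."""
    eligible = []
    for instance in instances:
        if instance.region not in allowed_regions:
            continue  # data residency: region not permitted for this workload
        if not is_healthy(instance):
            continue  # failed the ordinary health check
        if not is_attested(instance):
            continue  # each instance must pass attestation on its own
        eligible.append(instance)
    return eligible
```

Keeping the health and attestation checks as separate callbacks means a node that is up but no longer attests is removed from rotation just like one that is down.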

Platform-Specific Deployment Guides

Deploying on Phala Cloud

#### Production Deployment Workflow

Step 1: Prepare Infrastructure-as-Code

app_name: production-api
tee_type: intel-tdx
resources:
  cpu: 8
  memory: 16GB
  storage: 100GB
  gpu: null
environment:
  - name: API_KEY
    encrypted: true
    value: "{{ vault.api_key }}"
  - name: DATABASE_URL
    encrypted: true
    value: "{{ vault.db_url }}"
networking:
  ports:
    - 8080:8080
    - 9090:9090
  custom_domain: api.example.com
monitoring:
  health_check_path: /health
  health_check_interval: 30s
  metrics_port: 9090
attestation:
  continuous_verification: true
  verification_interval: 300s
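
A configuration like the one above can be sanity-checked in CI before it is deployed. The sketch below is a hypothetical pre-deploy check and not part of the Phala CLI. It assumes the YAML has already been parsed into a dictionary, that every `environment` entry is a secret, and that port mappings use the `host:container` form.

```python
from typing import Any, Dict, List


def check_config(config: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems: List[str] = []

    # Assumption: every environment entry holds a secret, so each one
    # is expected to carry encrypted: true.
    for var in config.get("environment", []):
        if not var.get("encrypted", False):
            problems.append(f"{var.get('name')} is not marked encrypted")

    # Mappings are assumed to be "host:container"; the metrics port must
    # be one of the container ports or it cannot be scraped.
    mappings = config.get("networking", {}).get("ports", [])
    container_ports = {int(str(m).split(":")[-1]) for m in mappings}
    metrics_port = config.get("monitoring", {}).get("metrics_port")
    if metrics_port is not None and metrics_port not in container_ports:
        problems.append(f"metrics_port {metrics_port} is not exposed")

    return problems
```

Returning a list of problems rather than raising on the first one lets a CI job report every issue in a single run.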

Step 2: Deploy with CLI

# Install Phala CLI
npm install -g @phala/cli
# Authenticate
phala login
# Validate configuration
phala validate phala-config.yaml
# Deploy
phala deploy \
  --config phala-config.yaml \
  --compose docker-compose.yaml \
  --region us-west \
  --environment production

Step 3: Verify Attestation

# Automated attestation check
phala verify a7f3e9c2d1b4

Step 4: Configure Custom Domain

# Add custom domain with TLS
phala domain add \
  --app a7f3e9c2d1b4 \
  --domain api.example.com \
  --tls-auto

Production Best Practices (Phala Cloud)

1. Use Encrypted Secrets Management

services:
  app:
    environment:
      - DATABASE_URL=${DB_URL}
      - API_KEY=${API_KEY}

2. Enable Continuous Attestation

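import time
from datetime import datetime

import requests

def monitor_attestation(app_id, interval_seconds=300):
    """Poll the app's attestation endpoint and report failures."""
    url = f"https://{app_id}.dstack.phala.network/.well-known/attestation"
    while True:
        try:
            # A timeout keeps a hung connection from stalling the monitor
            response = requests.get(url, timeout=10)
            # HTTP 200 only shows that the endpoint responded; a production
            # check should also validate the returned quote and measurements
            if response.status_code == 200:
                print(f"[{datetime.now()}] ✅ Attestation endpoint healthy")
            else:
                print(f"[{datetime.now()}] ❌ ALERT: Attestation failed!")
        except requests.RequestException as e:
            print(f"[{datetime.now()}] ❌ ALERT: Error checking attestation: {e}")
        time.sleep(interval_seconds)

monitor_attestation("a7f3e9c2d1b4")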

3. Implement Health Checks

services:
  api:
    image: myapp:latest
    healthcheck:
      test: ["CMD-SHELL", "curl -f http://localhost:8080/health || exit 1"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 60s
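
The health check above expects the container to answer `GET /health` on port 8080 with a non-error status, because `curl -f` fails on HTTP 400 and above. As an illustration only, here is a minimal standard-library sketch of such an endpoint. The handler and the `make_health_server` helper are hypothetical names; a real service would normally add this route to its existing web framework and report on its actual dependencies.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/health":
            body = json.dumps({"status": "ok"}).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            # Any other path is a 404, which curl -f treats as a failure
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # keep probe traffic out of the application log


def make_health_server(port: int = 8080) -> ThreadingHTTPServer:
    """Build (but do not start) a server that answers GET /health."""
    return ThreadingHTTPServer(("0.0.0.0", port), HealthHandler)


# To run it: make_health_server(8080).serve_forever()
```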

4. Configure Logging & Metrics

# Stream logs in real-time
phala logs a7f3e9c2d1b4 --follow --timestamps

Deploying on Google Cloud

#### Production Deployment with Terraform

resource "google_compute_instance" "confidential_vm" {
  name         = "prod-confidential-api"
  machine_type = "n2d-standard-8"
  zone         = "us-central1-a"

  confidential_instance_config {
    enable_confidential_compute = true
  }

  boot_disk {
    initialize_params {
      image = "ubuntu-os-cloud/ubuntu-2204-lts"
      size  = 100
    }
  }

  network_interface {
    network = "default"
    access_config {}
  }

  metadata_startup_script = file("${path.module}/startup.sh")

  scheduling {
    on_host_maintenance = "TERMINATE"
    automatic_restart   = true
  }

  tags = ["confidential", "production"]
}

Deploying on Azure

#### Production ARM Template

{
  "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "vmName": {
      "type": "string",
      "defaultValue": "confidential-prod-vm"
    },
    "sshPublicKey": {
      "type": "string"
    }
  },
  "resources": [
    {
      "type": "Microsoft.Compute/virtualMachines",
      "apiVersion": "2023-03-01",
      "name": "[parameters('vmName')]",
      "location": "eastus",
      "properties": {
        "hardwareProfile": {
          "vmSize": "Standard_DC8as_v5"
        },
        "securityProfile": {
          "securityType": "ConfidentialVM",
          "uefiSettings": {
            "secureBootEnabled": true,
            "vTpmEnabled": true
          }
        },
        "osProfile": {
          "computerName": "[parameters('vmName')]",
          "adminUsername": "azureuser",
          "linuxConfiguration": {
            "disablePasswordAuthentication": true,
            "ssh": {
              "publicKeys": [{
                "path": "/home/azureuser/.ssh/authorized_keys",
                "keyData": "[parameters('sshPublicKey')]"
              }]
            }
          }
        },
        "storageProfile": {
          "imageReference": {
            "publisher": "Canonical",
            "offer": "0001-com-ubuntu-confidential-vm-jammy",
            "sku": "22_04-lts-cvm",
            "version": "latest"
          },
          "osDisk": {
            "createOption": "FromImage",
            "managedDisk": {
              "storageAccountType": "Premium_LRS",
              "securityProfile": {
                "securityEncryptionType": "VMGuestStateOnly"
              }
            }
          }
        }
      }
    }
  ]
}

Security Hardening

1. Network Security

Firewall Rules (Phala Cloud)

services:
  app:
    ports:
      - "8080:8080"

Network Isolation (GCP/Azure)

# GCP: allow inbound HTTPS only, to instances tagged "confidential"
# (assumes the confidential-vpc network already exists)
gcloud compute firewall-rules create allow-https \
  --direction=INGRESS \
  --priority=1000 \
  --network=confidential-vpc \
  --action=ALLOW \
  --rules=tcp:443 \
  --source-ranges=0.0.0.0/0 \
  --target-tags=confidential

2. Secure Boot and Image Verification

Phala Cloud: Digest-Pinned Images

services:
  app:
    image: mycompany/api@sha256:abc123def456...
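
Referencing an image by its `sha256` digest, as in the compose snippet above, means the deployed image cannot change silently when a tag is moved. One way to enforce this is a check that rejects tag-only references. The sketch below is a hypothetical helper, assuming the compose file has already been parsed into a dictionary and that a full digest is 64 lowercase hex characters.

```python
import re
from typing import Any, Dict, List

# name@sha256:<64 hex chars>; tag-only references such as myapp:latest fail
DIGEST_REF = re.compile(r"^[^@\s]+@sha256:[0-9a-f]{64}$")


def unpinned_images(compose: Dict[str, Any]) -> List[str]:
    """Return the names of services whose image is not pinned by digest."""
    services = compose.get("services", {})
    return sorted(
        name
        for name, service in services.items()
        if not DIGEST_REF.match(str(service.get("image", "")))
    )
```

A CI step could fail the build whenever this function returns a non-empty list.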

Verify Image in Attestation:

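class SecurityError(Exception):
    """Raised when an attestation check fails."""

def verify_image_integrity(app_id, expected_image_hash):
    # get_attestation_report is not defined in this guide; it stands for
    # whatever helper fetches and validates the attestation report for app_id
    attestation = get_attestation_report(app_id)
    actual_image_hash = attestation["image_hash"]
    if actual_image_hash != expected_image_hash:
        raise SecurityError(
            f"Image mismatch! Expected {expected_image_hash}, "
            f"got {actual_image_hash}"
        )
    print("✅ Image integrity verified")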

3. Secrets Rotation

Automated Secret Rotation (Phala Cloud)

import os

import hvac
from phala_sdk import PhalaCloud

def rotate_secrets(app_id):
    vault = hvac.Client(url=os.getenv("VAULT_URL"))
    new_api_key = vault.secrets.kv.v2.read_secret_version(
        path="api-keys/prod"
    )["data"]["data"]["key"]

    phala = PhalaCloud(api_key=os.getenv("PHALA_API_KEY"))
    phala.update_env_var(
        app_id=app_id,
        key="API_KEY",
        value=new_api_key,
        encrypted=True
    )
    phala.restart_app(app_id)
    print(f"✅ Secrets rotated for {app_id}")

4. Audit Logging

Comprehensive Audit Trail

import json
import logging
import os
from datetime import datetime, timezone

audit_log = logging.getLogger('confidential-audit')
handler = logging.FileHandler('/var/log/confidential-audit.log')
handler.setFormatter(logging.Formatter(
    '%(asctime)s | %(levelname)s | %(message)s'))
audit_log.addHandler(handler)
audit_log.setLevel(logging.INFO)

def log_deployment(app_id, config):
    # Pass only non-secret configuration here; decrypted values must
    # never be written to the audit log
    audit_log.info(json.dumps({
        "event": "deployment",
        "app_id": app_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "config": config,
        "user": os.getenv("USER")
    }))

High Availability and Disaster Recovery

Multi-Instance Deployment (Phala Cloud)

instances:
  - name: prod-api-1
    region: us-west
    tee_type: intel-tdx
    config: base-config.yaml
  - name: prod-api-2
    region: us-east
    tee_type: intel-tdx
    config: base-config.yaml
  - name: prod-api-3
    region: eu-central
    tee_type: amd-sev-snp
    config: base-config.yaml
load_balancer:
  type: round-robin
  health_check:
    path: /health
    interval: 30s
    timeout: 5s
  attestation_required: true

Backup Strategy

Encrypted Backups

# Backup encrypted volumes (Phala Cloud)
phala backup create \
  --app a7f3e9c2d1b4 \
  --volume pgdata \
  --encryption aes-256 \
  --retention 30d
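
To illustrate what a 30-day retention window means in practice, the sketch below lists the backups that fall outside it. It is not part of the Phala CLI. It assumes backup names follow the `backup-YYYY-MM-DD-HH-MM` pattern used in this guide and that the timestamp in the name is the creation time in UTC.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, List


def expired_backups(
    names: Iterable[str], now: datetime, retention_days: int = 30
) -> List[str]:
    """Return backup names older than the retention window, in input order."""
    cutoff = now - timedelta(days=retention_days)
    expired = []
    for name in names:
        # Assumption: the name encodes the UTC creation time
        created = datetime.strptime(name, "backup-%Y-%m-%d-%H-%M").replace(
            tzinfo=timezone.utc
        )
        if created < cutoff:
            expired.append(name)
    return expired
```

Passing `now` explicitly keeps the function deterministic, which makes retention rules easy to unit-test.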

Disaster Recovery Testing

# Test restore procedure
phala restore \
  --app a7f3e9c2d1b4-dr \
  --backup backup-2025-11-03-02-00 \
  --verify-attestation

Monitoring and Observability

Metrics Collection

Prometheus Integration

services:
  app:
    image: myapp:latest
    ports:
      - "8080:8080"
      - "9090:9090"
    environment:
      - PROMETHEUS_ENABLED=true
  prometheus:
    image: prom/prometheus:latest
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    ports:
      - "9091:9090"

Logging Pipeline

Centralized Logging (Phala Cloud)

services:
  app:
    image: myapp:latest
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"
  fluentd:
    image: fluent/fluentd:latest
    volumes:
      - ./fluent.conf:/fluentd/etc/fluent.conf
    ports:
      - "24224:24224"

Cost Optimization

Right-Sizing Instances

Performance vs. Cost Analysis

| Workload Type | Recommended Instance | Phala Cloud | GCP | Azure |
|---|---|---|---|---|
| API (low traffic) | 2 vCPU, 4GB RAM | $0.15/hr | $0.18/hr | $0.20/hr |
| API (medium traffic) | 4 vCPU, 8GB RAM | $0.30/hr | $0.35/hr | $0.40/hr |
| Database | 8 vCPU, 16GB RAM | $0.60/hr | $0.70/hr | $0.80/hr |
| AI Inference (GPU) | 8 vCPU, 64GB, 1x H100 | $3.50/hr | $4.00/hr | N/A |

Auto-Scaling Strategy (Phala Cloud)

scaling:
  min_instances: 2
  max_instances: 10
  metrics:
    - type: cpu
      target: 70
    - type: memory
      target: 80
    - type: requests_per_second
      target: 1000
  scale_up:
    cooldown: 60s
    step: 2
  scale_down:
    cooldown: 300s
    step: 1
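
The scaling policy above can be read as a simple decision rule: add instances when any metric exceeds its target, remove one when the service is clearly idle, and always stay within the configured bounds. The sketch below is one possible interpretation, not the platform's actual algorithm. The `idle_fraction` threshold for scaling down is an assumption that does not appear in the configuration, and cooldown timers are left out for brevity.

```python
from typing import Dict

# Targets taken from the scaling configuration above
TARGETS = {"cpu": 70, "memory": 80, "requests_per_second": 1000}


def desired_instances(
    current: int,
    usage: Dict[str, float],
    targets: Dict[str, float],
    min_instances: int = 2,
    max_instances: int = 10,
    up_step: int = 2,
    down_step: int = 1,
    idle_fraction: float = 0.5,  # assumption: not part of the config above
) -> int:
    """Return the instance count the policy would move to next."""
    if any(usage[name] > target for name, target in targets.items()):
        desired = current + up_step  # any metric over target: scale up
    elif all(usage[name] < idle_fraction * target for name, target in targets.items()):
        desired = current - down_step  # everything well under target: scale down
    else:
        desired = current
    # Clamp to the configured bounds
    return max(min_instances, min(max_instances, desired))
```

Scaling up in steps of two but down in steps of one, together with the longer scale-down cooldown, biases the policy toward keeping capacity during bursty traffic.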

Reserved Instances / Committed Use

Phala Cloud Committed Pricing

# 1-year commitment for 30% discount
phala commitment create \
  --duration 12-months \
  --resources "8vcpu,16gb-ram,intel-tdx" \
  --quantity 5
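
To see what a commitment is worth, it helps to convert hourly rates into monthly figures. The sketch below is a worked calculation, assuming an average of 730 hours per month (8,760 hours per year divided by 12) and a flat percentage discount. Actual billing terms may differ.

```python
HOURS_PER_MONTH = 730  # average: 8760 hours per year / 12


def monthly_cost(hourly_rate: float, instances: int = 1, discount: float = 0.0) -> float:
    """Estimated monthly cost in dollars, rounded to cents."""
    return round(hourly_rate * HOURS_PER_MONTH * instances * (1 - discount), 2)


# Five 8 vCPU / 16GB instances at the $0.60/hr rate from the table
on_demand = monthly_cost(0.60, instances=5)
committed = monthly_cost(0.60, instances=5, discount=0.30)
print(f"on-demand: ${on_demand:.2f}, committed: ${committed:.2f}")
```

At these rates, five instances cost about $2,190 per month on demand and about $1,533 with the 30% discount, a saving of roughly $657 per month.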

Compliance and Certifications

HIPAA Compliance Example

Deployment Checklist:

- [x] All data encrypted in use (TEE)
- [x] All data encrypted in transit (TLS 1.3)
- [x] All data encrypted at rest (volume encryption)
- [x] Attestation logs retained for 6 years
- [x] Access controls enforced cryptographically
- [x] Audit trail of all data access
- [x] Business Associate Agreement with cloud provider
- [x] Regular attestation verification (every 5 minutes)
- [x] Incident response plan documented
- [x] Annual security assessment completed

Troubleshooting Common Issues

Issue 1: Attestation Failures After Updates

Symptom: Attestation verification fails after deploying new version

Solution:

# Update expected measurements in verification system
NEW_IMAGE_HASH=$(docker inspect myapp:v2 --format='{{.Id}}')
phala attestation update-policy \
  --app a7f3e9c2d1b4 \
  --expected-image-hash "$NEW_IMAGE_HASH"
phala verify a7f3e9c2d1b4

Issue 2: Performance Degradation

Symptom: Application slower than expected

Solutions:

# 1. Upsize instance
phala resize a7f3e9c2d1b4 --cpu 16 --memory 32GB
# 2. Use latest CPU generation
phala migrate a7f3e9c2d1b4 --tee-type intel-tdx-gen2

Issue 3: Connectivity Issues

Symptom: Cannot reach deployed application

Solution:

services:
  app:
    ports:
      - "8080:8080"

Conclusion

Deploying production confidential VMs requires careful planning across:

- Architecture: Choose patterns that fit your security and availability needs
- Platform: Select based on workload type (Phala for AI, GCP/Azure for general)
- Security: Implement attestation, secrets management, and audit logging
- Operations: Set up monitoring, backups, and incident response

Start small with a single confidential VM, validate it works, then scale to production with high availability and compliance.
