Deploying Confidential VMs Guide

Deploying Confidential VMs: Production-Ready Implementation Guide

Reading Time: 18 minutes

TL;DR — Deploying Confidential VMs

Learn how to deploy production-grade Confidential VMs across [Phala Cloud](https://docs.phala.com/phala-cloud/getting-started/overview), Google Cloud, and Azure with best practices for security, performance, and compliance.

This guide covers:

  • Production-ready architectures for confidential workloads
  • Multi-platform deployment using Phala Cloud, GCP, and Azure
  • Security hardening and automated [attestation](https://docs.phala.com/phala-cloud/attestation/overview) verification
  • High availability and disaster recovery planning
  • Monitoring, logging, and compliance alignment
  • Cost optimization for large-scale confidential infrastructure

Best For: DevOps engineers, security architects, and platform teams deploying [Confidential Computing (TEE)](https://phala.com/gpu-tee) workloads in production.

Architecture Patterns for Confidential VMs

Pattern 1: Single-Tier Confidential Application

Use Case: Standalone confidential service (API, microservice, AI model)

Deployment Steps (Phala Cloud):

  • Upload your docker-compose.yaml to the dashboard
  • Add encrypted environment variables
  • Select Intel TDX (general) or GPU TEE (AI workloads)
  • Enable auto-restart and health monitoring

Pattern 2: Multi-Tier with Confidential Database

Use Case: Full-stack application with sensitive data persistence

Key Benefits:

  • Queries and stored data are never exposed in plaintext outside the TEE
  • Application logic protected from cloud provider
  • End-to-end confidentiality from client to database

Pattern 3: High-Availability Multi-Region

Use Case: Production systems requiring 99.9%+ uptime

Implementation Considerations:

  • [Attestation](https://phala.com/posts/remote-attestation-deep-dive): Each instance must be independently verified
  • State Synchronization: Use encrypted replication between TEE instances
  • Failover: Automated health checks with attestation verification
  • Data Residency: Ensure TEE instances comply with regional regulations
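
The considerations above can be combined into a single eligibility check before an instance receives traffic or becomes a failover target. The following is a minimal sketch, not a platform API: the `InstanceStatus` record and the `eligible_instances` helper are hypothetical names, and the health and attestation results are assumed to come from your own checks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceStatus:
    """Hypothetical status record for one TEE instance."""
    name: str
    region: str
    healthy: bool   # result of the latest health check
    attested: bool  # result of the latest attestation verification


def eligible_instances(statuses, allowed_regions):
    """Return the instances that may receive traffic.

    An instance qualifies only if it is healthy, its own attestation
    passed, and it runs in a region permitted by data-residency rules.
    """
    return [
        s for s in statuses
        if s.healthy and s.attested and s.region in allowed_regions
    ]


statuses = [
    InstanceStatus("prod-api-1", "us-west", healthy=True, attested=True),
    InstanceStatus("prod-api-2", "us-east", healthy=True, attested=False),
    InstanceStatus("prod-api-3", "eu-central", healthy=True, attested=True),
]
targets = eligible_instances(statuses, allowed_regions={"us-west", "us-east"})
print([s.name for s in targets])  # ['prod-api-1']
```

Here `prod-api-2` is excluded because its attestation failed even though it is healthy, and `prod-api-3` is excluded because its region is outside the allowed set.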

Platform-Specific Deployment Guides

Deploying on Phala Cloud

Production Deployment Workflow

Step 1: Prepare Infrastructure-as-Code

app_name: production-api
tee_type: intel-tdx
resources:
  cpu: 8
  memory: 16GB
  storage: 100GB
  gpu: null
environment:
  - name: API_KEY
    encrypted: true
    value: "{{ vault.api_key }}"
  - name: DATABASE_URL
    encrypted: true
    value: "{{ vault.db_url }}"
networking:
  ports:
    - 8080:8080
    - 9090:9090
  custom_domain: api.example.com
monitoring:
  health_check_path: /health
  health_check_interval: 30s
  metrics_port: 9090
attestation:
  continuous_verification: true
  verification_interval: 300s
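
Before handing a configuration like the one above to a deployment tool, it can help to lint it locally. The sketch below assumes the YAML has already been loaded into a Python dict with the same field names; `parse_duration` and `lint_config` are hypothetical helpers, not part of any Phala tooling.

```python
import re

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def parse_duration(text):
    """Convert a string such as '30s' or '5m' into seconds."""
    match = re.fullmatch(r"(\d+)([smh])", str(text).strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def lint_config(config):
    """Return a list of human-readable problems found in the config."""
    problems = []
    # Every environment variable should be flagged as encrypted.
    for var in config.get("environment", []):
        if not var.get("encrypted", False):
            problems.append(f"{var.get('name', '<unnamed>')} is not marked encrypted")
    # Interval fields must carry a unit suffix such as 's'.
    for section, key in (("monitoring", "health_check_interval"),
                         ("attestation", "verification_interval")):
        value = config.get(section, {}).get(key)
        try:
            parse_duration(value)
        except ValueError:
            problems.append(f"{section}.{key} has an invalid duration: {value!r}")
    return problems


config = {
    "environment": [
        {"name": "API_KEY", "encrypted": True},
        {"name": "DEBUG_TOKEN", "encrypted": False},
    ],
    "monitoring": {"health_check_interval": "30s"},
    "attestation": {"verification_interval": "300"},  # missing unit
}
for problem in lint_config(config):
    print(problem)
```

The example config is deliberately broken in two places, so the linter reports one unencrypted variable and one interval without a unit.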

Step 2: Deploy with CLI

# Install Phala CLI
npm install -g @phala/cli
# Authenticate
phala login
# Validate configuration
phala validate phala-config.yaml
# Deploy
phala deploy \
  --config phala-config.yaml \
  --compose docker-compose.yaml \
  --region us-west \
  --environment production

Step 3: Verify Attestation

# Automated attestation check
phala verify a7f3e9c2d1b4

Step 4: Configure Custom Domain

# Add custom domain with TLS
phala domain add \
  --app a7f3e9c2d1b4 \
  --domain api.example.com \
  --tls-auto

Production Best Practices (Phala Cloud)

1. Use Encrypted Secrets Management

services:
  app:
    environment:
      - DATABASE_URL=${DB_URL}
      - API_KEY=${API_KEY}

2. Enable Continuous Attestation

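import time
from datetime import datetime

import requests

def monitor_attestation(app_id, interval_seconds=300):
    """Poll the app's attestation endpoint and report failures."""
    url = f"https://{app_id}.dstack.phala.network/.well-known/attestation"
    while True:
        try:
            # A timeout keeps a hung connection from stalling the loop.
            response = requests.get(url, timeout=10)
            # A 200 only shows that the endpoint answered; validate the
            # returned report against expected measurements before trusting it.
            if response.status_code == 200:
                print(f"[{datetime.now()}] ✅ Attestation valid")
            else:
                print(f"[{datetime.now()}] ❌ ALERT: Attestation failed!")
        except requests.RequestException as e:
            print(f"Error checking attestation: {e}")
        time.sleep(interval_seconds)

monitor_attestation("a7f3e9c2d1b4")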

3. Implement Health Checks

services:
  api:
    image: myapp:latest
    healthcheck:
      test: ["CMD-SHELL", "curl -f http://localhost:8080/health || exit 1"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 60s
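
The health check above expects the container to answer `GET /health` on port 8080. A minimal sketch of such an endpoint using only the Python standard library follows; the JSON body and the `health_response` and `make_server` helpers are assumptions for illustration, and a real service would usually also check its dependencies before reporting healthy.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def health_response(path):
    """Return (status_code, body_bytes) for a request path."""
    if path == "/health":
        return 200, json.dumps({"status": "ok"}).encode("utf-8")
    return 404, json.dumps({"error": "not found"}).encode("utf-8")


class HealthHandler(BaseHTTPRequestHandler):
    """Serves the health endpoint probed by the compose healthcheck."""

    def do_GET(self):
        status, body = health_response(self.path)
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Keep probe traffic out of the application log.
        pass


def make_server(port=8080):
    """Build the server without starting it."""
    return HTTPServer(("0.0.0.0", port), HealthHandler)


# To run it inside the container:
# make_server().serve_forever()
```

Keeping the response logic in a plain function makes it easy to unit test without opening a socket.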

4. Configure Logging & Metrics

# Stream logs in real-time
phala logs a7f3e9c2d1b4 --follow --timestamps

Deploying on Google Cloud

Production Deployment with Terraform

resource "google_compute_instance" "confidential_vm" {
  name         = "prod-confidential-api"
  machine_type = "n2d-standard-8"
  zone         = "us-central1-a"

  confidential_instance_config {
    enable_confidential_compute = true
  }

  boot_disk {
    initialize_params {
      image = "ubuntu-os-cloud/ubuntu-2204-lts"
      size  = 100
    }
  }

  network_interface {
    network = "default"
    access_config {}
  }

  metadata_startup_script = file("${path.module}/startup.sh")

  scheduling {
    on_host_maintenance = "TERMINATE"
    automatic_restart   = true
  }

  tags = ["confidential", "production"]
}

Deploying on Azure

Production ARM Template

{
  "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "vmName": {
      "type": "string",
      "defaultValue": "confidential-prod-vm"
    },
    "sshPublicKey": {
      "type": "string"
    }
  },
  "resources": [
    {
      "type": "Microsoft.Compute/virtualMachines",
      "apiVersion": "2023-03-01",
      "name": "[parameters('vmName')]",
      "location": "eastus",
      "properties": {
        "hardwareProfile": {
          "vmSize": "Standard_DC8as_v5"
        },
        "securityProfile": {
          "securityType": "ConfidentialVM",
          "uefiSettings": {
            "secureBootEnabled": true,
            "vTpmEnabled": true
          }
        },
        "osProfile": {
          "computerName": "[parameters('vmName')]",
          "adminUsername": "azureuser",
          "linuxConfiguration": {
            "disablePasswordAuthentication": true,
            "ssh": {
              "publicKeys": [{
                "path": "/home/azureuser/.ssh/authorized_keys",
                "keyData": "[parameters('sshPublicKey')]"
              }]
            }
          }
        },
        "storageProfile": {
          "imageReference": {
            "publisher": "Canonical",
            "offer": "0001-com-ubuntu-confidential-vm-jammy",
            "sku": "22_04-lts-cvm",
            "version": "latest"
          },
          "osDisk": {
            "createOption": "FromImage",
            "managedDisk": {
              "storageAccountType": "Premium_LRS",
              "securityProfile": {
                "securityEncryptionType": "VMGuestStateOnly"
              }
            }
          }
        }
      }
    }
  ]
}

Security Hardening

1. Network Security

Firewall Rules (Phala Cloud)

services:
  app:
    ports:
      - "8080:8080"

Network Isolation (GCP/Azure)

# GCP: Create VPC with strict firewall
gcloud compute firewall-rules create allow-https \
  --direction=INGRESS \
  --priority=1000 \
  --network=confidential-vpc \
  --action=ALLOW \
  --rules=tcp:443 \
  --source-ranges=0.0.0.0/0 \
  --target-tags=confidential

2. Secure Boot and Image Verification

Phala Cloud: Signed Images

services:
  app:
    image: mycompany/api@sha256:abc123def456...
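
Pinning by digest only helps if every service does it, since a mutable tag such as `latest` can change underneath a deployment. The sketch below flags services whose image is not pinned to a `sha256` digest. It assumes the compose file's `services` mapping has already been loaded into a dict; `unpinned_images` is a hypothetical helper and the digest in the example is a placeholder.

```python
import re

# A digest reference ends with "@sha256:" followed by 64 hex characters.
_DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")


def unpinned_images(services):
    """Return {service_name: image} for images not pinned to a digest."""
    return {
        name: spec["image"]
        for name, spec in services.items()
        if "image" in spec and not _DIGEST_RE.search(spec["image"])
    }


services = {
    "app": {"image": "mycompany/api@sha256:" + "ab" * 32},  # placeholder digest
    "prometheus": {"image": "prom/prometheus:latest"},
}
print(unpinned_images(services))  # {'prometheus': 'prom/prometheus:latest'}
```

A check like this could run in CI so that a tag-based image reference fails the build before it reaches a TEE.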

Verify Image in Attestation:

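class SecurityError(Exception):
    """Raised when an attestation report does not match expectations."""

def verify_image_integrity(app_id, expected_image_hash):
    # get_attestation_report is assumed to be defined elsewhere and to
    # return the parsed attestation report as a dict.
    attestation = get_attestation_report(app_id)
    if attestation["image_hash"] != expected_image_hash:
        raise SecurityError(
            f"Image mismatch! Expected {expected_image_hash}, "
            f"got {attestation['image_hash']}"
        )
    print("✅ Image integrity verified")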

3. Secrets Rotation

Automated Secret Rotation (Phala Cloud)

import os

import hvac
from phala_sdk import PhalaCloud

def rotate_secrets(app_id):
    vault = hvac.Client(url=os.getenv("VAULT_URL"))
    new_api_key = vault.secrets.kv.v2.read_secret_version(
        path="api-keys/prod"
    )["data"]["data"]["key"]

    phala = PhalaCloud(api_key=os.getenv("PHALA_API_KEY"))
    phala.update_env_var(
        app_id=app_id,
        key="API_KEY",
        value=new_api_key,
        encrypted=True
    )
    phala.restart_app(app_id)
    print(f"✅ Secrets rotated for {app_id}")

4. Audit Logging

Comprehensive Audit Trail

import json
import logging
import os
from datetime import datetime, timezone

audit_log = logging.getLogger('confidential-audit')
handler = logging.FileHandler('/var/log/confidential-audit.log')
handler.setFormatter(logging.Formatter(
    '%(asctime)s | %(levelname)s | %(message)s'))
audit_log.addHandler(handler)
audit_log.setLevel(logging.INFO)

def log_deployment(app_id, config):
    """Append one JSON-encoded deployment event to the audit log."""
    audit_log.info(json.dumps({
        "event": "deployment",
        "app_id": app_id,
        # Timezone-aware UTC timestamp; datetime.utcnow() is deprecated.
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "config": config,
        "user": os.getenv("USER")
    }))

High Availability and Disaster Recovery

Multi-Instance Deployment (Phala Cloud)

instances:
  - name: prod-api-1
    region: us-west
    tee_type: intel-tdx
    config: base-config.yaml
  - name: prod-api-2
    region: us-east
    tee_type: intel-tdx
    config: base-config.yaml
  - name: prod-api-3
    region: eu-central
    tee_type: amd-sev-snp
    config: base-config.yaml
load_balancer:
  type: round-robin
  health_check:
    path: /health
    interval: 30s
    timeout: 5s
  attestation_required: true

Backup Strategy

Encrypted Backups

# Backup encrypted volumes (Phala Cloud)
phala backup create \
  --app a7f3e9c2d1b4 \
  --volume pgdata \
  --encryption aes-256 \
  --retention 30d

Disaster Recovery Testing

# Test restore procedure
phala restore \
  --app a7f3e9c2d1b4-dr \
  --backup backup-2025-11-03-02-00 \
  --verify-attestation
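
A retention window such as the 30 days above is easy to reason about with a small local check. The sketch below lists which backups fall outside the window. It assumes backup names follow the `backup-YYYY-MM-DD-HH-MM` pattern seen in the restore command; `expired_backups` is a hypothetical helper and does not reflect how the CLI itself enforces retention.

```python
from datetime import datetime, timedelta

NAME_FORMAT = "backup-%Y-%m-%d-%H-%M"


def expired_backups(names, now, retention_days=30):
    """Return backup names older than the retention window, oldest first."""
    cutoff = now - timedelta(days=retention_days)
    dated = [(datetime.strptime(name, NAME_FORMAT), name) for name in names]
    return [name for created, name in sorted(dated) if created < cutoff]


names = [
    "backup-2025-11-03-02-00",
    "backup-2025-10-01-02-00",
    "backup-2025-09-15-02-00",
]
now = datetime(2025, 11, 4, 2, 0)
print(expired_backups(names, now))
# ['backup-2025-09-15-02-00', 'backup-2025-10-01-02-00']
```

The same list can drive a disaster recovery drill: the newest backup that is not expired is the one a restore test should target.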

Monitoring and Observability

Metrics Collection

Prometheus Integration

services:
  app:
    image: myapp:latest
    ports:
      - "8080:8080"
      - "9090:9090"
    environment:
      - PROMETHEUS_ENABLED=true
  prometheus:
    image: prom/prometheus:latest
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    ports:
      - "9091:9090"

Logging Pipeline

Centralized Logging (Phala Cloud)

services:
  app:
    image: myapp:latest
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"
  fluentd:
    image: fluent/fluentd:latest
    volumes:
      - ./fluent.conf:/fluentd/etc/fluent.conf
    ports:
      - "24224:24224"

Cost Optimization

Right-Sizing Instances

Performance vs. Cost Analysis

| Workload Type | Recommended Instance | Phala Cloud | GCP | Azure |
| --- | --- | --- | --- | --- |
| API (low traffic) | 2 vCPU, 4GB RAM | $0.15/hr | $0.18/hr | $0.20/hr |
| API (medium traffic) | 4 vCPU, 8GB RAM | $0.30/hr | $0.35/hr | $0.40/hr |
| Database | 8 vCPU, 16GB RAM | $0.60/hr | $0.70/hr | $0.80/hr |
| AI Inference (GPU) | 8 vCPU, 64GB, 1x H100 | $3.50/hr | $4.00/hr | N/A |
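
Hourly rates are easier to compare as monthly figures. The sketch below converts the Database row into an approximate monthly cost per platform, assuming an always-on instance and 730 hours per month (8,760 hours per year divided by 12). The rates are copied from the table above, and `monthly_cost` is a hypothetical helper.

```python
HOURS_PER_MONTH = 730  # 8,760 hours per year / 12

# Hourly rates (USD) for the Database row: 8 vCPU, 16GB RAM.
DATABASE_RATES = {"Phala Cloud": 0.60, "GCP": 0.70, "Azure": 0.80}


def monthly_cost(hourly_rate, instances=1):
    """Approximate monthly cost in USD for always-on instances."""
    return round(hourly_rate * HOURS_PER_MONTH * instances, 2)


for platform, rate in DATABASE_RATES.items():
    print(f"{platform}: ${monthly_cost(rate):,.2f}/month")
```

Passing `instances=3` gives the figure for a three-node deployment, which is useful when sizing the high-availability layouts described earlier.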

Auto-Scaling Strategy (Phala Cloud)

scaling:
  min_instances: 2
  max_instances: 10
  metrics:
    - type: cpu
      target: 70
    - type: memory
      target: 80
    - type: requests_per_second
      target: 1000
  scale_up:
    cooldown: 60s
    step: 2
  scale_down:
    cooldown: 300s
    step: 1
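
To make the policy above concrete, the following sketch computes the instance count after one scaling decision. The semantics are assumptions, not documented platform behavior: scale up when any metric exceeds its target, scale down only when every metric is below a fraction of its target (the hypothetical `scale_down_ratio`), and clamp to the configured bounds. Cooldowns are left out for brevity.

```python
def desired_instances(current, usage, policy, scale_down_ratio=0.5):
    """Return the instance count after one scaling decision."""
    targets = {m["type"]: m["target"] for m in policy["metrics"]}
    over = any(usage[name] > target for name, target in targets.items())
    idle = all(usage[name] < target * scale_down_ratio
               for name, target in targets.items())
    if over:
        current += policy["scale_up"]["step"]
    elif idle:
        current -= policy["scale_down"]["step"]
    # Never leave the configured range.
    return max(policy["min_instances"], min(policy["max_instances"], current))


policy = {
    "min_instances": 2,
    "max_instances": 10,
    "metrics": [
        {"type": "cpu", "target": 70},
        {"type": "memory", "target": 80},
        {"type": "requests_per_second", "target": 1000},
    ],
    "scale_up": {"step": 2},
    "scale_down": {"step": 1},
}
busy = {"cpu": 85, "memory": 60, "requests_per_second": 500}
quiet = {"cpu": 20, "memory": 30, "requests_per_second": 100}
print(desired_instances(4, busy, policy))   # 6
print(desired_instances(3, quiet, policy))  # 2
```

Scaling up in larger steps than scaling down, as the policy does, favors availability over cost when load changes quickly.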

Reserved Instances / Committed Use

Phala Cloud Committed Pricing

# 1-year commitment for 30% discount
phala commitment create \
  --duration 12-months \
  --resources "8vcpu,16gb-ram,intel-tdx" \
  --quantity 5
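
Whether a commitment pays off depends on how the discount compares with expected utilization. The sketch below estimates the savings for the commitment above, assuming the 30% discount applies to the $0.60/hr on-demand rate listed for an 8 vCPU, 16GB instance and that all five instances run continuously. `committed_savings` is a hypothetical helper.

```python
HOURS_PER_YEAR = 8760


def committed_savings(hourly_rate, quantity, discount=0.30, months=12):
    """Return (on_demand, committed, savings) in USD over the term."""
    on_demand = hourly_rate * quantity * HOURS_PER_YEAR * months / 12
    committed = on_demand * (1 - discount)
    return (round(on_demand, 2),
            round(committed, 2),
            round(on_demand - committed, 2))


on_demand, committed, savings = committed_savings(0.60, quantity=5)
print(f"On-demand: ${on_demand:,.2f}")
print(f"Committed: ${committed:,.2f}")
print(f"Savings:   ${savings:,.2f}")
```

If the instances are expected to run only part of the time, compare the committed figure against on-demand cost at the expected utilization instead of at 100%.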

Compliance and Certifications

HIPAA Compliance Example

Deployment Checklist:

- [x] All data encrypted in use (TEE)
- [x] All data encrypted in transit (TLS 1.3)
- [x] All data encrypted at rest (volume encryption)
- [x] Attestation logs retained for 6 years
- [x] Access controls enforced cryptographically
- [x] Audit trail of all data access
- [x] Business Associate Agreement with cloud provider
- [x] Regular attestation verification (every 5 minutes)
- [x] Incident response plan documented
- [x] Annual security assessment completed

Troubleshooting Common Issues

Issue 1: Attestation Failures After Updates

Symptom: Attestation verification fails after deploying new version

Solution:

# Update expected measurements in verification system
NEW_IMAGE_HASH=$(docker inspect myapp:v2 --format='{{.Id}}')
phala attestation update-policy \
  --app a7f3e9c2d1b4 \
  --expected-image-hash "$NEW_IMAGE_HASH"
phala verify a7f3e9c2d1b4

Issue 2: Performance Degradation

Symptom: Application slower than expected

Solutions:

# 1. Upsize instance
phala resize a7f3e9c2d1b4 --cpu 16 --memory 32GB
# 2. Use latest CPU generation
phala migrate a7f3e9c2d1b4 --tee-type intel-tdx-gen2

Issue 3: Connectivity Issues

Symptom: Cannot reach deployed application

Solution: Confirm that the application port is published in the compose file:

services:
  app:
    ports:
      - "8080:8080"

Conclusion

Deploying production confidential VMs requires careful planning across:

  • Architecture: Choose patterns that fit your security and availability needs
  • Platform: Select based on workload type (Phala for AI, GCP/Azure for general)
  • Security: Implement attestation, secrets management, and audit logging
  • Operations: Set up monitoring, backups, and incident response

Start small with a single confidential VM, validate it works, then scale to production with high availability and compliance.
