Phala Private AI Cloud Guide

Phala Confidential AI Cloud: Complete Platform Guide

TL;DR: Phala Confidential AI Cloud is a TEE-based cloud purpose-built for secure AI workloads, with production-ready GPU TEE. Deploy LLMs, machine learning models, and other sensitive workloads with hardware-enforced privacy on NVIDIA H100/H200 TEE, Intel TDX, and AMD SEV-SNP. The Dstack SDK simplifies deployment (“Docker for TEE”), the Trust Center provides public attestation verification, and the production infrastructure delivers 95-99% of native performance. It is the only cloud platform whose privacy is publicly and cryptographically verifiable through attestation rather than backed by promises alone.

Platform Overview

What is Phala Confidential Cloud?

Phala Confidential AI Cloud is a TEE-based private AI cloud for secure AI workloads (LLMs, training, analytics) with public attestation.

Phala Cloud differentiators:

| Feature | Traditional Cloud (AWS, Azure, GCP) | Phala Confidential Cloud |
| --- | --- | --- |
| Privacy | Trust-based (cloud provider access) | Hardware-enforced by default (zero-trust architecture) |
| Confidential Computing | Optional add-on (limited support) | Platform foundation (all workloads) |
| GPU TEE | Roadmap or pilot (not production-ready) | Production-ready (H100/H200, first to market) |
| Attestation | Internal only (customers can't verify) | Public via Trust Center (customer verification) |
| Developer Experience | Complex TEE setup | Dstack SDK (Docker-simple deployment) |
| Use Case | General purpose (confidentiality as afterthought) | Confidential AI specialist (privacy-first design) |

Core Platform Components

*Architecture of Phala’s TEE-based confidential AI cloud: NVIDIA H100/H200 GPU TEE, Intel TDX/AMD SEV-SNP; Trust Center for public attestation.*

Layer Breakdown:

  1. Hardware Infrastructure (Layer 1)
    • GPU TEE: NVIDIA H100 (80GB HBM3), H200 (141GB HBM3e)
    • CPU TEE: Intel TDX (4th/5th Gen Xeon), AMD SEV-SNP (EPYC Genoa)
    • Networking: TEE-aware encrypted transit
    • Storage: Encrypted with TEE integration
  2. TEE Runtime (Layer 2)
    • GPU Runtime: NVIDIA Confidential Computing SDK
    • CPU Runtime: TDX/SEV-SNP virtual machines
    • Attestation: Remote attestation generation
    • Key Management: TEE-integrated KMS
    • Memory Encryption: AES-256 hardware encryption
  3. Orchestration (Layer 3)
    • Kubernetes with TEE awareness
    • TEE-aware workload placement
    • Autoscaling for confidential workloads
    • Service mesh with attestation
  4. Developer Platform (Layer 4)
    • Dstack SDK: Container → TEE deployment automation
    • REST API: Programmatic access
    • CLI: Command-line interface
    • Web Console: Visual deployment management
    • CI/CD Integration: GitHub Actions, GitLab CI/CD
  5. Trust Infrastructure (Layer 5)
    • Trust Center: Public attestation verification
    • Attestation Registry: Historical logs
    • Compliance Reporting: HIPAA, SOC 2, PCI-DSS
    • Audit Logging: Comprehensive trails
  6. Application Services (Layer 6)
    • LLM Inference: Optimized confidential LLM serving
    • Model Training: Confidential training pipelines
    • Data Analytics: Privacy-preserving analytics
    • API Gateway: Managed endpoints with attestation

Getting Started with Phala Cloud

Account Setup

Getting from signup to a first TEE deployment on Phala’s confidential AI cloud takes under 10 minutes. Step by step:

  1. Sign Up → Create account at cloud.phala.network
  2. Get API Key → Generate from dashboard settings
  3. Authenticate → Initialize PhalaClient with API key
  4. Create Deployment → Name your first confidential workload
  5. Configure Settings:
    • Container image: python:3.11-slim (or any Docker image)
    • TEE type: Intel TDX or AMD SEV-SNP
    • Resources: 2 CPU cores, 4GB memory
    • Attestation: Publish to Trust Center
  6. Deploy to TEE → Platform provisions confidential compute environment
  7. Wait for Ready → Deployment status changes to “running” (~2-5 minutes)
  8. Verify Attestation:
    • Check TEE type confirmation
    • Verify trust level (hardware-backed proof)
    • Access public verification URL
  9. View Output → See workload logs and results
    • *Expected output:*
      • Status: running
      • Attestation:
        • TEE Type: tdx
        • Trust Level: high
        • Verification URL: https://trust.phala.cloud/verify/dep_abc123
      • Workload Output:
        • Hello from Phala Confidential Cloud!
        • Running in TEE: Hardware-enforced privacy
        • Platform: Linux-5.15.0-confidential-x86_64-with-glibc2.35

✓ First confidential workload deployed successfully
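Step 7 ("Wait for Ready") is essentially a polling loop. The following is a minimal sketch of that loop, not the platform's actual client: `wait_until_running` and the `get_status` callable are hypothetical names, and the `"failed"` status is an assumption (the guide only mentions `"running"`). In practice `get_status` would wrap whatever status call the dashboard, CLI, or API exposes.

```python
import time
from typing import Callable


def wait_until_running(
    get_status: Callable[[], str],
    timeout_s: float = 600.0,
    poll_interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Poll get_status() until it returns "running"; return seconds waited.

    Raises RuntimeError on a "failed" status (assumed status name) and
    TimeoutError if the deployment is not running within timeout_s.
    """
    waited = 0.0
    while True:
        status = get_status()
        if status == "running":
            return waited
        if status == "failed":
            raise RuntimeError("deployment failed before reaching 'running'")
        if waited >= timeout_s:
            raise TimeoutError(f"still '{status}' after {waited:.0f}s")
        sleep(poll_interval_s)
        waited += poll_interval_s


# Stand-in for a real status call: reports "provisioning" twice, then "running".
_fake_statuses = iter(["provisioning", "provisioning", "running"])
elapsed = wait_until_running(lambda: next(_fake_statuses), sleep=lambda s: None)
print(f"ready after ~{elapsed:.0f}s of polling")
```

Passing `sleep` as a parameter keeps the loop testable without real waiting; with the 2-5 minute provisioning time quoted above, the default 10-minute timeout leaves some margin.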

Dstack SDK Deep Dive

"Docker for TEE" - Container to TEE automation:

> Note: The following is conceptual code showing the deployment workflow. For actual deployment, use the Phala Cloud Dashboard or Phala Cloud CLI.

# Conceptual example showing the deployment workflow
# Actual deployment uses Phala Cloud Dashboard or CLI

# Step 1: Prepare your Docker application
# docker-compose.yml with your services

# Step 2: Deploy via Phala Cloud Dashboard
# 1. Go to https://cloud.phala.network/dashboard/cvm
# 2. Upload docker-compose.yml
# 3. Configure TEE type (Intel TDX, AMD SEV-SNP, GPU TEE)
# 4. Set environment variables and secrets
# 5. Click "Deploy"

# Step 3: Access your deployment
# Your app will be available at:
# https://<app-id>.dstack.phala.network

Dstack SDK key features:

| Feature | Before Dstack | With Dstack SDK |
| --- | --- | --- |
| Time-to-deployment | 3-6 months | <1 hour |
| Expertise Required | TEE specialists, security engineers | Any developer familiar with Docker |
| Workflow | Complex TEE setup | Standard Docker workflow |
| Portability | Limited | Multi-TEE portability (TDX, SEV-SNP, H100 same API) |

Trust Center: Attestation Infrastructure

Trust Center: Public Attestation for Confidential AI Cloud

The Trust Center is a public attestation platform where customers can verify the TEE type, trust level, and measurements themselves instead of having to trust the vendor.

Capabilities:

  • Verification: Cryptographic proof of TEE configuration
  • Historical Attestation: Immutable logs for compliance
  • Continuous Monitoring: Real-time TEE status
  • Public API: Customer self-service verification

Example Customer Verification:

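import requests

# Public verification URL for the deployment (example ID used throughout this guide)
attestation_url = "https://trust.phala.cloud/verify/dep_abc123"

# Fail fast on network problems and non-2xx responses instead of parsing an error page
response = requests.get(attestation_url, timeout=10)
response.raise_for_status()
attestation = response.json()

# Treat a missing 'verified' field as a failed verification
if attestation.get('verified'):
    print("✓ TEE verified: Customer data protected by hardware")
else:
    print("✗ TEE verification failed")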
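A customer will usually want to check more than a single boolean, for example that the report names the expected TEE type and trust level. The sketch below applies such a policy to an already-parsed report. Only the `verified` field comes from the example above; the `tee_type` and `trust_level` field names are assumptions modeled on the "TEE Type: tdx" and "Trust Level: high" values shown earlier, and `check_attestation` is a hypothetical helper, not part of any Phala SDK.

```python
from typing import Iterable


def check_attestation(
    attestation: dict,
    allowed_tee_types: Iterable[str] = ("tdx",),
    required_trust_level: str = "high",
) -> list[str]:
    """Return a list of policy violations; an empty list means the report passes."""
    problems = []
    if not attestation.get("verified", False):
        problems.append("report is not marked verified")
    tee_type = attestation.get("tee_type")  # assumed field name
    if tee_type not in set(allowed_tee_types):
        problems.append(f"unexpected TEE type: {tee_type!r}")
    trust_level = attestation.get("trust_level")  # assumed field name
    if trust_level != required_trust_level:
        problems.append(
            f"trust level is {trust_level!r}, expected {required_trust_level!r}"
        )
    return problems


# Example with a hand-written report standing in for the JSON response.
report = {"verified": True, "tee_type": "tdx", "trust_level": "high"}
violations = check_attestation(report)
print("policy passed" if not violations else violations)
```

Returning the full list of violations, rather than stopping at the first, makes the result more useful in audit logs.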

Trust Center for Sales Enablement

Using public attestation to win enterprise customers:

| Traditional SaaS Sales | Phala Cloud Sales |
| --- | --- |
| Customer skepticism | Customer verifies independently |
| Lengthy security reviews | Simplified with attestation |
| Slow deal closure | Faster, trust-based differentiation |

Production Deployment Patterns

High-Availability Confidential LLM

Enterprise-grade confidential AI infrastructure:

> Note: For production LLM deployment, use Phala Cloud's Confidential AI API or deploy custom models via CVM.

# High-availability LLM deployment via Phala Cloud Dashboard
# Deploy using docker-compose.yml:

version: "3"
services:
  llm-service:
    image: your-org/llama-70b-server:latest
    ports:
      - "8000:8000"
    environment:
      - MODEL_PATH=/models/llama-3.1-70b
      - MAX_BATCH_SIZE=32
      - ENABLE_STREAMING=true
    volumes:
      - /var/run/dstack.sock:/var/run/dstack.sock  # For attestation
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 4  # 4x H100 GPUs
              capabilities: [gpu]

# Deploy configuration:
# TEE Type: NVIDIA H100 GPU TEE
# Regions: Multi-region deployment via dashboard
# Autoscaling: Configure in dashboard settings
# Attestation: Enable public attestation verification
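A quick sizing check shows why a 70B-parameter model is paired with several GPUs. The sketch below is back-of-the-envelope arithmetic, not a capacity planner: it assumes 2 bytes per parameter for FP16 weights, the 80 GB per H100 quoted in the hardware layer, and an arbitrary 20% headroom for KV cache and activations. The function names and the headroom figure are illustrative assumptions.

```python
def weights_gb(params_billion: float, bytes_per_param: float) -> float:
    """Approximate weight memory in GB (1 GB taken as 1e9 bytes)."""
    return params_billion * bytes_per_param


def fits(
    params_billion: float,
    bytes_per_param: float,
    gpu_count: int,
    gpu_memory_gb: float = 80.0,      # H100 capacity from the hardware layer
    headroom_fraction: float = 0.2,   # assumed reserve for KV cache, activations
) -> bool:
    """True if the weights fit in the usable memory of gpu_count GPUs."""
    usable_gb = gpu_count * gpu_memory_gb * (1.0 - headroom_fraction)
    return weights_gb(params_billion, bytes_per_param) <= usable_gb


# 70B parameters at FP16 (2 bytes each) need roughly 140 GB for weights alone.
print(f"FP16 weights: {weights_gb(70, 2):.0f} GB")
for gpus in (1, 2, 4):
    print(f"{gpus} x H100: fits = {fits(70, 2, gpus)}")
```

Under these assumptions one or two H100s are not enough at FP16, while four leave room for batching, which is consistent with `count: 4` in the compose file above.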

Multi-Tenant Confidential SaaS

SaaS platform with per-customer TEE isolation:

> Note: Deploy isolated TEE instances per customer using Phala Cloud Dashboard or automate via API/CLI.

# Per-customer TEE deployment configuration
# Create separate CVM deployment for each customer

# Basic Tier Customer
app_name: customer-acme-basic
image: your-company/analytics-platform:latest
tee_type: intel-tdx
resources:
  cpu_cores: 4
  memory_gb: 16
  storage_gb: 100
attestation:
  public: true
  continuous_verification: true

---

# Professional Tier Customer
app_name: customer-acme-pro
image: your-company/analytics-platform:latest
tee_type: intel-tdx
resources:
  cpu_cores: 16
  memory_gb: 64
  storage_gb: 500
attestation:
  public: true
  continuous_verification: true

---

# Enterprise Tier Customer (with GPU TEE)
app_name: customer-acme-enterprise
image: your-company/analytics-platform:latest
tee_type: h100_tee
resources:
  cpu_cores: 32
  memory_gb: 128
  gpu_count: 1  # H100 GPU TEE
  storage_gb: 1000
attestation:
  public: true
  continuous_verification: true
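Because the three tiers differ only in TEE type and resources, the per-customer configuration can be generated from a small tier table instead of being maintained by hand. This is a sketch under assumptions: the dictionary keys mirror the conceptual configuration above, `customer_deployment` is a hypothetical helper, and the result is a plain dict that would still need to be submitted through the dashboard, API, or CLI.

```python
from copy import deepcopy

# Tier definitions copied from the conceptual configuration above.
TIERS = {
    "basic": {
        "tee_type": "intel-tdx",
        "resources": {"cpu_cores": 4, "memory_gb": 16, "storage_gb": 100},
    },
    "pro": {
        "tee_type": "intel-tdx",
        "resources": {"cpu_cores": 16, "memory_gb": 64, "storage_gb": 500},
    },
    "enterprise": {
        "tee_type": "h100_tee",
        "resources": {"cpu_cores": 32, "memory_gb": 128, "gpu_count": 1, "storage_gb": 1000},
    },
}


def customer_deployment(
    customer: str,
    tier: str,
    image: str = "your-company/analytics-platform:latest",
) -> dict:
    """Build an isolated deployment spec for one customer on one tier."""
    if tier not in TIERS:
        raise ValueError(f"unknown tier: {tier!r}")
    spec = deepcopy(TIERS[tier])  # copy so customers never share mutable state
    spec["app_name"] = f"customer-{customer}-{tier}"
    spec["image"] = image
    spec["attestation"] = {"public": True, "continuous_verification": True}
    return spec


spec = customer_deployment("acme", "enterprise")
print(spec["app_name"], spec["tee_type"], spec["resources"])
```

The `deepcopy` matters here: each customer gets an independent spec, so a per-customer override cannot leak into the shared tier table.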

Pricing and Cost Optimization

Pricing Model

Phala Confidential AI Cloud pricing is usage-based for both GPU TEE and CPU TEE resources.

| Resource | Hourly Rate | Monthly Reserved | TEE Premium | Use Case |
| --- | --- | --- | --- | --- |
| H100 TEE | $4.50/GPU/hour | $2,700/GPU/month | 10% | Confidential LLM inference, training |
| H200 TEE | $6.00/GPU/hour | $3,600/GPU/month | 12% | Large LLM inference |
| TDX | $0.12/vCPU/hour + $0.02/GB RAM/hour | 50% discount | 15% | General confidential computing |
| SEV-SNP | $0.11/vCPU/hour + $0.018/GB RAM/hour | 50% discount | 12% | AMD-optimized workloads |
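The hourly rates translate into monthly estimates with simple arithmetic. The sketch below assumes the listed rates are all-in (the table does not say whether the TEE premium is already included) and uses 730 hours as an average month; both are assumptions, and the function names are illustrative.

```python
HOURS_PER_MONTH = 730  # assumption: average hours in a month

# Hourly rates from the pricing table above.
GPU_HOURLY = {"h100": 4.50, "h200": 6.00}            # per GPU
CPU_HOURLY = {"tdx": (0.12, 0.02), "sev-snp": (0.11, 0.018)}  # (per vCPU, per GB RAM)


def cpu_tee_hourly(tee: str, vcpus: int, ram_gb: int) -> float:
    """Hourly cost of a CPU TEE instance."""
    per_vcpu, per_gb = CPU_HOURLY[tee]
    return vcpus * per_vcpu + ram_gb * per_gb


def gpu_tee_monthly(gpu: str, count: int, hours: float = HOURS_PER_MONTH) -> float:
    """On-demand monthly cost of `count` GPUs running for `hours`."""
    return GPU_HOURLY[gpu] * count * hours


tdx_hourly = cpu_tee_hourly("tdx", vcpus=4, ram_gb=16)
print(f"4 vCPU / 16 GB on TDX: ${tdx_hourly:.2f}/hour")
print(f"4 x H100 on demand:    ${gpu_tee_monthly('h100', 4):,.0f}/month")
```

For example, the 4 vCPU / 16 GB Basic-tier shape works out to 4 × $0.12 + 16 × $0.02 = $0.80 per hour on TDX.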

Cost Optimization Strategies

Reducing Phala Cloud costs:

  • Reserved Capacity: Commit to a 1-year term for base load, saving 40%
  • Autoscaling Tuning: Scale down aggressively during low traffic, saving 25-35%
  • Model Optimization: Quantization, pruning, distillation, saving 50%
  • Request Batching: Increase throughput by 30%
  • Multi-Region Optimization: Deploy in cheaper regions, saving 15%
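Whether reserved capacity pays off depends on utilization. As a worked example, the sketch below computes the break-even point between the hourly and monthly reserved GPU rates listed in the pricing table ($4.50/hour vs. $2,700/month for H100, $6.00/hour vs. $3,600/month for H200). The 730-hour month and the function name are assumptions for illustration.

```python
HOURS_PER_MONTH = 730  # assumption: average hours in a month


def break_even_hours(hourly_rate: float, monthly_reserved: float) -> float:
    """Hours per month above which the reserved price is cheaper than on-demand."""
    return monthly_reserved / hourly_rate


for name, hourly, reserved in (("H100", 4.50, 2700.0), ("H200", 6.00, 3600.0)):
    hours = break_even_hours(hourly, reserved)
    utilization = hours / HOURS_PER_MONTH
    print(f"{name}: reserved wins above {hours:.0f} h/month ({utilization:.0%} utilization)")
```

With these list prices both GPUs break even at 600 hours per month, so reservation suits steady base load, while bursty traffic is better served by on-demand capacity plus autoscaling.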

Compliance and Enterprise Features

Enterprise-Grade Compliance

Phala Cloud compliance certifications:

| Completed Certifications | In Progress (2025) | TEE-Specific Advantages |
| --- | --- | --- |
| SOC 2 Type II | FedRAMP High | Attestation simplifies audits |
| ISO 27001 | ISO 27017/27018 | Hardware-enforced controls |
| GDPR Compliant | ITAR Registration | Public verification |
| HIPAA Eligible | Regional certifications | Continuous compliance monitoring |
| PCI-DSS Level 1 | | |

Enterprise features:

  • Dedicated Infrastructure: Customer-specific hardware pools
  • Private Cloud Deployment: Phala Cloud in your datacenter
  • Custom Attestation Policies: Organization-specific TEE requirements
  • Advanced Key Management: Bring-your-own-key (BYOK)
  • Dedicated Support: 99.95% uptime guarantee
  • Custom SLAs: Up to 99.99% uptime
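To make the uptime figures concrete, an uptime percentage can be converted into the downtime it permits. The sketch below assumes a 30-day month as the measurement window; actual SLA terms may define the window differently, and the function name is illustrative.

```python
def allowed_downtime_minutes(uptime_percent: float, days: int = 30) -> float:
    """Minutes of downtime permitted over `days` at the given uptime percentage."""
    total_minutes = days * 24 * 60
    return total_minutes * (100.0 - uptime_percent) / 100.0


for sla in (99.95, 99.99):
    print(f"{sla}% uptime allows about {allowed_downtime_minutes(sla):.2f} min/month")
```

Over 30 days, 99.95% permits roughly 21.6 minutes of downtime and 99.99% roughly 4.3 minutes.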

Roadmap and Future Developments

2025-2027 Platform Roadmap

| Year | Key Developments |
| --- | --- |
| 2025 | H200 TEE general availability, Multi-region expansion, Dstack SDK 1.0 |
| 2026 | Multi-modal confidential AI, Confidential databases, Zero-knowledge compute |
| 2027 | TEE becomes default, Industry consolidation, Regulatory requirement |

Conclusion

Why Phala Confidential Cloud?

Key differentiators:

| Technology Leadership | Business Benefits | Ecosystem |
| --- | --- | --- |
| First-to-market GPU TEE | Faster time-to-market | Open source: Dstack SDK |
| Dstack SDK | Sales enablement | Standards: CCC member |
| Trust Center | Compliance simplified | Integrations: Hugging Face |
| Performance | Competitive moat | Community: Active developer community |

Who should use Phala Cloud:

  1. Healthcare AI companies - HIPAA-compliant AI without compromise
  2. Financial services - Protect proprietary models and customer data
  3. Enterprise SaaS - Privacy-preserving multi-tenant platforms
  4. Government contractors - FedRAMP-ready confidential computing
  5. AI startups - Privacy as competitive differentiator
  6. Researchers - Collaborative AI without data sharing

Getting Started

Deploy confidential AI in minutes using Phala Cloud's pre-configured LLMs:

import openai

# Use Phala's Confidential AI API (OpenAI-compatible)
client = openai.OpenAI(
    api_key="your-phala-api-key",  # Get from https://cloud.phala.network
    base_url="https://api.phala.network/v1"
)

# Your queries run in GPU TEE - fully confidential
response = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3.1-8B-Instruct",
    messages=[{"role": "user", "content": "Hello from confidential AI!"}]
)

print(f"Response: {response.choices[0].message.content}")
# Get attestation proof: https://proof.t16z.com/reports/<checksum>

For custom deployments: Use Phala Cloud Dashboard to deploy Docker applications with TEE protection.

Start building today:

Deploy on Phala Cloud - Free tier available, production-ready infrastructure.
