
Phala Confidential AI Cloud: Complete Platform Guide
TL;DR: Phala Confidential AI Cloud is a cloud platform built on trusted execution environments (TEEs) and purpose-built for secure AI workloads, with production-ready GPU TEE. Deploy LLMs, machine learning models, and other sensitive workloads with hardware-enforced privacy via NVIDIA H100/H200 TEE, Intel TDX, and AMD SEV-SNP. The Dstack SDK simplifies deployment (“Docker for TEE”), the Trust Center provides public attestation verification, and the infrastructure delivers 95-99% of native performance. Phala describes it as the only cloud platform whose privacy is publicly and cryptographically verifiable through attestation rather than resting on promises.
Platform Overview
What is Phala Confidential Cloud?
Phala Confidential AI Cloud is a TEE-based private AI cloud for secure AI workloads (LLMs, training, analytics) with public attestation.
Phala Cloud differentiators:
| Feature | Traditional Cloud (AWS, Azure, GCP) | Phala Confidential Cloud |
| --- | --- | --- |
| Privacy | Trust-based (cloud provider access) | Hardware-enforced by default (zero-trust architecture) |
| Confidential Computing | Optional add-on (limited support) | Platform foundation (all workloads) |
| GPU TEE | Roadmap or pilot (not production-ready) | Production-ready (H100/H200, first to market) |
| Attestation | Internal only (customers can't verify) | Public via Trust Center (customer verification) |
| Developer Experience | Complex TEE setup | Dstack SDK (Docker-simple deployment) |
| Use Case | General purpose (confidentiality as afterthought) | Confidential AI specialist (privacy-first design) |
Core Platform Components
*Architecture of Phala’s TEE-based confidential AI cloud: NVIDIA H100/H200 GPU TEE, Intel TDX/AMD SEV-SNP; Trust Center for public attestation.*
Layer Breakdown:
- Hardware Infrastructure (Layer 1)
  - GPU TEE: NVIDIA H100 (80GB HBM3), H200 (141GB HBM3e)
  - CPU TEE: Intel TDX (4th/5th Gen Xeon), AMD SEV-SNP (EPYC Genoa)
  - Networking: TEE-aware encrypted transit
  - Storage: Encrypted with TEE integration
- TEE Runtime (Layer 2)
  - GPU Runtime: NVIDIA Confidential Computing SDK
  - CPU Runtime: TDX/SEV-SNP virtual machines
  - Attestation: Remote attestation generation
  - Key Management: TEE-integrated KMS
  - Memory Encryption: AES-256 hardware encryption
- Orchestration (Layer 3)
  - Kubernetes with TEE awareness
  - TEE-aware workload placement
  - Autoscaling for confidential workloads
  - Service mesh with attestation
- Developer Platform (Layer 4)
  - Dstack SDK: Container → TEE deployment automation
  - REST API: Programmatic access
  - CLI: Command-line interface
  - Web Console: Visual deployment management
  - CI/CD Integration: GitHub Actions, GitLab CI/CD
- Trust Infrastructure (Layer 5)
  - Trust Center: Public attestation verification
  - Attestation Registry: Historical logs
  - Compliance Reporting: HIPAA, SOC 2, PCI-DSS
  - Audit Logging: Comprehensive trails
- Application Services (Layer 6)
  - LLM Inference: Optimized confidential LLM serving
  - Model Training: Confidential training pipelines
  - Data Analytics: Privacy-preserving analytics
  - API Gateway: Managed endpoints with attestation
Getting Started with Phala Cloud
Account Setup
Get started with Phala’s confidential AI cloud in under 10 minutes — from signup to first TEE deployment:
Step-by-step breakdown:
- Sign Up → Create an account at cloud.phala.network
- Get API Key → Generate one from the dashboard settings
- Authenticate → Initialize PhalaClient with the API key
- Create Deployment → Name your first confidential workload
- Configure Settings:
  - Container image: python:3.11-slim (or any Docker image)
  - TEE type: Intel TDX or AMD SEV-SNP
  - Resources: 2 CPU cores, 4GB memory
  - Attestation: Publish to Trust Center
- Deploy to TEE → Platform provisions the confidential compute environment
- Wait for Ready → Deployment status changes to “running” (~2-5 minutes)
- Verify Attestation:
  - Check TEE type confirmation
  - Verify trust level (hardware-backed proof)
  - Access public verification URL
- View Output → See workload logs and results

*Expected output:*
- Status: running
- Attestation:
  - TEE Type: tdx
  - Trust Level: high
  - Verification URL: https://trust.phala.cloud/verify/dep_abc123
- Workload Output:
  - Hello from Phala Confidential Cloud!
  - Running in TEE: Hardware-enforced privacy
  - Platform: Linux-5.15.0-confidential-x86_64-with-glibc2.35

✓ First confidential workload deployed successfully
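The "Wait for Ready" step above amounts to polling the deployment status until it reports "running". The sketch below illustrates that loop only. `wait_until_running` and the `get_status` callable are hypothetical names, not part of any Phala SDK; the caller is assumed to supply a function that returns the current status string, for example a thin wrapper around the dashboard API or CLI.

```python
import time
from typing import Callable


def wait_until_running(
    get_status: Callable[[], str],
    timeout_s: float = 300.0,   # upper end of the ~2-5 minute window above
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll get_status() until it returns "running" or the timeout elapses.

    get_status is a caller-supplied (hypothetical) function; no real
    Phala API is called here.
    """
    waited = 0.0
    while True:
        if get_status() == "running":
            return True
        if waited >= timeout_s:
            return False
        sleep(interval_s)
        waited += interval_s


if __name__ == "__main__":
    # Simulated status sequence standing in for a real deployment.
    statuses = iter(["provisioning", "starting", "running"])
    ready = wait_until_running(lambda: next(statuses), sleep=lambda _: None)
    print("ready" if ready else "timed out")
```

Passing `sleep` as a parameter keeps the loop testable without real delays; in normal use the default `time.sleep` applies.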
Dstack SDK Deep Dive
"Docker for TEE" - Container to TEE automation:
> Note: The following is conceptual code showing the deployment workflow. For actual deployment, use the Phala Cloud Dashboard or Phala Cloud CLI.
```
# Conceptual example showing the deployment workflow
# Actual deployment uses Phala Cloud Dashboard or CLI

# Step 1: Prepare your Docker application
#   docker-compose.yml with your services

# Step 2: Deploy via Phala Cloud Dashboard
#   1. Go to https://cloud.phala.network/dashboard/cvm
#   2. Upload docker-compose.yml
#   3. Configure TEE type (Intel TDX, AMD SEV-SNP, GPU TEE)
#   4. Set environment variables and secrets
#   5. Click "Deploy"

# Step 3: Access your deployment
#   Your app will be available at:
#   https://<app-id>.dstack.phala.network
```
Dstack SDK key features:
| Feature | Before Dstack | With Dstack SDK |
| --- | --- | --- |
| Time-to-deployment | 3-6 months | <1 hour |
| Expertise Required | TEE specialists, security engineers | Any developer familiar with Docker |
| Workflow | Complex TEE setup | Standard Docker workflow |
| Portability | Limited | Multi-TEE portability (TDX, SEV-SNP, H100 same API) |
Trust Center: Attestation Infrastructure
Trust Center: Public Attestation for Confidential AI Cloud
A public attestation platform so customers can verify TEE type, trust level, and measurements — no need to ‘trust the vendor’.
Capabilities:
- Verification: Cryptographic proof of TEE configuration
- Historical Attestation: Immutable logs for compliance
- Continuous Monitoring: Real-time TEE status
- Public API: Customer self-service verification
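The self-service verification listed above can also be applied as a policy check on the customer's side, rather than relying on a single pass/fail flag. The sketch below works on an attestation record that has already been fetched. The field names (`verified`, `tee_type`, `trust_level`) are assumptions modelled on the fields this guide mentions, not a documented Trust Center schema, and `AttestationPolicy` is a hypothetical structure.

```python
from dataclasses import dataclass, field


@dataclass
class AttestationPolicy:
    """Customer-defined acceptance rules (hypothetical structure)."""
    allowed_tee_types: set = field(default_factory=lambda: {"tdx", "sev-snp"})
    required_trust_level: str = "high"


def check_attestation(record: dict, policy: AttestationPolicy) -> list:
    """Return a list of policy violations; an empty list means the record passes."""
    problems = []
    if not record.get("verified", False):
        problems.append("attestation not verified")
    if record.get("tee_type") not in policy.allowed_tee_types:
        problems.append(f"unexpected TEE type: {record.get('tee_type')!r}")
    if record.get("trust_level") != policy.required_trust_level:
        problems.append(f"trust level is {record.get('trust_level')!r}")
    return problems


if __name__ == "__main__":
    record = {"verified": True, "tee_type": "tdx", "trust_level": "high"}
    print(check_attestation(record, AttestationPolicy()) or "policy satisfied")
```

Returning a list of violations instead of a boolean makes it easier to log why a deployment was rejected.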
Example Customer Verification:
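```python
import requests

# Public Trust Center verification URL for the deployment
attestation_url = "https://trust.phala.cloud/verify/dep_abc123"

# Fetch the attestation record; fail fast on network or HTTP errors
response = requests.get(attestation_url, timeout=10)
response.raise_for_status()
attestation = response.json()

# A missing 'verified' field is treated as a failed verification
if attestation.get("verified"):
    print("✓ TEE verified: Customer data protected by hardware")
else:
    print("✗ TEE verification failed")
```
Trust Center for Sales Enablement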
Using public attestation to win enterprise customers:
| Traditional SaaS Sales | Phala Cloud Sales |
| --- | --- |
| Customer skepticism | Customer verifies independently |
| Lengthy security reviews | Simplified with attestation |
| Slow deal closure | Faster, trust-based differentiation |
Production Deployment Patterns
High-Availability Confidential LLM
Enterprise-grade confidential AI infrastructure:
> Note: For production LLM deployment, use Phala Cloud's Confidential AI API or deploy custom models via CVM.
```yaml
# High-availability LLM deployment via Phala Cloud Dashboard
# Deploy using docker-compose.yml:
version: "3"
services:
  llm-service:
    image: your-org/llama-70b-server:latest
    ports:
      - "8000:8000"
    environment:
      - MODEL_PATH=/models/llama-3.1-70b
      - MAX_BATCH_SIZE=32
      - ENABLE_STREAMING=true
    volumes:
      - /var/run/dstack.sock:/var/run/dstack.sock  # For attestation
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 4  # 4x H100 GPUs
              capabilities: [gpu]

# Deploy configuration:
#   TEE Type: NVIDIA H100 GPU TEE
#   Regions: Multi-region deployment via dashboard
#   Autoscaling: Configure in dashboard settings
#   Attestation: Enable public attestation verification
```
Multi-Tenant Confidential SaaS
SaaS platform with per-customer TEE isolation:
> Note: Deploy isolated TEE instances per customer using Phala Cloud Dashboard or automate via API/CLI.
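The per-customer automation mentioned in the note could start by generating each tenant's configuration from a tier table. The sketch below is illustrative only. `TIERS` and `tenant_config` are hypothetical names, the field names and sizes follow the per-customer configuration in this section, and submitting the result to Phala Cloud (via API or CLI) is left out because that interface is not specified here.

```python
import json

# Tier sizes taken from the per-customer configuration in this section.
TIERS = {
    "basic": {"tee_type": "intel-tdx", "cpu_cores": 4, "memory_gb": 16, "storage_gb": 100},
    "pro": {"tee_type": "intel-tdx", "cpu_cores": 16, "memory_gb": 64, "storage_gb": 500},
    "enterprise": {"tee_type": "h100_tee", "cpu_cores": 32, "memory_gb": 128,
                   "gpu_count": 1, "storage_gb": 1000},
}


def tenant_config(customer: str, tier: str,
                  image: str = "your-company/analytics-platform:latest") -> dict:
    """Build one isolated deployment config for a customer (hypothetical helper)."""
    if tier not in TIERS:
        raise ValueError(f"unknown tier: {tier!r}")
    spec = dict(TIERS[tier])          # copy so the tier table is never mutated
    tee_type = spec.pop("tee_type")   # what remains are the resource sizes
    return {
        "app_name": f"customer-{customer}-{tier}",
        "image": image,
        "tee_type": tee_type,
        "resources": spec,
        "attestation": {"public": True, "continuous_verification": True},
    }


if __name__ == "__main__":
    print(json.dumps(tenant_config("acme", "enterprise"), indent=2))
```

Keeping one deployment per customer, each with its own `app_name`, matches the isolation model described here: no two tenants share a TEE instance.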
```yaml
# Per-customer TEE deployment configuration
# Create separate CVM deployment for each customer

# Basic Tier Customer
app_name: customer-acme-basic
image: your-company/analytics-platform:latest
tee_type: intel-tdx
resources:
  cpu_cores: 4
  memory_gb: 16
  storage_gb: 100
attestation:
  public: true
  continuous_verification: true
---
# Professional Tier Customer
app_name: customer-acme-pro
image: your-company/analytics-platform:latest
tee_type: intel-tdx
resources:
  cpu_cores: 16
  memory_gb: 64
  storage_gb: 500
attestation:
  public: true
  continuous_verification: true
---
# Enterprise Tier Customer (with GPU TEE)
app_name: customer-acme-enterprise
image: your-company/analytics-platform:latest
tee_type: h100_tee
resources:
  cpu_cores: 32
  memory_gb: 128
  gpu_count: 1  # H100 GPU TEE
  storage_gb: 1000
attestation:
  public: true
  continuous_verification: true
```
Pricing and Cost Optimization
Pricing Model
Phala Confidential AI Cloud pricing — usage-based for GPU TEE and CPU TEE.
| Resource | Hourly Rate | Monthly Reserved | TEE Premium | Use Case |
| --- | --- | --- | --- | --- |
| H100 TEE | $4.50/GPU/hour | $2,700/GPU/month | 10% | Confidential LLM inference, training |
| H200 TEE | $6.00/GPU/hour | $3,600/GPU/month | 12% | Large LLM inference |
| TDX | $0.12/vCPU/hour + $0.02/GB RAM/hour | 50% discount | 15% | General confidential computing |
| SEV-SNP | $0.11/vCPU/hour + $0.018/GB RAM/hour | 50% discount | 12% | AMD-optimized workloads |
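As a worked example of the rates in the table, the sketch below estimates on-demand costs and compares them with the reserved GPU price. It assumes a 720-hour (30-day) month, which the pricing table does not state, and the function names are illustrative.

```python
HOURS_PER_MONTH = 720  # assumption: 30-day month

# Rates copied from the pricing table above (USD).
GPU_HOURLY = {"h100": 4.50, "h200": 6.00}
GPU_RESERVED_MONTHLY = {"h100": 2700.0, "h200": 3600.0}
CPU_RATES = {  # (per vCPU-hour, per GB-RAM-hour)
    "tdx": (0.12, 0.02),
    "sev-snp": (0.11, 0.018),
}


def cpu_tee_hourly(tee: str, vcpus: int, ram_gb: int) -> float:
    """On-demand hourly cost of a CPU TEE instance."""
    per_vcpu, per_gb = CPU_RATES[tee]
    return vcpus * per_vcpu + ram_gb * per_gb


def gpu_on_demand_monthly(gpu: str, count: int = 1) -> float:
    """On-demand cost of running `count` GPUs for a full month."""
    return GPU_HOURLY[gpu] * HOURS_PER_MONTH * count


def reserved_saving(gpu: str) -> float:
    """Fraction saved by the reserved price versus a full on-demand month."""
    return 1 - GPU_RESERVED_MONTHLY[gpu] / gpu_on_demand_monthly(gpu)


if __name__ == "__main__":
    print(f"TDX 4 vCPU / 16 GB: ${cpu_tee_hourly('tdx', 4, 16):.2f}/hour")
    print(f"H100 on-demand:     ${gpu_on_demand_monthly('h100'):,.2f}/month")
    print(f"H100 reserved saves {reserved_saving('h100'):.1%}")
```

Under these assumptions a 4 vCPU / 16 GB TDX instance costs $0.80 per hour, a single H100 costs $3,240 for a full on-demand month, and the reserved H100 price is about 16.7% lower than that.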
Cost Optimization Strategies
Reducing Phala Cloud costs:
- Reserved Capacity: Commit to a 1-year term for base load, saving 40%
- Autoscaling Tuning: Scale down aggressively during low traffic, saving 25-35%
- Model Optimization: Quantization, pruning, distillation, saving 50%
- Request Batching: Increase throughput by 30%
- Multi-Region Optimization: Deploy in cheaper regions, saving 15%
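When several of these strategies are used together, the percentages need not simply add up. As a rough illustration, the sketch below assumes each saving applies independently to the cost left over after the previous one. That is an assumption, not something the list states, and `combined_saving` is an illustrative name.

```python
from functools import reduce


def combined_saving(savings: list) -> float:
    """Combine fractional savings multiplicatively: 1 - prod(1 - s_i)."""
    remaining = reduce(lambda acc, s: acc * (1 - s), savings, 1.0)
    return 1 - remaining


if __name__ == "__main__":
    # 40% (reserved capacity) and 15% (cheaper region) from the list above.
    total = combined_saving([0.40, 0.15])
    print(f"combined saving: {total:.0%}")
```

Under that assumption, 40% and 15% combine to 49% rather than 55%.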
Compliance and Enterprise Features
Enterprise-Grade Compliance
Phala Cloud compliance certifications:
| Completed Certifications | In Progress (2025) | TEE-Specific Advantages |
| --- | --- | --- |
| SOC 2 Type II | FedRAMP High | Attestation simplifies audits |
| ISO 27001 | ISO 27017/27018 | Hardware-enforced controls |
| GDPR Compliant | ITAR Registration | Public verification |
| HIPAA Eligible | Regional certifications | Continuous compliance monitoring |
| PCI-DSS Level 1 | | |
Enterprise features:
- Dedicated Infrastructure: Customer-specific hardware pools
- Private Cloud Deployment: Phala Cloud in your datacenter
- Custom Attestation Policies: Organization-specific TEE requirements
- Advanced Key Management: Bring-your-own-key (BYOK)
- Dedicated Support: 99.95% uptime guarantee
- Custom SLAs: Up to 99.99% uptime
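To make the uptime figures concrete, the sketch below converts an uptime percentage into the downtime it permits. It assumes a 30-day month as the measurement window; the actual SLA window is not stated here.

```python
def allowed_downtime_minutes(uptime_pct: float, days: int = 30) -> float:
    """Maximum downtime (minutes) permitted by an uptime percentage over `days`."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - uptime_pct / 100)


if __name__ == "__main__":
    for pct in (99.95, 99.99):
        print(f"{pct}% uptime -> {allowed_downtime_minutes(pct):.1f} min/month")
```

On that basis, 99.95% uptime allows about 21.6 minutes of downtime per 30-day month, and 99.99% allows about 4.3 minutes.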
Roadmap and Future Developments
2025-2027 Platform Roadmap
| Year | Key Developments |
| --- | --- |
| 2025 | H200 TEE general availability, Multi-region expansion, Dstack SDK 1.0 |
| 2026 | Multi-modal confidential AI, Confidential databases, Zero-knowledge compute |
| 2027 | TEE becomes default, Industry consolidation, Regulatory requirement |
Conclusion
Why Phala Confidential Cloud?
Key differentiators:
| Technology Leadership | Business Benefits | Ecosystem |
| --- | --- | --- |
| First-to-market GPU TEE | Faster time-to-market | Open source: Dstack SDK |
| Dstack SDK | Sales enablement | Standards: CCC member |
| Trust Center | Compliance simplified | Integrations: Hugging Face |
| Performance | Competitive moat | Community: Active developer community |
Who should use Phala Cloud:
- Healthcare AI companies - HIPAA-compliant AI without compromise
- Financial services - Protect proprietary models and customer data
- Enterprise SaaS - Privacy-preserving multi-tenant platforms
- Government contractors - FedRAMP-ready confidential computing
- AI startups - Privacy as competitive differentiator
- Researchers - Collaborative AI without data sharing
Getting Started
Deploy confidential AI in minutes using Phala Cloud's pre-configured LLMs:
```python
import openai

# Use Phala's Confidential AI API (OpenAI-compatible)
client = openai.OpenAI(
    api_key="your-phala-api-key",  # Get from https://cloud.phala.network
    base_url="https://api.phala.network/v1",
)

# Your queries run in GPU TEE - fully confidential
response = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3.1-8B-Instruct",
    messages=[{"role": "user", "content": "Hello from confidential AI!"}],
)
print(f"Response: {response.choices[0].message.content}")

# Get attestation proof: https://proof.t16z.com/reports/<checksum>
```
For custom deployments: Use Phala Cloud Dashboard to deploy Docker applications with TEE protection.
Resources:
- Documentation: Phala Cloud Documentation
- Dstack SDK: Dstack SDK
- Trust Center: Trust Center
- Community: Phala Community
What’s Next in Your Learning Journey?
Recommended reading order:
- New to confidential computing?
  - Start: What is Confidential Computing
  - Then: Getting Started
- Ready to deploy?
  - Implementation: Deploying Confidential AI
  - Best practices: Production Best Practices
- Industry-specific:
  - Healthcare: Confidential AI in Healthcare
  - Finance: Confidential Computing in Finance
  - Government: Government Confidential Computing
- Advanced topics:
  - LLMs: Confidential LLMs
  - Edge: Confidential Edge AI
Start building today:
Deploy on Phala Cloud - Free tier available, production-ready infrastructure.