Trial ready
NVIDIA H100

Proven confidential inference and fine-tuning capacity.
Memory
80GB HBM3
Bandwidth
3.35 TB/s
Region
US-West
Scale
1-2 GPUs
On-demand
$3.08/GPU/hr
24h minimum
Slot
$2.38/GPU/hr
reserved
GPU TEE Marketplace
H100, H200, and B300 capacity with CVMs, dual attestation, and TEE-aware operations.
Trial a machine for 24 hours, reserve a slot, or quote dedicated clusters. Phala handles the hard part: confidential GPUs, Intel TDX runtime, NVIDIA attestation, and the DevOps required to keep it working.
Confidential GPU cloud



hardware proof rail
GPU TEE
H100
GPU TEE
H200
GPU TEE
B300
Trial
24h minimum
Reserve
slots and clusters
Verify
CVM + GPU evidence
Marketplace inventory
Pick a GPU for a 24-hour trial, reserve a slot for sustained jobs, or quote a dedicated cluster. Every path starts from TEE-ready infrastructure instead of a raw GPU box.
Trial ready

Proven confidential inference and fine-tuning capacity.
Memory
80GB HBM3
Bandwidth
3.35 TB/s
Region
US-West
Scale
1-2 GPUs
On-demand
$3.08/GPU/hr
24h minimum
Slot
$2.38/GPU/hr
reserved
Slot ready

High-memory runtime for larger private model jobs.
Memory
141GB HBM3e
Bandwidth
4.8 TB/s
Region
US-West / India
Scale
1-8 GPUs
On-demand
$4.80/GPU/hr
24h minimum
Slot
$3.20/GPU/hr
reserved
Quote now

Blackwell Ultra confidential capacity for frontier inference.
Memory
288GB HBM3e
Bandwidth
8 TB/s
Region
US-East / US-West
Scale
1-8 GPUs
1-month
$6.50/GPU/hr
30d minimum
Slot
$5.60/GPU/hr
reserved
Prices include Intel TDX + NVIDIA confidential computing readiness. Volume and enterprise pricing are quoted by workload.
relative index
1x
1.9x
3.2x
LLM inference
model + KV cache
80GB
141GB
288GB
GPU memory
feed batches
3.35TB/s
4.8TB/s
8TB/s
Memory bandwidth

NVIDIA H100
80GB HBM3

NVIDIA H200
141GB HBM3e

NVIDIA B300
288GB HBM3e
GPU comparison
Compare the capacity shape before the quote. H100 is the fast trial path, H200 adds memory headroom, and B300 is the Blackwell Ultra path for frontier inference and dedicated clusters.
Exact throughput depends on model, batch size, precision, and runtime. Phala quotes the GPU together with the confidential VM path, GPU CC readiness, and attestation operations.
GPU cloud mockup
The marketplace view should make the buying motion obvious: trial, reserve, then scale into a dedicated cluster with TEE readiness attached.

H100
80GB HBM3from
$3.08/hr

H200
141GB HBM3efrom
$4.80/hr

B300
288GB HBM3efrom
$6.50/hr
verified
CVM runtime
verified
GPU CC mode
verified
Dual attestation
GPU TEE proof path
GPU isolation is only useful when the entire path — runtime, GPU mode, and evidence collection — is verifiable end-to-end. Phala delivers all three together.
01
Docker workloads run inside an Intel TDX confidential VM with GPU passthrough. The runtime is sealed against the operator and measured by firmware before the workload starts.
02
NVIDIA Confidential Computing seals model weights, activations, and KV cache inside protected GPU memory. The GPU enforces compute isolation alongside the CPU TEE.
03
Intel TDX and NVIDIA each emit a signed quote. Phala collects both and exposes them through one verifier so the CVM and the GPU prove themselves together.
Buying paths
The marketplace is structured around how AI builders actually buy GPUs: test quickly, reserve capacity when a workload proves out, then move to enterprise deals when the cluster becomes production-critical.
01 / On-demand
Short test windows for builders validating private inference, model serving, or proof generation.
02 / Slot
Predictable GPU access for sustained training, fine-tuning, and benchmark windows.
03 / Enterprise
Custom H100, H200, or B300 deals with TEE-aware infrastructure support and deployment planning.
AI solution paths
The private model endpoint is the first entry point. The same privacy primitive extends to agents, data workflows, and training.
Serve OpenAI-compatible model calls where prompts, outputs, and customer context need encrypted-in-use protection.
128K
$0.27/M input
256K
$0.40/M input
128K
$0.15/M input
128K
$0.10/M input
200K
$3.00/M input
1M
$1.25/M input
Run agents with keys, tools, memory, and actions inside a verified runtime instead of a visible automation cloud.
Adapt models on proprietary data while keeping datasets, gradients, checkpoints, and evaluation traces inside the boundary.
private training run
01
sealed
02
running
03
private
04
verified
loss curve
proof attached
attestation.json
Move models to sensitive records and return approved outputs without exposing raw data to the model operator.
source
EHR data
source
Customer records
source
Internal docs
TEE clean room
approved output