NVIDIA H200 GPU TEE
High-memory H200 capacity for larger LLM serving, tuning, and protected batch jobs.
H200 TEE capacity with 141GB HBM3e memory, confidential GPU mode, CVM runtime, and TEE operations.
Memory: 141GB HBM3e
Bandwidth: 4.8TB/s
Region: US-West / India
Scale: 1-8 GPUs
GPU buyer details
GPU sellers usually stop at price and availability. This page makes the extra TEE requirements visible: runtime boundary, confidential GPU mode, attestation, and operations.
Best fit
Access: 1-8 GPU slots
Commitment: Reserved H200 capacity
Enterprise: Multi-node private model clusters
TEE readiness checklist
01 / Capacity
GPU memory, bandwidth, region, and scale are visible before the sales call.
02 / Cloud path
Run through confidential VMs, bare metal paths, or enterprise deployments.
03 / TEE readiness
Intel TDX, NVIDIA confidential computing, drivers, BIOS, and verifier readiness are handled by Phala.
04 / Buying motion
Start with a 24-hour trial, reserve a slot, or quote a dedicated cluster.
GPU technical profile
Compared with the H100, the H200 is the better fit for larger LLM serving, long-context inference, protected batch jobs, and tuning runs that need more model or KV-cache memory without immediately moving to a Blackwell cluster.
Memory
141GB HBM3e gives larger models and longer contexts more room before sharding.
Bandwidth
4.8TB/s memory bandwidth helps keep high-memory inference and tuning jobs fed.
Scale
Use 1-8 GPU slots for reserved runs, benchmark windows, or production private model serving.
TEE layer
Phala pairs GPU capacity with CVM isolation, dual attestation evidence from Intel TDX and NVIDIA, and TEE-aware operations.
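As a rough illustration of the memory point above, the sketch below estimates whether model weights plus KV cache fit on a single GPU. The model shape (70B parameters, 80 layers, 8 KV heads, head dimension 128), the 8-bit weights, the 10% runtime reserve, and the function names are all assumptions chosen for illustration, not figures from Phala or NVIDIA. Gigabytes are treated as 10^9 bytes.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, context_tokens, batch_size, bytes_per_value=2):
    """Bytes needed to hold keys and values for every layer and cached token."""
    # The factor 2 covers one key tensor and one value tensor per layer.
    return 2 * layers * kv_heads * head_dim * context_tokens * batch_size * bytes_per_value


def fits_on_one_gpu(weight_bytes, kv_bytes, gpu_memory_gb, reserve_fraction=0.10):
    """True if weights plus KV cache fit after reserving a slice for runtime overhead."""
    usable = gpu_memory_gb * 1e9 * (1.0 - reserve_fraction)
    return weight_bytes + kv_bytes <= usable


# Hypothetical 70B-parameter model with 8-bit weights (1 byte per parameter).
weights = 70e9 * 1
# Hypothetical serving shape: 32k-token context, batch of 4, FP16 KV cache.
kv = kv_cache_bytes(layers=80, kv_heads=8, head_dim=128,
                    context_tokens=32_768, batch_size=4)

print(f"KV cache: {kv / 1e9:.1f} GB")
print("fits in 80GB: ", fits_on_one_gpu(weights, kv, 80))
print("fits in 141GB:", fits_on_one_gpu(weights, kv, 141))
```

With these illustrative numbers the weights and KV cache total roughly 113GB, which overflows an 80GB budget but fits within 141GB without sharding. Real deployments also need room for activations and framework overhead, so treat the result as a first-pass sizing check only.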
Comparison chart: NVIDIA H100 (80GB HBM3) vs. NVIDIA H200 (141GB HBM3e)
LLM inference (relative index): 1x vs. 1.9x
GPU memory (model + KV cache): 80GB vs. 141GB
Memory bandwidth (feed batches): 3.35TB/s vs. 4.8TB/s
Performance comparison
H100 is the practical starting point for confidential GPU trials. H200 gives larger models and longer contexts more room before sharding.
Use this comparison to decide whether the workload needs a short H100 validation window or reserved H200 high-memory capacity.
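To put the bandwidth figures (3.35TB/s vs. 4.8TB/s) in context, here is a minimal back-of-the-envelope sketch. It assumes a memory-bound decode loop in which every generated token requires streaming the full set of weights from GPU memory once, so bandwidth divided by bytes read gives an upper bound on decode steps per second. The 70GB weight size and the function name are hypothetical, and the sketch ignores KV-cache reads, batching, and compute limits.

```python
def max_decode_steps_per_second(bandwidth_tb_s, bytes_read_per_step):
    """Upper bound on decode steps per second when memory bandwidth is the limit."""
    return bandwidth_tb_s * 1e12 / bytes_read_per_step


# Hypothetical: 70GB of weights streamed once per decode step.
bytes_per_step = 70e9

h100 = max_decode_steps_per_second(3.35, bytes_per_step)
h200 = max_decode_steps_per_second(4.8, bytes_per_step)

print(f"H100 bound: {h100:.1f} steps/s")
print(f"H200 bound: {h200:.1f} steps/s")
print(f"bandwidth-only ratio: {h200 / h100:.2f}x")
```

Under this simplified model the bandwidth difference alone accounts for roughly a 1.4x gain. The 1.9x LLM inference index quoted on this page is not derived here; it presumably also reflects effects this sketch does not model, such as larger batches made possible by the extra memory.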
GPU buying paths
Trial a single machine, quote a reserved slot, or move into a dedicated cluster when the workload becomes production infrastructure.
01 / On-demand
Short test windows for builders validating private inference, model serving, or proof generation.
02 / Slot
Predictable GPU access for sustained training, fine-tuning, and benchmark windows.
03 / Enterprise
Custom H100, H200, or B300 deals with TEE-aware infrastructure support and deployment planning.
Proof path
The GPU is not sold as raw hardware. It is delivered through a confidential VM path with GPU confidential computing and dual attestation built in.
01
Docker workloads run inside an Intel TDX confidential VM with GPU passthrough. The runtime is sealed against the operator and measured by firmware before the workload starts.
02
NVIDIA Confidential Computing seals model weights, activations, and KV cache inside protected GPU memory. The GPU enforces compute isolation alongside the CPU TEE.
03
Intel TDX and NVIDIA each emit a signed quote. Phala collects both and exposes them through one verifier so the CVM and the GPU prove themselves together.
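The dual-attestation idea in the steps above can be sketched as a simple acceptance policy: trust the environment only when both the CPU-side and GPU-side evidence are present, valid, and bound to the same fresh nonce. Everything in this block is hypothetical, including the `Evidence` record, the source labels, and the function name. It does not reflect the actual output format or API of Phala's verifier, and signature verification of the Intel TDX and NVIDIA quotes is assumed to happen upstream.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    source: str            # hypothetical labels: "intel-tdx" or "nvidia-gpu"
    signature_valid: bool  # outcome of an upstream quote signature check (not shown)
    nonce: str             # freshness value the quote was bound to


REQUIRED_SOURCES = {"intel-tdx", "nvidia-gpu"}


def accept_dual_evidence(items, expected_nonce):
    """Accept only if both the CVM and the GPU supplied valid, fresh evidence."""
    by_source = {item.source: item for item in items}
    # Reject if either the CPU TEE or the GPU evidence is missing.
    if not REQUIRED_SOURCES.issubset(by_source):
        return False
    # Every required item must verify and match the challenge nonce.
    return all(
        by_source[name].signature_valid and by_source[name].nonce == expected_nonce
        for name in REQUIRED_SOURCES
    )


challenge = "nonce-123"  # hypothetical challenge issued by the relying party
evidence = [
    Evidence("intel-tdx", True, challenge),
    Evidence("nvidia-gpu", True, challenge),
]
print(accept_dual_evidence(evidence, challenge))
```

The design choice worth noting is the all-or-nothing rule: a valid CPU quote without matching GPU evidence (or the reverse) is rejected, since either half alone leaves part of the workload path unproven.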
Other confidential GPUs
Use the same marketplace model across H100, H200, and B300: capacity, price, region, and proof state stay visible.