NVIDIA H200 GPU TEE

H200 slots for private models.

High-memory H200 capacity for larger LLM serving, tuning, and protected batch jobs.

H200 TEE capacity with 141GB HBM3e memory, confidential GPU mode, CVM runtime, and TEE operations.

Capacity cell

Slot ready
hardware proof rail

The H200 moves through the same verifiable proof path as the other confidential GPUs: a confidential VM, GPU confidential computing, and dual attestation.

GPU TEE: H200, 141GB HBM3e, TEE proof

Memory: 141GB HBM3e
Bandwidth: 4.8TB/s
Region: US-West / India
Scale: 1-8 GPUs

GPU buyer details

More than a GPU quote.

GPU sellers usually stop at price and availability. This page makes the extra TEE requirements visible: runtime boundary, confidential GPU mode, attestation, and operations.

Best fit

GPU workload shape

01 Large LLM serving
02 Fine-tuning windows
03 High-memory batch jobs

Access: 1-8 GPU slots
Commitment: Reserved H200 capacity
Enterprise: Multi-node private model clusters

TEE readiness checklist

01 / Capacity

GPU memory, bandwidth, region, and scale are visible before the sales call.

02 / Cloud path

Run through confidential VMs, bare metal paths, or enterprise deployments.

03 / TEE readiness

Intel TDX, NVIDIA confidential computing, drivers, BIOS, and verifier readiness are handled by Phala.

04 / Buying motion

Start with a 24-hour trial, reserve a slot, or request a quote for a dedicated cluster.

GPU technical profile

Use H200 when memory is the bottleneck.

H200 is the better fit for larger LLM serving, long-context inference, protected batch jobs, and tuning runs that need more model or KV-cache memory without immediately moving to a Blackwell cluster.

Memory

141GB HBM3e gives larger models and longer contexts more room before sharding.

Bandwidth

4.8TB/s memory bandwidth helps keep high-memory inference and tuning jobs fed.

Scale

Use 1-8 GPU slots for reserved runs, benchmark windows, or production private model serving.

TEE layer

Phala pairs GPU capacity with CVM isolation, dual evidence, and TEE-aware operations.
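
As a rough sizing sketch for the memory point above, the Python below estimates weight and KV-cache memory and checks the total against 80GB and 141GB. The model shape (70B parameters, 80 layers, 8 KV heads, head dimension 128), FP8 weights, FP16 KV cache, and the 90% usable-memory headroom are illustrative assumptions, not figures published by Phala or NVIDIA.

```python
def weights_gb(params_billion: float, bytes_per_param: float) -> float:
    """Approximate weight memory in GB (1 GB = 1e9 bytes)."""
    return params_billion * bytes_per_param


def kv_cache_gb(layers: int, kv_heads: int, head_dim: int,
                context_tokens: int, batch: int,
                bytes_per_value: int = 2) -> float:
    """Approximate KV-cache memory in GB.

    Each token stores one K and one V vector per layer, hence the factor 2.
    """
    total_bytes = (2 * layers * kv_heads * head_dim
                   * context_tokens * batch * bytes_per_value)
    return total_bytes / 1e9


def fits_on_gpu(total_gb: float, gpu_gb: float, headroom: float = 0.9) -> bool:
    """True if the workload fits in an assumed usable fraction of GPU memory."""
    return total_gb <= gpu_gb * headroom


# Hypothetical workload: 70B parameters in FP8, 32k context, batch of 4.
weights = weights_gb(params_billion=70, bytes_per_param=1.0)       # 70 GB
kv = kv_cache_gb(layers=80, kv_heads=8, head_dim=128,
                 context_tokens=32_768, batch=4)                   # about 43 GB
total = weights + kv                                               # about 113 GB

print(f"total: {total:.1f} GB")
print("fits on one 80GB GPU: ", fits_on_gpu(total, 80))    # False
print("fits on one 141GB GPU:", fits_on_gpu(total, 141))   # True
```

Under these assumptions the workload would need sharding on an 80GB card but fits on a single 141GB card. Real deployments also need room for activations and runtime overhead, so treat the result as a first-pass filter.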

Performance metrics for private AI GPU planning

Metric                             H100         H200
LLM inference (relative index)     1x           1.9x
GPU memory (model + KV cache)      80GB HBM3    141GB HBM3e
Memory bandwidth (feed batches)    3.35TB/s     4.8TB/s

Performance comparison

NVIDIA H200 vs NVIDIA H100

H100 is the practical starting point for confidential GPU trials. H200 gives larger models and longer contexts more room before sharding.

Use this comparison to decide whether the workload needs a short H100 validation window or reserved H200 high-memory capacity.
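
One way to reason about the bandwidth gap is a simple memory-bound estimate: during single-stream decoding, each generated token has to read roughly all model weights from GPU memory once, so tokens per second cannot exceed bandwidth divided by weight size. The sketch below applies that bound to the two bandwidth figures on this page. The 70GB weight size is an illustrative assumption, and the bound ignores KV-cache reads, compute time, and batching, so it is an upper limit rather than a benchmark.

```python
def decode_tokens_per_second_bound(bandwidth_tb_s: float, weights_gb: float) -> float:
    """Upper bound on single-stream decode rate for a memory-bound model.

    Assumes every token reads all weights once and nothing else limits speed.
    """
    bytes_per_second = bandwidth_tb_s * 1e12
    bytes_per_token = weights_gb * 1e9
    return bytes_per_second / bytes_per_token


ASSUMED_WEIGHTS_GB = 70  # hypothetical model size, not a vendor figure

h100_bound = decode_tokens_per_second_bound(3.35, ASSUMED_WEIGHTS_GB)
h200_bound = decode_tokens_per_second_bound(4.8, ASSUMED_WEIGHTS_GB)

print(f"H100 bound: {h100_bound:.1f} tokens/s")          # 47.9
print(f"H200 bound: {h200_bound:.1f} tokens/s")          # 68.6
print(f"bandwidth ratio: {h200_bound / h100_bound:.2f}x")  # 1.43x
```

The bandwidth ratio alone is about 1.43x under this model. It is a planning heuristic only and is not how the relative index in the table was derived.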

GPU buying paths

Pick the buying path for the job.

Trial a single machine, request a quote for a reserved slot, or move into a dedicated cluster when the workload becomes production infrastructure.

01 / On-demand

Trial a confidential GPU for 24 hours.

Short test windows for builders validating private inference, model serving, or proof generation.

02 / Slot

Reserve capacity before the next run.

Predictable GPU access for sustained training, fine-tuning, and benchmark windows.

03 / Enterprise

Dedicated clusters with TEE operations.

Custom H100, H200, or B300 deals with TEE-aware infrastructure support and deployment planning.

Proof path

This GPU is useful because it is verifiable.

The GPU is not sold as raw hardware. It is delivered through a confidential VM path with GPU confidential computing and dual attestation built in.

01 / CVM runtime

Docker workloads run inside an Intel TDX confidential VM with GPU passthrough. The runtime is sealed against the operator and measured by firmware before the workload starts.
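
The "measured before the workload starts" step can be pictured as a hash chain. The sketch below is a generic measured-boot extend operation using SHA-384, the digest size used by Intel TDX measurement registers. The component names are hypothetical, and in a real CVM the registers are maintained by the TDX module and firmware rather than by application code.

```python
import hashlib

REGISTER_SIZE = 48  # SHA-384 digest length in bytes


def extend(register: bytes, event: bytes) -> bytes:
    """Fold one measured component into the register.

    new_register = SHA384(old_register || SHA384(event))
    """
    event_digest = hashlib.sha384(event).digest()
    return hashlib.sha384(register + event_digest).digest()


def measure(events: list[bytes]) -> bytes:
    """Replay a sequence of measured components from an all-zero register."""
    register = bytes(REGISTER_SIZE)
    for event in events:
        register = extend(register, event)
    return register


# Hypothetical boot chain; the labels are placeholders, not real image names.
boot_events = [b"firmware-image", b"kernel-image", b"docker-compose.yaml"]
final = measure(boot_events)

# Changing any component, or the order, yields a different final value.
tampered = measure([b"firmware-image", b"kernel-image", b"docker-compose-modified.yaml"])
reordered = measure([b"kernel-image", b"firmware-image", b"docker-compose.yaml"])

print(final.hex())
print("tampered matches: ", tampered == final)    # False
print("reordered matches:", reordered == final)   # False
```

Because each step hashes the previous register value, a verifier that knows the expected components can replay the chain and compare the result with the value reported in the signed quote.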

02 / GPU CC mode

NVIDIA Confidential Computing seals model weights, activations, and KV cache inside protected GPU memory. The GPU enforces compute isolation alongside the CPU TEE.

03 / Dual attestation

Intel TDX and NVIDIA each emit a signed quote. Phala collects both and exposes them through one verifier so the CVM and the GPU prove themselves together.
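
A client-side acceptance policy for the dual-attestation idea might look like the sketch below. It is not Phala's verifier API: the `Evidence` type, its fields, the source labels, and the measurement strings are all hypothetical, and the vendor signature check is represented by a boolean that a real verifier would compute from the Intel and NVIDIA certificate chains.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    source: str          # hypothetical label: "intel-tdx" or "nvidia-gpu"
    signature_ok: bool   # outcome of the vendor signature check, done elsewhere
    nonce: str           # freshness value the client supplied
    measurement: str     # reported measurement, hex-encoded


def accept_dual_attestation(cpu: Evidence, gpu: Evidence, expected_nonce: str,
                            trusted_cpu: set[str], trusted_gpu: set[str]) -> bool:
    """Accept only if BOTH pieces of evidence pass every check."""
    checks = [
        cpu.source == "intel-tdx",
        gpu.source == "nvidia-gpu",
        cpu.signature_ok and gpu.signature_ok,
        # Same nonce in both ties the CPU and GPU evidence to one request.
        cpu.nonce == expected_nonce and gpu.nonce == expected_nonce,
        cpu.measurement in trusted_cpu,
        gpu.measurement in trusted_gpu,
    ]
    return all(checks)


# Placeholder values for illustration only.
cpu_ok = Evidence("intel-tdx", True, "nonce-123", "aa11")
gpu_ok = Evidence("nvidia-gpu", True, "nonce-123", "bb22")
gpu_stale = Evidence("nvidia-gpu", True, "nonce-old", "bb22")

print(accept_dual_attestation(cpu_ok, gpu_ok, "nonce-123", {"aa11"}, {"bb22"}))     # True
print(accept_dual_attestation(cpu_ok, gpu_stale, "nonce-123", {"aa11"}, {"bb22"}))  # False
```

The design point is that a valid CPU quote alone, or a valid GPU quote alone, is never enough: a failure on either side rejects the whole environment.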

Other confidential GPUs

Compare the next capacity path.

Use the same marketplace model across H100, H200, and B300: capacity, price, region, and proof state stay visible.
