NVIDIA H200 GPU TEE

H200 slots for private models.

Name: NVIDIA H200 GPU TEE
Brand: Phala
Price: 50.37 USD
Availability: InStock
Rating: 4.8 (127 reviews)

High-memory H200 capacity for larger LLM serving, tuning, and protected batch jobs.

H200 TEE capacity with 141GB HBM3e memory, confidential GPU mode, CVM runtime, and TEE operations.

Trial now | Quote price

Capacity cell

Slot ready

hardware proof rail

This GPU moves through the same verifiable proof path as the other confidential GPUs.

GPU TEE: H200, 141GB HBM3e, TEE proof

Memory: 141GB HBM3e
Bandwidth: 4.8 TB/s
Region: US-West / India
Scale: 1-8 GPUs

GPU buyer details

More than a GPU quote.

GPU sellers usually stop at price and availability. This page makes the extra TEE requirements visible: runtime boundary, confidential GPU mode, attestation, and operations.

Best fit

GPU workload shape

01 Large LLM serving
02 Fine-tuning windows
03 High-memory batch jobs

Access: 1-8 GPU slots
Commitment: Reserved H200 capacity
Enterprise: Multi-node private model clusters

TEE readiness checklist

Capacity

GPU memory, bandwidth, region, and scale are visible before the sales call.

Cloud path

Run through confidential VMs, bare metal paths, or enterprise deployments.

TEE readiness

Intel TDX, NVIDIA confidential computing, drivers, BIOS, and verifier readiness are handled by Phala.

Buying motion

Start with a 24-hour trial, reserve a slot, or quote a dedicated cluster.

GPU technical profile

Use H200 when memory is the bottleneck.

H200 is the better fit for larger LLM serving, long-context inference, protected batch jobs, and tuning runs that need more model or KV-cache memory without immediately moving to a Blackwell cluster.
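As a rough way to check whether memory is the bottleneck, the sketch below estimates weights plus KV cache and the number of GPUs that implies at 141GB versus 80GB. The model shape (80 layers, 8 KV heads, head dimension 128), 8-bit weights, 32K context, batch of 4, and the 90% usable-memory headroom are illustrative assumptions, not figures from Phala or NVIDIA.

```python
import math


def weights_gb(params_billion: float, bytes_per_param: float) -> float:
    """Weight memory in GB: billions of parameters times bytes per parameter."""
    return params_billion * bytes_per_param


def kv_cache_gb(layers: int, kv_heads: int, head_dim: int,
                context_tokens: int, batch: int, bytes_per_value: int = 2) -> float:
    """KV cache in GB: one key and one value vector per layer, per token, per sequence."""
    values = 2 * layers * kv_heads * head_dim * context_tokens * batch
    return values * bytes_per_value / 1e9


def gpus_needed(total_gb: float, gpu_gb: float, headroom: float = 0.9) -> int:
    """GPUs required if only `headroom` of each GPU's memory is usable (assumption)."""
    return math.ceil(total_gb / (gpu_gb * headroom))


# Hypothetical 70B-parameter model served with 8-bit weights and an fp16 KV cache.
model_gb = weights_gb(params_billion=70, bytes_per_param=1)
cache_gb = kv_cache_gb(layers=80, kv_heads=8, head_dim=128,
                       context_tokens=32_768, batch=4)
total_gb = model_gb + cache_gb

print(f"weights {model_gb:.1f} GB + KV cache {cache_gb:.1f} GB = {total_gb:.1f} GB")
print("H200 (141GB) GPUs needed:", gpus_needed(total_gb, 141))
print("H100 (80GB) GPUs needed:", gpus_needed(total_gb, 80))
```

Under these assumptions the job needs about 113GB, which fits on one H200 but would have to be sharded across two H100s. The estimate ignores activation memory and the overhead that sharding itself adds, so treat it as a first filter rather than a sizing guarantee.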

See buying paths | Ask for cluster fit

Memory

141GB HBM3e gives larger models and longer contexts more room before sharding.

Bandwidth

4.8TB/s memory bandwidth helps keep high-memory inference and tuning jobs fed.

Scale

Use 1-8 GPU slots for reserved runs, benchmark windows, or production private model serving.

TEE layer

Phala pairs GPU capacity with CVM isolation, dual attestation evidence from Intel TDX and NVIDIA, and TEE-aware operations.
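To make the bandwidth figure concrete, here is a back-of-the-envelope sketch using a common approximation: during single-sequence decoding, each generated token reads every weight once, so memory bandwidth divided by weight size gives a ceiling on tokens per second. The 70GB weight size is a hypothetical example, and real throughput depends on batching, kernels, and confidential-computing overhead that this ignores.

```python
def decode_tokens_per_second_ceiling(bandwidth_tb_s: float, weights_gb: float) -> float:
    """Upper bound on decode rate if every token requires one full pass over the weights."""
    return bandwidth_tb_s * 1000.0 / weights_gb  # TB/s -> GB/s, then divide by GB per token


WEIGHTS_GB = 70.0  # hypothetical model size, e.g. 70B parameters at 8 bits each

h100_ceiling = decode_tokens_per_second_ceiling(3.35, WEIGHTS_GB)
h200_ceiling = decode_tokens_per_second_ceiling(4.8, WEIGHTS_GB)

print(f"H100 ceiling: {h100_ceiling:.1f} tokens/s")
print(f"H200 ceiling: {h200_ceiling:.1f} tokens/s")
print(f"H200 / H100: {h200_ceiling / h100_ceiling:.2f}x")
```

With these inputs the ceiling is roughly 48 tokens/s on H100 and 69 tokens/s on H200, about a 1.43x gap from bandwidth alone.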

Performance metrics for private AI GPU planning

LLM inference (relative index): H100 baseline, H200 1.9x
GPU memory (model + KV cache): H100 80GB HBM3, H200 141GB HBM3e
Memory bandwidth (feed batches): H100 3.35TB/s, H200 4.8TB/s

Performance comparison

NVIDIA H200 vs NVIDIA H100

H100 is the practical starting point for confidential GPU trials. H200 gives larger models and longer contexts more room before sharding.

Use this comparison to decide whether the workload needs a short H100 validation window or reserved H200 high-memory capacity.

GPU buying paths

Pick the buying path for the job.

Trial a single machine, quote a reserved slot, or move into a dedicated cluster when the workload becomes production infrastructure.

01 / On-demand

Trial a confidential GPU in 24 hours.

Short test windows for builders validating private inference, model serving, or proof generation.

Trial now

02 / Slot

Reserve capacity before the next run.

Predictable GPU access for sustained training, fine-tuning, and benchmark windows.

Quote price

03 / Enterprise

Dedicated clusters with TEE operations.

Custom H100, H200, or B300 deals with TEE-aware infrastructure support and deployment planning.

Talk to sales

Proof path

This GPU is useful because it is verifiable.

The GPU is not sold as raw hardware. It is delivered through a confidential VM path with GPU confidential computing and dual attestation built in.

                                                                                
                                                                                
                                                                                
                                                                                

CVM runtime

Docker workloads run inside an Intel TDX confidential VM with GPU passthrough. The runtime is sealed against the operator and measured by firmware before the workload starts.

                                                                                
                                                                                

GPU CC mode

NVIDIA Confidential Computing seals model weights, activations, and KV cache inside protected GPU memory. The GPU enforces compute isolation alongside the CPU TEE.

                                                                                
                                                                                
                                                                                

Dual attestation

Intel TDX and NVIDIA each emit a signed quote. Phala collects both and exposes them through one verifier so the CVM and the GPU prove themselves together.
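The "prove themselves together" idea can be sketched as an acceptance policy: trust the deployment only when both quotes verified and both are bound to the same fresh nonce. This is a minimal illustration, not Phala's verifier. The `Evidence` type, the source labels, and `accept_dual_attestation` are hypothetical names, and the cryptographic verification of each quote is assumed to happen elsewhere with the vendors' own tooling.

```python
import hmac
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    source: str         # "intel-tdx" or "nvidia-gpu" (labels chosen for this sketch)
    signature_ok: bool  # outcome of vendor-specific quote verification, done elsewhere
    nonce: str          # freshness value the quote claims to be bound to


def accept_dual_attestation(cpu: Evidence, gpu: Evidence, expected_nonce: str) -> bool:
    """Accept only if both quotes verified and both carry the nonce we issued."""
    for evidence, expected_source in ((cpu, "intel-tdx"), (gpu, "nvidia-gpu")):
        if evidence.source != expected_source or not evidence.signature_ok:
            return False
        # Constant-time comparison avoids leaking how much of the nonce matched.
        if not hmac.compare_digest(evidence.nonce.encode(), expected_nonce.encode()):
            return False
    return True


nonce = secrets.token_hex(16)  # fresh challenge issued by the relying party
cpu_quote = Evidence("intel-tdx", signature_ok=True, nonce=nonce)
gpu_quote = Evidence("nvidia-gpu", signature_ok=True, nonce=nonce)
stale_gpu_quote = Evidence("nvidia-gpu", signature_ok=True, nonce="0" * 32)

print(accept_dual_attestation(cpu_quote, gpu_quote, nonce))        # both pass
print(accept_dual_attestation(cpu_quote, stale_gpu_quote, nonce))  # GPU quote is stale
```

Requiring the same nonce in both quotes is what ties the CVM and the GPU to one session; a valid CPU quote paired with a replayed GPU quote is rejected.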

Other confidential GPUs

Compare the next capacity path.

Use the same marketplace model across H100, H200, and B300: capacity, price, region, and proof state stay visible.

Trial ready

NVIDIA H100

Proven confidential inference and fine-tuning capacity.

$3.08/GPU/hr | view details

Quote now

NVIDIA B300

Blackwell Ultra confidential capacity for frontier inference.

$6.50/GPU/hr | view details

Private execution. Verifiable results.
