GPU TEE Marketplace

TEE-ready GPUs for AI builders.

Name: GPU TEE - Confidential GPU Computing
Brand: Phala
Price: 50.37 USD
Availability: InStock
Rating: 4.8 (127 reviews)

H100, H200, and B300 capacity with CVMs, dual attestation, and TEE-aware operations.

Trial a machine for 24 hours, reserve a slot, or quote dedicated clusters. Phala handles the hard part: confidential GPUs, Intel TDX runtime, NVIDIA attestation, and the DevOps required to keep it working.

Trial nowQuote price

Confidential GPU cloud

Capacity first. Proof after the workload runs.

H100 / H200 / B300

hardware proof rail

H100, H200, and B300 move through one verifiable GPU path.

GPU TEE

H100

GPU TEE

H200

GPU TEE

B300

Trial

24h minimum

Reserve

slots and clusters

Verify

CVM + GPU evidence

Marketplace inventory

Capacity with proof built in.

Pick a GPU for a 24-hour trial, reserve a slot for sustained jobs, or quote a dedicated cluster. Every path starts from TEE-ready infrastructure instead of a raw GPU box.

Trial ready

NVIDIA H100

Proven confidential inference and fine-tuning capacity.

Memory

80GB HBM3

Bandwidth

3.35 TB/s

Region

US-West

Scale

1-2 GPUs

On-demand

$3.08/GPU/hr

24h minimum

Slot

$2.38/GPU/hr

reserved

Trial now Details

Slot ready

NVIDIA H200

High-memory runtime for larger private model jobs.

Memory

141GB HBM3e

Bandwidth

4.8 TB/s

Region

US-West / India

Scale

1-8 GPUs

On-demand

$4.80/GPU/hr

24h minimum

Slot

$3.20/GPU/hr

reserved

Trial now Details

Quote now

NVIDIA B300

Blackwell Ultra confidential capacity for frontier inference.

Memory

288GB HBM3e

Bandwidth

8 TB/s

Region

US-East / US-West

Scale

1-8 GPUs

1-month

$6.50/GPU/hr

30d minimum

Slot

$5.60/GPU/hr

reserved

Trial now Details

Prices include Intel TDX + NVIDIA confidential computing readiness. Volume and enterprise pricing are quoted by workload.

Quote price

Performance metrics for private AI GPU planning

H100

H200

B300

relative index

H100

1.9x

H200

3.2x

B300

LLM inference

model + KV cache

80GB

H100

141GB

H200

288GB

B300

GPU memory

feed batches

3.35TB/s

H100

4.8TB/s

H200

8TB/s

B300

Memory bandwidth

NVIDIA H100

80GB HBM3

NVIDIA H200

141GB HBM3e

NVIDIA B300

288GB HBM3e

GPU comparison

H100 vs H200 vs B300

Compare the capacity shape before the quote. H100 is the fast trial path, H200 adds memory headroom, and B300 is the Blackwell Ultra path for frontier inference and dedicated clusters.

Exact throughput depends on model, batch size, precision, and runtime. Phala quotes the GPU together with the confidential VM path, GPU CC readiness, and attestation operations.

GPU cloud mockup

Capacity lanes with proof state.

The marketplace view should make the buying motion obvious: trial, reserve, then scale into a dedicated cluster with TEE readiness attached.

H100

80GB HBM3

from

$3.08/hr

H200

141GB HBM3e

from

$4.80/hr

B300

288GB HBM3e

from

$6.50/hr

verified

CVM runtime

verified

GPU CC mode

verified

Dual attestation

GPU TEE proof path

What Phala handles for the CVM path.

GPU isolation is only useful when the entire path — runtime, GPU mode, and evidence collection — is verifiable end-to-end. Phala delivers all three together.

                                                                                
                                                                                
                                                                                
                                                                                
                         ++++++++++++++++++++++++++++++                         
                         ++++++++++++++++++++++++++++++                         
                         ++++++++++++++++++++++++++++++                         
                         ++++++++++++++++++++++++++++++                         
                         ++++++++++++++++++++++++++++++                         
                         ++++++++++++++++++++++++++++++                         
                         ++++++++++++++++++++++++++++++                         
                         ++++++++++++++++++++++++++++++                         
                         ++++++++++++++++++++++++++++++                         
                         ++++++++++++++++++++++++++++++                         
                         ++++++++++++++++++++++++++++++                         
                         ++++++++++++++++++++++++++++++                         
                         ++++++++++++++++++++++++++++++                         
                         ++++++++++++++++++++++++++++++                         
                         ++++++++++++++++++++++++++++++                         
                         ++++++++++++++++++++++++++++++

cvm-enclave · 80×24 · 24fpsdensity: .:-=+*#%@

CVM runtime

Docker workloads run inside an Intel TDX confidential VM with GPU passthrough. The runtime is sealed against the operator and measured by firmware before the workload starts.

                                                                                
                                                                                
      @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@      
      @                                                                  @      
      @  @@@@ @@@@ @@@@ @@@@ @@@@ @@@@ @@@@ @@@@ @@@@ @@@@ @@@@ @@@@ @   @      
      @  ...: :::: :... .::: :::: ...: :::: ::.. .::: :::: ...: :::: :   @      
      @  ...: :::: :... .::: :::: ...: :::: ::.. .::: :::: ...: :::: :   @      
      @  ...: :::: :... .::: :::: ...: :::: ::.. .::: :::: ...: :::: :   @      
      @  ...: :::: :... .::: :::: ...: :::: ::.. .::: :::: ...: :::: :   @      
      @  ...: :::: :... .::: :::: ...: :::: ::.. .::: :::: ...: :::: :   @      
      @  ...: :::: :... .::: :::: ...: :::: ::.. .::: :::: ...: :::: :   @      
      @  ...: :::: :... .::: :::: ...: :::: ::.. .::: :::: ...: :::: :   @      
      @  ...: :::: :... .::: :::: ...: :::: ::.. .::: :::: ...: :::: :   @      
      @  ...: :::: :... .::: :::: ...: :::: ::.. .::: :::: ...: :::: :   @      
      @  ...: :::: :... .::: :::: ...: :::: ::.. .::: :::: ...: :::: :   @      
      @  ...: :::: :... .::: :::: ...: :::: ::.. .::: :::: ...: :::: :   @      
      @  ...: :::: :... .::: :::: ...: :::: ::.. .::: :::: ...: :::: :   @      
      @  ...: :::: :... .::: :::: ...: :::: ::.. .::: :::: ...: :::: :   @      
      @  ...: :::: :... .::: :::: ...: :::: ::.. .::: :::: ...: :::: :   @      
      @                                                                  @      
      @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

gpu-cc · 80×22 · 24fpsdensity: .:-=+*#%@

GPU CC mode

NVIDIA Confidential Computing seals model weights, activations, and KV cache inside protected GPU memory. The GPU enforces compute isolation alongside the CPU TEE.

                                                                                
                                                                                
                                                                                
     @@@@@@@@@@@@                                                               
     @=--::--=++@+-:.                                                           
     @=--::--=++@#*+-::....                                                     
     @@@@@@@@@@@@        ...........                                            
                                   ..........                  @@@@@@@@@@@@@    
                                            ...........        @======++***@    
                                                     ..........@=====++****@    
                                                     ..........@====++*****@    
                                       .::::...........        @===++******@    
                                  .:-*%@@%*-:.                 @@@@@@@@@@@@@    
     @@@@@@@@@@@@        ...........::::.                                       
     @:::-==++++@..........                                                     
     @:::-==++++@                                                               
     @@@@@@@@@@@@

dual-attestation · 80×20 · 24fpsdensity: .:-=+*#%@

Dual attestation

Intel TDX and NVIDIA each emit a signed quote. Phala collects both and exposes them through one verifier so the CVM and the GPU prove themselves together.

Buying paths

Start small. Reserve when it works.

The marketplace is structured around how AI builders actually buy GPUs: test quickly, reserve capacity when a workload proves out, then move to enterprise deals when the cluster becomes production-critical.

01 / On-demand

Trial a confidential GPU in 24 hours.

Short test windows for builders validating private inference, model serving, or proof generation.

Trial now

02 / Slot

Reserve capacity before the next run.

Predictable GPU access for sustained training, fine-tuning, and benchmark windows.

Quote price

03 / Enterprise

Dedicated clusters with TEE operations.

Custom H100, H200, or B300 deals with TEE-aware infrastructure support and deployment planning.

Talk to sales

AI solution paths

Use GPU TEE where AI touches secrets.

GPU capacity is one part of the privacy boundary. The same confidential compute path supports private inference, agents, training, and data workflows.

LLM API

Private AI inference

Serve OpenAI-compatible model calls where prompts, outputs, and customer context need encrypted-in-use protection.

Open solution

Agents

Private AI agents

Run agents with keys, tools, memory, and actions inside a verified runtime instead of a visible automation cloud.

Open solution

Training

Private model training

Adapt models on proprietary data while keeping datasets, gradients, checkpoints, and evaluation traces inside the boundary.

Open solution

Data

Private AI data

Move models to sensitive records and return approved outputs without exposing raw data to the model operator.

Open solution

Private execution. Verifiable results.

Newsletter

GPU TEE Cloud — H100/H200/B300 Confidential AI | Phala