Confidential AI Models

Private LLMs.
Verified results.

Name: Confidential AI Models
Brand: Phala
Price: 50.37 USD
Availability: InStock
Rating: 4.8 (127 reviews)

Frontier inference without exposing prompts, tools, or memory.

OpenAI-compatible APIs run inside hardware-backed TEEs and return proof of the runtime that handled the request.

Call a private LLM View models Talk to sales

AI calls carry more than prompt.

TEE boundary

Phala private LLM

same SDK

TEE endpoint

hardware receipt

Private LLM catalog

Frontier models with private runtime.

OpenAI-compatible models with hardware-backed privacy and verification. Keep your SDK flow, change the endpoint, and copy the real call when you need it.

encrypted

Z.ai: GLM 5.2

1.0M context

$1.40/M input

Check detail

encrypted

Qwen: Qwen3.6 27B

262K context

$0.32/M input

Check detail

encrypted

DeepSeek: DeepSeek V4 Flash

1.0M context

$0.20/M input

Check detail

encrypted

Qwen: Qwen3.5-122B-A10B

262K context

$0.46/M input

Check detail

encrypted

Qwen: Qwen3 32B

41K context

$0.12/M input

Check detail

encrypted

Google: Gemma 4 31B

262K context

$0.15/M input

Check detail

encrypted

Qwen: Qwen3.6 35B A3B

262K context

$0.20/M input

Check detail

encrypted

DeepSeek: DeepSeek V4 Pro

800K context

$1.50/M input

Check detail

encrypted

Phala: Gemma-4 26B-A4B Uncensored (Heretic)

66K context

$0.15/M input

Check detail

encrypted

Phala: Qwen3.6 35B-A3B Uncensored (Aggressive)

131K context

$0.30/M input

Check detail

encrypted

MoonshotAI: Kimi K2.6

262K context

$1.09/M input

Check detail

encrypted

Z.ai: GLM 5.1

203K context

$1.21/M input

Check detail

Model requests are routed through confidential AI providers with TEE support.

Check all

Integrate in minutes

Same SDK, Change Endpoint, Verify E2EE.

Keep your OpenAI-compatible client. Point it at the private endpoint, choose a Phala model slug, and read the proof when the output needs an audit trail.

selected proof

Private LLM Gateway

The OpenAI-compatible endpoint terminates inside the verified gateway boundary.

reporttls_endpointreceiptgateway_app_idstatusverified

app_idlinked

endpointlinked

policylinked

app_certlinked

drag · zoom · click node

View Trust Center Read model docs

AI solution paths

Use private models where AI touches secrets.

The private model endpoint is the first entry point. The same privacy primitive extends to agents, data workflows, and training.

LLM API

Private AI inference

Serve OpenAI-compatible model calls where prompts, outputs, and customer context need encrypted-in-use protection.

Open solution

encrypted

DeepSeek V3.1

128K

$0.27/M input

encrypted

Qwen3 Coder

256K

$0.40/M input

encrypted

Llama 3.3 70B

128K

$0.15/M input

encrypted

GPT OSS 120B

128K

$0.10/M input

encrypted

Claude Sonnet 4.5

200K

$3.00/M input

encrypted

Gemini 2.5 Pro

$1.25/M input

Agents

Private AI agents

Run agents with keys, tools, memory, and actions inside a verified runtime instead of a visible automation cloud.

Open solution

Training

Private model training

Adapt models on proprietary data while keeping datasets, gradients, checkpoints, and evaluation traces inside the boundary.

Open solution

private training run

Observe without exposing weights.

H100 CC

dataset

sealed

fine-tune

running

eval

private

checkpoint

verified

loss curve

proof attached

attestation.json

Data

Private AI data

Move models to sensitive records and return approved outputs without exposing raw data to the model operator.

Open solution

source

EHR data

source

Customer records

source

Internal docs

TEE clean room

query without raw access

approved output

aggregate only

no row exportproof linked

Questions

What teams ask before they switch.

Private LLMs are not just another endpoint. They are a deployment choice between SaaS convenience and self-operated AI infrastructure.

How is this different from a normal LLM API?

A normal LLM API asks you to trust the provider boundary. Phala runs the model call inside hardware-backed TEEs and can attach runtime proof showing what protected the request.

How is this different from running models on-prem?

On-prem gives control, but you operate GPUs, model serving, upgrades, and capacity. Phala keeps the API workflow while adding private execution and verifiable runtime state.

How difficult is it to integrate private LLMs into my existing app?

Use the OpenAI-compatible API shape: change the base URL, select a private model slug, and keep your existing SDK or agent framework.

What model types are available?

The catalog includes coding, reasoning, general chat, and open-weight model families from providers such as DeepSeek, Qwen, Meta, Mistral, Google, and OpenAI OSS.

How can customers verify that data was protected?

The Trust Center turns attestation reports into an inspectable view of hardware, source, runtime, and network verification state.

When should I use a dedicated private stack?

Use a dedicated stack when you need custom models, reserved GPUs, customer-specific deployments, or a stronger compliance and audit boundary than shared inference.

Start building

Build AI you can prove.

Deploy private workloads, verify execution, and scale from models to GPU jobs.

Start building Open dashboard Talk to sales

Private execution. Verifiable results.

Newsletter

Private LLMs.Verified results.

AI calls carry more than prompt.

Phala private LLM

Traditional AI

Confidential AI

Frontier models with private runtime.

Same SDK, Change Endpoint, Verify E2EE.

Private LLM Gateway

Private LLM Gateway

Gateway Code

Gateway Attestation

Event Logs for RTMRs

Key Management Service

KMS Code

Model Runtime

Response Receipt

Use private models where AI touches secrets.

Private AI inference

DeepSeek V3.1

Qwen3 Coder

Llama 3.3 70B

GPT OSS 120B

Claude Sonnet 4.5

Gemini 2.5 Pro

Private AI agents

Private model training

Observe without exposing weights.

dataset

fine-tune

eval

checkpoint

Private AI data

query without raw access

aggregate only

What teams ask before they switch.

How is this different from a normal LLM API?

How is this different from running models on-prem?

How difficult is it to integrate private LLMs into my existing app?

What model types are available?

How can customers verify that data was protected?

When should I use a dedicated private stack?

Build AI you can prove.

Private LLMs.
Verified results.