隐私推理

在不暴露提示词的情况下进行推理。

兼容 OpenAI。签名收据。天然无日志。

你的想法只属于你

Private LLM catalog

Frontier models with private runtime.

OpenAI-compatible models with hardware-backed privacy and verification. Keep your SDK flow, change the endpoint, and copy the real call when you need it.

加密的

Qwen: Qwen3.5-122B-A10B

262K 上下文

$0.46/M input

查看详情
加密的

Qwen: Qwen3 32B

41K 上下文

$0.12/M input

查看详情
加密的

Google: Gemma 4 31B

262K 上下文

$0.15/M input

查看详情
加密的

Qwen: Qwen3.6 35B A3B

262K 上下文

$0.20/M input

查看详情
加密的

DeepSeek: DeepSeek V4 Pro

800K 上下文

$1.50/M input

查看详情
加密的

Phala: Gemma-4 26B-A4B Uncensored (Heretic)

66K 上下文

$0.15/M input

查看详情
加密的

Phala: Qwen3.6 35B-A3B Uncensored (Aggressive)

131K 上下文

$0.30/M input

查看详情
加密的

MoonshotAI: Kimi K2.6

262K 上下文

$1.09/M input

查看详情
加密的

Z.ai: GLM 5.1

203K 上下文

$1.21/M input

查看详情
加密的

Qwen: Qwen3.5-27B

262K 上下文

$0.30/M input

查看详情
加密的

Qwen: Qwen3.5 397B A17B

262K 上下文

$0.55/M input

查看详情
加密的

MiniMax: MiniMax M2.5

197K 上下文

$0.20/M input

查看详情
Model requests are routed through confidential AI providers with TEE support.
Check all

Private inference, by construction.

What you say to the model stays between your client and an attested CVM. Three primitives — encryption, TEE, no-logs — make that a property of the build, not a promise.

End-to-End Encryption

  • AES-GCM ciphertext on the wire, both hops
  • RA-TLS terminates inside the CVM, not at a load balancer
  • No plaintext intermediary on the host
工作原理

逐步查看单个请求的端到端流程。

关闭 dstack,看看究竟失去了哪项保障。

dstack 上的隐私推理

进入一组已证明的模型 CVM 的双跳 RA-TLS —— 可验证,天然无日志

1
步骤 1 / 5

发送任何字节前先验证构建

客户端 SDK 获取每个候选 CVM 的 TDX quote,并在本地运行 dcap-qvl——确认构建与 DstackApp.sol 中的无日志条目匹配。信任决策在客户端完成;Phala 不需要为自己背书。

With dstack: 用户持有信任根,锚定在 Intel 的 TDX 硬件签名之上。

同一 SDK。同一端点。默认机密。

cURL · 即插即用

Hit api.redpill.ai/v1/chat/completions with the OpenAI request shape. Receipt headers come back on every response — even from curl.

cURL
$ curl https://api.redpill.ai \/v1/chat/completions \-H "Authorization: …" \-d '{"model":…}'x-phala-receipt-sig: 0x9c..x-phala-compose-hash: 0xa1..
PYTHON
from openai import OpenAIc = OpenAI(base_url="…redpill.ai/v1",api_key=RP_KEY)r = c.chat.completions.create(…)

OpenAI Python SDK

`base_url="https://api.redpill.ai/v1"` and you’re done. Existing code keeps working; receipts attach to the response object.

一个统一的验证器

Whether the model runs on Intel TDX + H100 or AMD SEV + B300, the receipt format is identical. One verification path covers your whole TEE-LLM fleet — even when you mix providers.

UNIFIED PROOF
unified verifierall match
phalaLlama 3.1
near aiDeepSeek V3
tinfoilQwen2.5
chutesMistral
one format · any provider
OPENROUTER
openrouter · phala2026-06-01
3Btokens / day
Llama · open$0.40 / M
Llama · phala$0.40 / M
DeepSeek · open$0.27 / M
DeepSeek · phala$0.27 / M

隐私不加价

Confidential routes through Phala on OpenRouter price the same as the open route. Privacy is no longer a procurement line item — just a header you opted into.

two-hop RA-TLS · X.509 with TDX-quote extension

tunneled · no plaintext intermediary

hop 01 · client → gateway

CN=phala-gatewayTDX-quote ext (1.3.6.1.4.1…)

hop 02 · gateway → model CVM

CN=vllm-llama-3.1-70bTDX+H100 quote ext
RA-TLSmTLSX.509tunneled

两跳 RA-TLS,直达模型

The first TLS hop terminates inside the dstack-gateway CVM (whose certificate carries its TDX quote). The second terminates inside the model CVM. There is no plaintext intermediary — just two confidential VMs whose X.509 certificates ARE their attestations.

response · /v1/chat/completions

200
x-phala-receipt-sig0x9c1a…f7e2x-phala-compose-hash0xa1b2…d1f3x-phala-app-idvllm-llama-3.1-70bx-phala-no-logtrue · by build
verify offlinechains to DstackApp.sol

签名回执 + 链上 compose-hash,每次响应都包含

Every response carries x-phala-receipt-sig + x-phala-compose-hash. The signature chains to the TDX root and the on-chain DstackApp.sol entry — verify offline that the build that ran is the build that was registered.

in production today · 3 live partners

Confidential inference, in production.

OpenRouter routes its enterprise tier through Phala. NEAR AI ships verifiable agent inference. OODA AI runs decentralized GPU TEE.

01enterprise · live

OpenRouter

enterprise tier · drop-in

Drop-in OpenAI-compatible endpoint with verifiable, no-log routing. The receipt is the audit trail.

18B+ tokens

no-log · verified routing

02web3 · live

NEAR AI

verifiable agent inference

Verifiable agent inference for autonomous, on-chain workflows. Every model call lands on-chain with proof.

100% receipts

on-chain verified · zk inference

03public-co · live

OODA AI

NASDAQ-listed · decentralized GPUs

Decentralized GPUs with hardware attestation guarantees. No host root, no off-band access, no policy promises.

12M tokens / day

TDX + H100 · hardware-attested

OpenAI-compatible

drop-in /v1 surface

TDX + H100/H200/B300

CPU + GPU TEE

5–15% overhead

vs bare-metal

No host root

compose-hash IS the policy

AI 解决方案路径

在 AI 触及密钥时使用隐私模型。

隐私模型端点是第一个入口点。同样的隐私原语也适用于代理、数据工作流和训练。

Agents

隐私 AI 代理

在可验证的运行时中运行代理的密钥、工具、记忆和操作,而不是放在可见的自动化云中。

打开解决方案
Training

隐私模型训练

在保持数据集、梯度、检查点和评估轨迹处于边界内的同时,基于专有数据调整模型。

打开解决方案

private training run

Observe without exposing weights.

H100 CC

01

dataset

sealed

02

fine-tune

running

03

eval

private

04

checkpoint

verified

loss curve

proof attached

attestation.json

Data

隐私 AI 数据

将模型移动到敏感记录旁,在不向模型运营方暴露原始数据的情况下返回已批准的输出。

打开解决方案

source

EHR data

source

Customer records

source

Internal docs

TEE clean room

query without raw access

approved output

aggregate only
no row exportproof linked

Deploy private inference

双跳 RA-TLS。签名回执。链上无日志。

直接接入你已在使用的 OpenAI SDK。指向 api.redpill.ai。每次响应都附带签名回执。

View docs联系销售
  • 01OpenAI-compatible base URL
  • 02TDX + H100 / H200 / Blackwell
  • 03Signed receipt per response
  • 04On-chain compose-hash registry
  • 055–15% TEE overhead vs bare-metal