Supervised fine-tuning
TRL's SFTTrainer or Unsloth. unwrap_dataset() pulls per-owner keys via dstack-guest-agent, so your training loop stays unchanged. Output: a sealed checkpoint plus a signed manifest.
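A minimal sketch of the flow around the trainer, runnable anywhere: base64 decoding stands in for the real per-owner decryption that unwrap_dataset() performs via dstack-guest-agent, and a placeholder blob stands in for the saved checkpoint. The shape is the point: unwrap, run an unchanged loop over plaintext records, then hash the result into a manifest.

```python
import base64
import hashlib
import json

def unwrap_dataset(sealed_records):
    # Hypothetical stand-in: in the real flow, per-owner keys fetched from
    # dstack-guest-agent decrypt each record inside the enclave. Here,
    # base64 stands in for decryption so the example is self-contained.
    return [json.loads(base64.b64decode(r)) for r in sealed_records]

sealed = [
    base64.b64encode(json.dumps({"prompt": "hi", "completion": "hello"}).encode()),
    base64.b64encode(json.dumps({"prompt": "2+2?", "completion": "4"}).encode()),
]

# The training loop itself is unchanged -- it only ever sees plaintext.
records = unwrap_dataset(sealed)
corpus = "".join(r["prompt"] + r["completion"] for r in records)

# Placeholder for the saved model weights; a real run hashes the
# checkpoint file written by the trainer.
checkpoint = corpus.encode()
manifest = {
    "checkpoint_sha256": hashlib.sha256(checkpoint).hexdigest(),
    "num_records": len(records),
}
```

The manifest is what gets signed with the enclave's attested key, binding the sealed checkpoint to the exact records it was trained on.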
Preference / RL alignment
DPOTrainer / IPOTrainer for preference-pair optimization, or full RLHF with a reward model + PPO. Sealed prompts, sealed preference data, attested reward model.
PEFT · LoRA / QLoRA
Hugging Face PEFT. Train low-rank adapters against a frozen base; the LoRA weights are sealed to the compose-hash and merged on an attested re-derive. 4-bit QLoRA for memory-bound runs.
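The merge step on re-derive is just the standard LoRA identity W' = W + (alpha/r) · B·A. A tiny pure-Python sketch with a rank-1 adapter on a 2×2 weight (toy numbers, not a real model):

```python
def matmul(a, b):
    # Naive dense matrix multiply for small illustrative matrices.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def merge_lora(w, a, b, alpha):
    """Fold a rank-r adapter into a frozen base weight: W + (alpha/r) * B @ A."""
    r = len(a)  # A is (r x in), B is (out x r)
    delta = matmul(b, a)
    scale = alpha / r
    return [[w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
            for i in range(len(w))]

W = [[1.0, 0.0], [0.0, 1.0]]  # frozen base weight (2x2 identity here)
A = [[1.0, 2.0]]              # rank-1 down-projection
B = [[0.5], [1.0]]            # rank-1 up-projection
merged = merge_lora(W, A, B, alpha=1.0)
```

Because only A and B are trained, sealing the adapter is cheap, and the full merged weight only ever exists inside the attested environment that re-derives it.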
Continued pre-training
Domain-adapt a base model on sealed token corpora. A streaming dataloader unwraps shards in TDX memory; the run emits a single signed manifest covering token-hashes, hyperparameters, and the final checkpoint.
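A stdlib-only sketch of that manifest pipeline, with assumptions labeled: plain byte strings stand in for sealed shards (the real loader unwraps each one inside TDX memory), a placeholder blob stands in for the final checkpoint, and an HMAC with a demo key stands in for signing with the enclave's attested key.

```python
import hashlib
import hmac
import json

def stream_shards(shards):
    # Stand-in for the streaming dataloader: real shards arrive sealed
    # and are unwrapped one at a time in TDX memory, never all at once.
    for shard in shards:
        yield shard

shards = [b"domain text shard 0", b"domain text shard 1"]
hyperparams = {"lr": 2e-5, "seq_len": 4096}

# Hash each shard as it streams past, so the manifest commits to the
# exact token data the run consumed.
token_hashes = [hashlib.sha256(s).hexdigest() for s in stream_shards(shards)]

checkpoint = b"final-weights-placeholder"
manifest = json.dumps({
    "token_hashes": token_hashes,
    "hyperparameters": hyperparams,
    "checkpoint_sha256": hashlib.sha256(checkpoint).hexdigest(),
}, sort_keys=True).encode()

# Hypothetical signing step: HMAC with a demo key in place of the
# enclave's attested signing key.
signature = hmac.new(b"demo-key", manifest, hashlib.sha256).hexdigest()
```

One signature over one document ties data, config, and weights together, so a verifier can check the whole run without seeing any of the sealed inputs.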