
We're excited to announce that Phala's confidential AI infrastructure is now integrated into the OLLM Confidential AI Gateway. Developers can now access hardware-secured AI models with cryptographic privacy guarantees through a simple API call.
Private AI Cloud Built on Trusted Hardware
Phala is a private AI cloud that runs workloads inside hardware Trusted Execution Environments (TEEs) built on Intel TDX, AMD SEV, and Nvidia H100/H200 GPUs. This confidential AI cloud architecture ensures that code, data, and model weights remain encrypted during execution, protecting sensitive information from unauthorized access, including by the cloud provider itself.
Unlike traditional AI platforms that rely on policies and promises, Phala delivers verifiable privacy. Every inference workload generates a cryptographic attestation proving it ran on genuine TEE hardware. Users can verify these proofs through Phala's Trust Center, ensuring their data remained confidential throughout processing.
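In outline, attestation verification amounts to checking that the measurement reported by the TEE matches a known-good value for the workload. The sketch below illustrates that check with hypothetical field names and values; Phala's actual Trust Center report format and verification flow are not shown here.

```python
import hashlib
import hmac

def verify_measurement(report: dict, expected_measurement: str) -> bool:
    """Check a (hypothetical) attestation report against a known-good
    measurement. Real TEE attestation additionally verifies the hardware
    vendor's signature chain over the report, which is omitted here."""
    actual = report.get("measurement", "")
    # Constant-time comparison avoids leaking how many leading
    # characters of the measurement matched.
    return hmac.compare_digest(actual, expected_measurement)

# Illustrative values only: the expected measurement would normally be
# published alongside the model or workload image.
expected = hashlib.sha384(b"model-image-v1").hexdigest()
report = {"measurement": expected, "tee": "intel-tdx"}

print(verify_measurement(report, expected))
```

A report with any other measurement fails the same check, which is what lets a user reject a workload that was tampered with before launch.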
Production-Ready Confidential AI
The OLLM partnership makes Phala’s confidential computing infrastructure accessible to any developer building privacy-critical AI applications. Through OLLM's gateway, teams can now run frontier models on Nvidia H200 GPUs with Intel TDX and AMD SEV protection—with only 0.5% to 5% performance overhead.
Real-world use cases include:
- Financial services: Process sensitive transaction data with internal AI copilots
- Healthcare: Analyze patient records while maintaining HIPAA compliance
- Web3: Run on-chain analytics without exposing user identities
- Enterprise: Deploy AI agents that handle proprietary data confidentially
Phala has already processed over 1.34 billion LLM tokens in a single day, demonstrating production-scale readiness for confidential AI workloads.
Why Hardware Security Matters
Traditional cloud AI services process your data in plaintext, requiring you to trust the provider. Phala’s private AI cloud architecture eliminates this trust requirement:
- Intel TDX and AMD SEV create isolated, encrypted environments for model execution
- Nvidia H200 GPUs provide high-performance confidential computing for large language models
- Cryptographic attestation proves workloads ran inside genuine TEE hardware
- Data remains encrypted from input to output, with no plaintext exposure
"Enterprises want the benefits of modern AI, but they cannot compromise on data confidentiality or control," says Ahmad Shadid, CEO of OLLM. "We aggregate Phala's confidential AI cloud into the OLLM Confidential AI Gateway, giving builders a simple way to tap into hardware-secured models with verifiable privacy."
Get Started with Confidential AI
Phala makes confidential computing accessible without infrastructure complexity. Simply call the OLLM API, select a Phala-secured model, and receive cryptographically verifiable private inference results. No complex setup. Negligible performance overhead. No blind trust required.
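As a sketch, a call through a gateway like this typically takes the shape of an OpenAI-style chat-completion request. The endpoint URL, model name, and header below are illustrative placeholders, not OLLM's documented API; consult OLLM's documentation for the real values.

```python
import json

def build_inference_request(api_key: str, model: str, prompt: str):
    """Assemble an OpenAI-style chat-completion request. The URL and
    model slug are hypothetical placeholders for illustration."""
    url = "https://api.ollm.example/v1/chat/completions"  # placeholder
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,  # e.g. a Phala-secured model slug
        "messages": [{"role": "user", "content": prompt}],
    })
    return url, headers, body

url, headers, body = build_inference_request(
    "sk-demo", "phala/confidential-llm", "Summarize this contract."
)
# The request can then be sent with any HTTP client, e.g.
# requests.post(url, headers=headers, data=body).
```

From the caller's perspective the only confidential-computing-specific step is choosing a TEE-backed model and, optionally, verifying the attestation attached to the response.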
Visit OLLM.com to access Phala’s private AI cloud through the Confidential AI Gateway, or explore Phala.com to learn more about our Intel TDX, AMD SEV, and Nvidia H200-powered confidential computing infrastructure.
About OLLM
OLLM is a privacy-first AI gateway offering seamless access to hundreds of AI models. The platform democratizes enterprise-grade security by giving users the choice: deploy on standard infrastructure for speed and flexibility, or on confidential computing chips for hardware-encrypted, zero-knowledge security. OLLM is dedicated to building a future where AI is accessible without compromising on security or control.
About Phala
Phala is a zero-trust cloud for AI, providing confidential and verifiable compute for models and agents. Phala runs AI workloads inside hardware Trusted Execution Environments (TEEs), ensuring code, data, and keys remain private while producing cryptographically verifiable outputs. By combining remote attestation, on-chain proofs, and a developer-friendly cloud platform, Phala turns “trust me” into “prove it,” enabling enterprises and developers to deploy secure, privacy-preserving AI at scale.