Models
ReadyGPU TEE

Phala: Venice Uncensored 24B

Model IDphala/uncensored-24b

Venice Uncensored Dolphin Mistral 24B Venice Edition is a fine-tuned variant of Mistral-Small-24B-Instruct-2501, developed by dphn.ai in collaboration with Venice.ai. This model is designed as an “uncensored” instruct-tuned LLM, preserving user control over alignment, system prompts, and behavior. Intended for advanced and unrestricted use cases, Venice Uncensored emphasizes steerability and transparent behavior, removing default safety and alignment layers typically found in mainstream assistant models.

input

$0.20/M

output

$0.90/M

context

33K

created

Jan 9, 2026

Supported API shape

input

text

output

text

tools

Not listed

json mode

Not listed

Verification

signature

response ID

attestation

GPU TEE

provider

1 routes

Providers

phala

GPU TEE

input

$0.20/M

output

$0.90/M

context

33K

More models

Other private inference routes.

View catalog
encrypted

Qwen: Qwen3.5-27B

The Qwen3.5 27B native vision-language Dense model incorporates a linear attention mechanism, delivering fast response times while balancing inference speed and performance. Its overall capabilities are comparable to those of the Qwen3.5-122B-A10B.

context

262K

input

$0.30/M

encrypted

Z.AI: GLM 4.7 Flash

As a 30B-class SOTA model, GLM-4.7-Flash offers a new option that balances performance and efficiency. It is further optimized for agentic coding use cases, strengthening coding capabilities, long-horizon task planning, and tool collaboration, and has achieved leading performance among open-source models of the same size on several current public benchmark leaderboards.

context

203K

input

$0.10/M

encrypted

Qwen: Qwen3 Embedding 8B

The Qwen3 Embedding model series is the latest proprietary model of the Qwen family, specifically designed for text embedding and ranking tasks. This series inherits the exceptional multilingual capabilities, long-text understanding, and reasoning skills of its foundational model. The Qwen3 Embedding series represents significant advancements in multiple text embedding and ranking tasks, including text retrieval, code retrieval, text classification, text clustering, and bitext mining.

context

33K

input

$0.01/M

encrypted

Qwen: Qwen3 VL 30B A3B Instruct

Qwen3-VL-30B-A3B-Instruct is a multimodal model that unifies strong text generation with visual understanding for images and videos. Its Instruct variant optimizes instruction-following for general multimodal tasks. It excels in perception of real-world/synthetic categories, 2D/3D spatial grounding, and long-form visual comprehension, achieving competitive multimodal benchmark results. For agentic use, it handles multi-image multi-turn instructions, video timeline alignments, GUI automation, and visual coding from sketches to debugged UI. Text performance matches flagship Qwen3 models, suiting document AI, OCR, UI assistance, spatial tasks, and agent research.

context

128K

input

$0.20/M