Cerebras
Text
Llama 3.2 1B (Cerebras)
Fastest possible Llama inference on Cerebras hardware.
API Usage
curl -X POST https://api.vincony.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "cerebras/llama-3.2-1b",
    "messages": [
      { "role": "user", "content": "Hello, Llama 3.2 1B (Cerebras)!" }
    ]
  }'
Replace YOUR_API_KEY with your Vincony API key. The endpoint is OpenAI-compatible, so it works with any OpenAI SDK.
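The same request can be made from Python without any third-party SDK. Below is a minimal sketch using only the standard library, assuming the endpoint and payload shown in the curl example above; the actual network call is left commented out so you can drop in a real key first.

```python
import json
import urllib.request

API_KEY = "YOUR_API_KEY"  # replace with your Vincony API key

# Same JSON body as the curl example above.
payload = {
    "model": "cerebras/llama-3.2-1b",
    "messages": [
        {"role": "user", "content": "Hello, Llama 3.2 1B (Cerebras)!"},
    ],
}

# Build the POST request with the bearer-token and JSON headers.
req = urllib.request.Request(
    "https://api.vincony.com/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
)

# Uncomment to send the request and print the assistant's reply
# (requires a valid API key and network access):
# with urllib.request.urlopen(req) as resp:
#     body = json.load(resp)
#     print(body["choices"][0]["message"]["content"])
```

Because the endpoint is OpenAI-compatible, you can equally point an OpenAI client library at the same base URL instead of building requests by hand.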
Compare with Another Model
Frequently Asked Questions
Try Llama 3.2 1B (Cerebras) now
Start using Llama 3.2 1B (Cerebras) instantly — 100 free credits, no credit card required. Access 801+ AI models through one platform.
More from Cerebras
LFM2-24B
Text
Liquid Foundation Model running on Cerebras hardware for fast inference.
LFM 40B
Text
Liquid Foundation Model 40B running at high speed on Cerebras.
Llama 3.1 70B (Cerebras)
Text
Ultra-fast Llama 3.1 70B inference on Cerebras silicon.
Llama 3.3 70B (Cerebras)
Text
High-speed Llama 3.3 70B on Cerebras wafer-scale engine.