API Reference
Access 40+ AI models through a single, OpenAI-compatible API. Chat completions, image generation, video, TTS, search, and data extraction — all on one key.
Base URL: https://api.vincony.com/v1
Why Use the Vincony API?
One integration, every AI model — with the controls you need to ship confidently.
One Key, 40+ Models
Access every major AI provider through a single API key and billing system.
OpenAI-Compatible
Drop-in replacement that works with any OpenAI SDK — switch models by changing a string.
Built-in Cost Control
Per-key credit budgets, real-time usage analytics, and overspend alerts.
Popular Use Cases
SaaS Products
Add AI chat, summarization, or content generation to your app without managing multiple provider accounts.
Internal Tools
Build Slack bots, email drafters, or data analyzers powered by the best model for each task.
Content Pipelines
Automate blog posts, social media copy, and product descriptions at scale with batch generation.
Code Assistants
Integrate code generation, review, and debugging into CI/CD pipelines or IDE extensions.
Customer Support Bots
Deploy AI chatbots on any website using the chatbot embed endpoint and a single script tag.
Research & Analysis
Run queries across multiple models simultaneously for comparison and consensus.
Quickstart
Get up and running in under 2 minutes:
Get your API key
Sign up and generate an API key from the Developer Portal. Keys start with vncy_.
Make your first request
The API is OpenAI-compatible — use any HTTP client or the OpenAI SDK:
```bash
curl -X POST https://api.vincony.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer vncy_YOUR_API_KEY" \
  -d '{"model":"openai/gpt-5-mini","messages":[{"role":"user","content":"Say hello!"}]}'
```
Parse the response
The response follows the OpenAI Chat Completions format:
```json
{
  "choices": [
    { "message": { "content": "Hello! How can I help you?" } }
  ]
}
```
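In any client language, the reply text lives at `choices[0].message.content`. A minimal Python helper for pulling it out of the parsed JSON:

```python
def first_message(response: dict) -> str:
    """Extract the assistant's reply from a Chat Completions response."""
    return response["choices"][0]["message"]["content"]
```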
API Overview
| Property | Value |
|---|---|
| Base URL | https://api.vincony.com/v1 |
| Native Gateway URL | https://api.vincony.com/api-gateway |
| Content-Type | application/json |
| Authentication | Bearer vncy_... |
| Response Format | JSON (except TTS, which returns audio/mpeg) |
| API Version | v1 (stable, no breaking changes planned) |
Authentication
All API requests require a Bearer token in the Authorization header. API keys start with the vncy_ prefix.
Authorization: Bearer vncy_YOUR_API_KEY
Generate keys from the Developer Portal.
Each key has configurable scopes that limit which endpoints it can access:
| Scope | Grants Access To |
|---|---|
| chat | Chat Completions (/v1/chat/completions) |
| image | Image Generation (generate-image-aiml) |
| video | Video Generation (generate-video) |
| tts | Text-to-Speech (text-to-speech) |
| search | AI Search (ai-search) |
| extract | Data Extraction (extract-data) |
Endpoints
The API provides two interfaces: an OpenAI-compatible endpoint at /v1/chat/completions and a native gateway at /api-gateway for image, video, TTS, search, and extraction.
| Endpoint | Method | Path | Scope | Cost |
|---|---|---|---|---|
| Chat Completions | POST | /v1/chat/completions | chat | 1–10 credits (varies by model) |
| List Models | GET | /v1/models | — | 0 credits |
| Image Generation | POST | /api-gateway | image | 3–5 credits (varies by model) |
| Video Generation | POST | /api-gateway | video | 15–200 credits (varies by model) |
| Text-to-Speech | POST | /api-gateway | tts | 2 credits |
| AI Search | POST | /api-gateway | search | 3 credits |
| Data Extraction | POST | /api-gateway | extract | 2 credits |
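Since listing models costs 0 credits, it is a cheap way to check availability before sending a paid request. A minimal Python sketch using only the standard library; it assumes the response follows the OpenAI list-models schema (`{"data": [{"id": ...}, ...]}`), which is consistent with the endpoint being OpenAI-compatible but not confirmed above:

```python
import json
from urllib import request

API_KEY = "vncy_YOUR_API_KEY"  # placeholder key

def extract_model_ids(models_response: dict) -> list[str]:
    """Pull model IDs out of an OpenAI-style list-models payload.

    Assumed shape: {"data": [{"id": "openai/gpt-5"}, ...]}.
    """
    return [m["id"] for m in models_response.get("data", [])]

def list_models() -> list[str]:
    """GET /v1/models — 0 credits per the endpoint table."""
    req = request.Request(
        "https://api.vincony.com/v1/models",
        headers={"Authorization": f"Bearer {API_KEY}"},
    )
    with request.urlopen(req) as resp:
        return extract_model_ids(json.load(resp))
```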
Models & Pricing
| Model ID | Provider | Category | Credits / req |
|---|---|---|---|
| google/gemini-3-flash-preview | Google | General | 1 |
| google/gemini-3-pro-preview | Google | General | 4 |
| google/gemini-2.5-flash | Google | General | 1 |
| google/gemini-2.5-flash-lite | Google | General | 1 |
| google/gemini-2.5-pro | Google | General | 3 |
| google/gemini-2.5-flash-image | Google | General | 2 |
| google/gemini-3-pro-image-preview | Google | General | 4 |
| openai/gpt-5 | OpenAI | General | 3 |
| openai/gpt-5-mini | OpenAI | General | 2 |
| openai/gpt-5-nano | OpenAI | General | 1 |
| openai/gpt-5.2 | OpenAI | General | 5 |
| openai/o4-mini | OpenAI | General | 2 |
| anthropic/claude-sonnet-4.5 | Anthropic | General | 3 |
| anthropic/claude-sonnet-4 | Anthropic | General | 3 |
| anthropic/claude-opus-4.5 | Anthropic | General | 8 |
| anthropic/claude-opus-4 | Anthropic | General | 10 |
| x-ai/grok-4 | xAI | General | 3 |
| deepseek/deepseek-v3.2 | DeepSeek | General | 1 |
| meta/llama-4-maverick | Meta | General | 2 |
| mistral/mistral-large | Mistral | General | 2 |
| openai/gpt-5.2-codex | OpenAI | Code | 4 |
| openai/gpt-5.1-codex-max | OpenAI | Code | 4 |
| openai/gpt-5-codex | OpenAI | Code | 3 |
| openai/gpt-5.1-codex | OpenAI | Code | 3 |
| openai/gpt-5.1-codex-mini | OpenAI | Code | 1 |
| openai/codex-mini | OpenAI | Code | 1 |
| mistral/devstral-2 | Mistral | Code | 3 |
| mistral/devstral-small-2 | Mistral | Code | 1 |
| mistral/devstral-small | Mistral | Code | 1 |
| mistral/codestral | Mistral | Code | 2 |
| alibaba/qwen3-coder-plus | Alibaba | Code | 3 |
| alibaba/qwen3-coder | Alibaba | Code | 2 |
| openai/o3-pro | OpenAI | Reasoning | 10 |
| openai/o3 | OpenAI | Reasoning | 4 |
| openai/o3-mini | OpenAI | Reasoning | 2 |
| openai/o1 | OpenAI | Reasoning | 7 |
| openai/gpt-5.1-thinking | OpenAI | Reasoning | 4 |
| deepseek/deepseek-r1 | DeepSeek | Reasoning | 4 |
| mistral/magistral-medium | Mistral | Reasoning | 2 |
| mistral/magistral-small | Mistral | Reasoning | 1 |
| x-ai/grok-4.1-fast-r | xAI | Reasoning | 2 |
| alibaba/qwen3-max-thinking | Alibaba | Reasoning | 5 |
Rate Limits
| Plan | Requests / Minute | Requests / Day |
|---|---|---|
| Power | 60 | 5,000 |
| Business | 120 | 20,000 |
Rate limit status is returned in response headers:
- X-RateLimit-Remaining-RPM — remaining requests this minute
- X-RateLimit-Remaining-RPD — remaining requests today
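Clients can read these headers to throttle proactively instead of waiting for a 429. A small sketch (header names are case-insensitive in HTTP, so the helper normalizes them first):

```python
def remaining_quota(headers: dict) -> tuple[int, int]:
    """Return (remaining RPM, remaining RPD) from response headers.

    Returns -1 for a limit whose header is absent.
    """
    h = {k.lower(): v for k, v in headers.items()}
    rpm = int(h.get("x-ratelimit-remaining-rpm", -1))
    rpd = int(h.get("x-ratelimit-remaining-rpd", -1))
    return rpm, rpd
```

For example, pausing when RPM drops near zero keeps bursty workloads under the per-minute cap.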
Per-Key Credit Budgets
Each API key can optionally have a monthly credit budget. When the budget is exhausted, all requests return 429 until the budget resets at the start of the next calendar month.
At 80% consumption, a webhook alert is fired to the key's configured budget_webhook_url. Configure budgets in the Developer Portal.
Error Codes
| Code | Status | Description |
|---|---|---|
| 400 | Bad Request | Invalid JSON body, missing required fields, or malformed parameters. |
| 401 | Unauthorized | Missing or invalid API key. Ensure your Authorization header uses a valid vncy_ key. |
| 402 | Payment Required | Insufficient credits. Upgrade your plan or purchase a top-up. |
| 403 | Forbidden | API key revoked, insufficient scope for the requested endpoint, or plan upgrade required. |
| 429 | Rate Limited | Exceeded RPM or RPD limits, or monthly per-key credit budget exhausted. Check the error message for details. |
| 500 | Internal Server Error | Unexpected server error. Contact support if this persists. |
| 502 | Bad Gateway | Upstream AI provider is temporarily unavailable. Retry with exponential back-off. |
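The 502 row above recommends retrying with exponential back-off; 429 and 500 are also reasonable retry candidates. A minimal sketch (note that a 429 caused by an exhausted monthly budget only resets next month, so check the error message before retrying those):

```python
import random
import time

RETRYABLE = {429, 500, 502}

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential back-off with jitter: ~1s, ~2s, ~4s, ... capped at 30s."""
    return min(cap, base * 2 ** attempt) * (0.5 + random.random() / 2)

def call_with_retries(send, max_attempts: int = 5):
    """Call `send()` (which returns (status, body)) until it succeeds.

    Retries only the transient statuses listed in RETRYABLE.
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status not in RETRYABLE:
            return status, body
        time.sleep(backoff_delay(attempt))
    return status, body  # give up, surface the last response
```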
OpenAI-Compatible Error Format
Returned by /v1/chat/completions:
```json
{
  "error": {
    "message": "Rate limit exceeded (60 requests/minute).",
    "type": "rate_limit_error",
    "code": "rate_limit_exceeded"
  }
}
```
Native Gateway Error Format
Returned by /api-gateway:
```json
{ "error": "Rate limit exceeded (60 requests/minute). Please slow down." }
```

A 402 status indicates credit or budget exhaustion, distinct from 429 rate limiting; check the error message to tell the two apart.

Webhooks
Receive real-time HTTP POST notifications when events occur in your account. Configure webhook endpoints from the Webhooks settings.
Setup
- Add an HTTPS endpoint URL in the Webhooks settings page.
- Select which events to subscribe to.
- A signing secret is generated automatically — use it to verify payloads.
Event Types
| Event | Description |
|---|---|
| generation_complete | Fired when any content generation finishes (text, image, audio, video, 3D). |
| credit_threshold | Fired when your credit balance drops below a warning threshold. |
| scheduled_complete | Fired when a scheduled generation job completes. |
| batch_complete | Fired when a batch generation run finishes. |
| api.request.completed | Fired after every successful API gateway request (includes endpoint, latency, credits). |
Payload Format
Every webhook delivery is a POST request with these headers:
- Content-Type: application/json
- X-Webhook-Event — the event name
- X-Webhook-Signature — HMAC-SHA256 hex digest of the body
```json
{
  "event": "generation_complete",
  "timestamp": "2026-02-24T12:00:00.000Z",
  "data": {
    "model_id": "openai/gpt-5",
    "model_category": "text",
    "prompt_preview": "Write a blog post about...",
    "credits_consumed": 3
  }
}
```
Verifying Signatures (HMAC-SHA256)
```js
import crypto from "crypto";

function verifyWebhook(body, secret, signature) {
  const expected = crypto
    .createHmac("sha256", secret)
    .update(body)
    .digest("hex");
  return crypto.timingSafeEqual(
    Buffer.from(expected),
    Buffer.from(signature)
  );
}
```
Webhooks time out after 10 seconds. Return a 2xx status to acknowledge receipt. Failed deliveries are not retried.
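The same verification in Python, for receivers not running on Node: recompute the HMAC-SHA256 hex digest of the raw request body and compare it to the X-Webhook-Signature header in constant time.

```python
import hashlib
import hmac

def verify_webhook(body: bytes, secret: str, signature: str) -> bool:
    """Return True if `signature` matches the HMAC-SHA256 hex digest of `body`.

    Uses hmac.compare_digest to avoid timing side channels, mirroring
    crypto.timingSafeEqual in the Node example.
    """
    expected = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)
```

Verify against the raw bytes as received; re-serializing the parsed JSON can change whitespace and break the digest.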
Streaming (SSE)
Set "stream": true in your chat completions request to receive Server-Sent Events. Each event contains a JSON chunk with a delta object. The stream terminates with data: [DONE].
```bash
curl -N -X POST https://api.vincony.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer vncy_YOUR_API_KEY" \
  -d '{
    "model": "openai/gpt-5",
    "messages": [
      { "role": "user", "content": "Write a haiku about AI." }
    ],
    "stream": true
  }'
```
```text
# Each SSE event is a JSON object prefixed with "data: "
# The stream ends with "data: [DONE]"
#
# data: {"id":"chatcmpl-abc","choices":[{"delta":{"content":"Silicon"}}]}
# data: {"id":"chatcmpl-abc","choices":[{"delta":{"content":" dreams"}}]}
# ...
# data: [DONE]
```
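Those delta chunks can be stitched back together on the client. A minimal Python sketch covering only the framing shown above; a production client should use a proper SSE library:

```python
import json

def accumulate_sse(lines) -> str:
    """Rebuild the full completion text from raw SSE lines.

    Each data line carries a chunk whose delta.content holds incremental
    text; the stream ends with the literal "[DONE]" sentinel.
    """
    parts = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data: "):
            continue  # skip blank separators and keep-alive comments
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        if "content" in delta:
            parts.append(delta["content"])
    return "".join(parts)
```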
Each event line is data: followed by JSON. The delta.content field contains the incremental text; accumulate these to build the full response.

Code Examples
```bash
curl -X POST https://api.vincony.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer vncy_YOUR_API_KEY" \
  -d '{
    "model": "openai/gpt-5",
    "messages": [
      { "role": "user", "content": "Explain quantum computing in one sentence." }
    ],
    "temperature": 0.7,
    "max_tokens": 256
  }'
```
SDKs & Compatibility
The /v1/chat/completions endpoint is fully compatible with the OpenAI SDK. Simply point the base URL to Vincony:
```python
from openai import OpenAI

client = OpenAI(api_key="vncy_...", base_url="https://api.vincony.com/v1")

chat = client.chat.completions.create(
    model="openai/gpt-5",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(chat.choices[0].message.content)
```
Compatible with: OpenAI Python SDK, OpenAI Node.js SDK, LangChain, LlamaIndex, Vercel AI SDK, and any tool targeting the OpenAI Chat Completions schema.
The native gateway endpoints (/api-gateway) use a Vincony-specific format and are accessed directly via HTTP.
API Changelog
The Vincony API is currently v1 (stable).

- All content generations (chat, image, video, audio, 3D) now fire the generation_complete webhook event.
- about 1 month ago: Full endpoint reference with response schemas, multi-language examples, and quickstart guide.
- about 1 month ago: New video generation endpoint with text-to-video and image-to-video models.
- about 1 month ago: API keys can now have monthly credit budgets with 80% threshold webhook alerts.
- about 2 months ago: Chat completions, image generation, TTS, AI search, and data extraction endpoints.
about 2 months agoVersioning policy: URL-based versioning (/v1/). New version only for breaking changes. Additive changes ship without a version bump.
Deprecation notices: Deprecated features include a Sunset response header with the removal date.
Ready to integrate?
Get your API key and start building in minutes.