
API Reference

Access 40+ AI models through a single, OpenAI-compatible API. Chat completions, image generation, video, TTS, search, and data extraction — all on one key.

Base URL: https://api.vincony.com/v1

Why Use the Vincony API?

One integration, every AI model — with the controls you need to ship confidently.

One Key, 40+ Models

Access every major AI provider through a single API key and billing system.

OpenAI-Compatible

Drop-in replacement that works with any OpenAI SDK — switch models by changing a string.

Built-in Cost Control

Per-key credit budgets, real-time usage analytics, and overspend alerts.

Popular Use Cases

SaaS Products

Add AI chat, summarization, or content generation to your app without managing multiple provider accounts.

Internal Tools

Build Slack bots, email drafters, or data analyzers powered by the best model for each task.

Content Pipelines

Automate blog posts, social media copy, and product descriptions at scale with batch generation.

Code Assistants

Integrate code generation, review, and debugging into CI/CD pipelines or IDE extensions.

Customer Support Bots

Deploy AI chatbots on any website using the chatbot embed endpoint and a single script tag.

Research & Analysis

Run queries across multiple models simultaneously for comparison and consensus.

Quickstart

Get up and running in under 2 minutes:

1

Get your API key

Sign up and generate an API key from the Developer Portal. Keys start with vncy_.

2

Make your first request

The API is OpenAI-compatible — use any HTTP client or the OpenAI SDK:

curl -X POST https://api.vincony.com/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer vncy_YOUR_API_KEY" \
-d '{"model":"openai/gpt-5-mini","messages":[{"role":"user","content":"Say hello!"}]}'
3

Parse the response

The response follows the OpenAI Chat Completions format:

{
"choices": [{ "message": { "content": "Hello! How can I help you?" } }]
}
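If you are not using an SDK, the request and parsing can be sketched with Python's standard library alone. The key and model below are placeholders, and `extract_content`/`chat` are hypothetical helpers, not part of the API:

```python
import json
import urllib.request

API_KEY = "vncy_YOUR_API_KEY"  # placeholder -- use your real key
URL = "https://api.vincony.com/v1/chat/completions"

def extract_content(response: dict) -> str:
    """Pull the assistant text out of an OpenAI-style completion."""
    return response["choices"][0]["message"]["content"]

def chat(prompt: str, model: str = "openai/gpt-5-mini") -> str:
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    req = urllib.request.Request(
        URL,
        data=body,  # presence of a body makes this a POST
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return extract_content(json.load(resp))
```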

API Overview

Base URL: https://api.vincony.com/v1
Native Gateway URL: https://api.vincony.com/api-gateway
Content-Type: application/json
Authentication: Bearer vncy_...
Response Format: JSON (except TTS, which returns audio/mpeg)
API Version: v1 (stable, no breaking changes planned)

Authentication

All API requests require a Bearer token in the Authorization header. API keys start with the vncy_ prefix.

Authorization: Bearer vncy_YOUR_API_KEY

Generate keys from the Developer Portal.

Each key has configurable scopes that limit which endpoints it can access:

Scope    Grants Access To
chat     Chat Completions (/v1/chat/completions)
image    Image Generation (generate-image-aiml)
video    Video Generation (generate-video)
tts      Text-to-Speech (text-to-speech)
search   AI Search (ai-search)
extract  Data Extraction (extract-data)
Key security: Keys are hashed server-side with SHA-256. The plaintext key is shown only once at creation. If compromised, revoke it immediately from the Developer Portal.

Endpoints

The API provides two interfaces: an OpenAI-compatible endpoint at /v1/chat/completions and a native gateway at /api-gateway for image, video, TTS, search, and extraction.

Endpoint          Method  Path                  Scope    Cost
Chat Completions  POST    /v1/chat/completions  chat     1–10 credits (varies by model)
List Models       GET     /v1/models            (none)   0 credits
Image Generation  POST    /api-gateway          image    3–5 credits (varies by model)
Video Generation  POST    /api-gateway          video    15–200 credits (varies by model)
Text-to-Speech    POST    /api-gateway          tts      2 credits
AI Search         POST    /api-gateway          search   3 credits
Data Extraction   POST    /api-gateway          extract  2 credits
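Because /v1/models is OpenAI-compatible, its response presumably follows the OpenAI list shape (`{"data": [{"id": ...}, ...]}`). A sketch under that assumption, with `model_ids` and `list_models` as hypothetical helpers and the key as a placeholder:

```python
import json
import urllib.request

BASE_URL = "https://api.vincony.com/v1"
API_KEY = "vncy_YOUR_API_KEY"  # placeholder

def model_ids(payload: dict) -> list:
    """Extract model IDs from an OpenAI-style model list payload."""
    return [m["id"] for m in payload.get("data", [])]

def list_models() -> list:
    """Fetch the model catalog (0 credits per the endpoint table)."""
    req = urllib.request.Request(
        f"{BASE_URL}/models",
        headers={"Authorization": f"Bearer {API_KEY}"},
    )
    with urllib.request.urlopen(req) as resp:
        return model_ids(json.load(resp))
```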

Models & Pricing

Model ID                           Provider  Category   Credits / req
google/gemini-3-flash-preview      Google    General    1
google/gemini-3-pro-preview        Google    General    4
google/gemini-2.5-flash            Google    General    1
google/gemini-2.5-flash-lite       Google    General    1
google/gemini-2.5-pro              Google    General    3
google/gemini-2.5-flash-image      Google    General    2
google/gemini-3-pro-image-preview  Google    General    4
openai/gpt-5                       OpenAI    General    3
openai/gpt-5-mini                  OpenAI    General    2
openai/gpt-5-nano                  OpenAI    General    1
openai/gpt-5.2                     OpenAI    General    5
openai/o4-mini                     OpenAI    General    2
anthropic/claude-sonnet-4.5        Anthropic General    3
anthropic/claude-sonnet-4          Anthropic General    3
anthropic/claude-opus-4.5          Anthropic General    8
anthropic/claude-opus-4            Anthropic General    10
x-ai/grok-4                        xAI       General    3
deepseek/deepseek-v3.2             DeepSeek  General    1
meta/llama-4-maverick              Meta      General    2
mistral/mistral-large              Mistral   General    2
openai/gpt-5.2-codex               OpenAI    Code       4
openai/gpt-5.1-codex-max           OpenAI    Code       4
openai/gpt-5-codex                 OpenAI    Code       3
openai/gpt-5.1-codex               OpenAI    Code       3
openai/gpt-5.1-codex-mini          OpenAI    Code       1
openai/codex-mini                  OpenAI    Code       1
mistral/devstral-2                 Mistral   Code       3
mistral/devstral-small-2           Mistral   Code       1
mistral/devstral-small             Mistral   Code       1
mistral/codestral                  Mistral   Code       2
alibaba/qwen3-coder-plus           Alibaba   Code       3
alibaba/qwen3-coder                Alibaba   Code       2
openai/o3-pro                      OpenAI    Reasoning  10
openai/o3                          OpenAI    Reasoning  4
openai/o3-mini                     OpenAI    Reasoning  2
openai/o1                          OpenAI    Reasoning  7
openai/gpt-5.1-thinking            OpenAI    Reasoning  4
deepseek/deepseek-r1               DeepSeek  Reasoning  4
mistral/magistral-medium           Mistral   Reasoning  2
mistral/magistral-small            Mistral   Reasoning  1
x-ai/grok-4.1-fast-r               xAI       Reasoning  2
alibaba/qwen3-max-thinking         Alibaba   Reasoning  5

Rate Limits

Plan      Requests / Minute  Requests / Day
Power     60                 5,000
Business  120                20,000

Rate limit status is returned in response headers:

  • X-RateLimit-Remaining-RPM — remaining requests this minute
  • X-RateLimit-Remaining-RPD — remaining requests today
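One way to respect these headers client-side is to pause when the per-minute allowance is nearly exhausted. A minimal sketch, where `throttle` is a hypothetical helper:

```python
def throttle(headers: dict, floor: int = 1, pause: float = 60.0) -> float:
    """Seconds to wait before the next request, from X-RateLimit-Remaining-RPM.

    When fewer than `floor` requests remain in the current minute, wait out
    the rest of the window (`pause` seconds) instead of provoking a 429.
    """
    remaining = int(headers.get("X-RateLimit-Remaining-RPM", floor))
    return pause if remaining < floor else 0.0

# After each response:
#     time.sleep(throttle(dict(resp.headers)))
```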

Per-Key Credit Budgets

Each API key can optionally have a monthly credit budget. When the budget is exhausted, all requests return 429 until the budget resets at the start of the next calendar month.

At 80% consumption, a webhook alert is fired to the key's configured budget_webhook_url. Configure budgets in the Developer Portal.

Error Codes

Code  Status                 Description
400   Bad Request            Invalid JSON body, missing required fields, or malformed parameters.
401   Unauthorized           Missing or invalid API key. Ensure your Authorization header uses a valid vncy_ key.
402   Payment Required       Insufficient credits. Upgrade your plan or purchase a top-up.
403   Forbidden              API key revoked, insufficient scope for the requested endpoint, or plan upgrade required.
429   Rate Limited           Exceeded RPM or RPD limits, or monthly per-key credit budget exhausted. Check the error message for details.
500   Internal Server Error  Unexpected server error. Contact support if this persists.
502   Bad Gateway            Upstream AI provider is temporarily unavailable. Retry with exponential back-off.
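For the retryable statuses (429 from rate limiting, 502 from upstream outages), a back-off wrapper like the following sketch works with any HTTP client; `with_retries` is a hypothetical helper. Note that a 429 caused by budget exhaustion will not clear until the budget resets, so keep attempts bounded:

```python
import random
import time

RETRYABLE = {429, 502}  # worth retrying: rate limits and upstream outages

def with_retries(call, max_attempts: int = 5, base_delay: float = 1.0):
    """Run call() -> (status, body), retrying 429/502 with exponential back-off."""
    for attempt in range(max_attempts):
        status, body = call()
        if status not in RETRYABLE or attempt == max_attempts - 1:
            return status, body
        # 1x, 2x, 4x ... of base_delay, plus jitter to spread retries out
        time.sleep(base_delay * (2 ** attempt + random.random()))
```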

OpenAI-Compatible Error Format

Returned by /v1/chat/completions:

{
  "error": {
    "message": "Rate limit exceeded (60 requests/minute).",
    "type": "rate_limit_error",
    "code": "rate_limit_exceeded"
  }
}

Native Gateway Error Format

Returned by /api-gateway:

{
  "error": "Rate limit exceeded (60 requests/minute). Please slow down."
}
Note: A 402 status indicates credit/budget exhaustion, distinct from 429 rate limiting. Check the error message to distinguish between the two.

Webhooks

Receive real-time HTTP POST notifications when events occur in your account. Configure webhook endpoints from the Webhooks settings.

Setup

  1. Add an HTTPS endpoint URL in the Webhooks settings page.
  2. Select which events to subscribe to.
  3. A signing secret is generated automatically — use it to verify payloads.

Event Types

Event                  Description
generation_complete    Fired when any content generation finishes (text, image, audio, video, 3D).
credit_threshold       Fired when your credit balance drops below a warning threshold.
scheduled_complete     Fired when a scheduled generation job completes.
batch_complete         Fired when a batch generation run finishes.
api.request.completed  Fired after every successful API gateway request (includes endpoint, latency, credits).

Payload Format

Every webhook delivery is a POST request with these headers:

  • Content-Type: application/json
  • X-Webhook-Event — the event name
  • X-Webhook-Signature — HMAC-SHA256 hex digest of the body
{
  "event": "generation_complete",
  "timestamp": "2026-02-24T12:00:00.000Z",
  "data": {
    "model_id": "openai/gpt-5",
    "model_category": "text",
    "prompt_preview": "Write a blog post about...",
    "credits_consumed": 3
  }
}

Verifying Signatures (HMAC-SHA256)

import crypto from "crypto";

function verifyWebhook(body, secret, signature) {
  const expected = crypto
    .createHmac("sha256", secret)
    .update(body)
    .digest("hex");
  const exp = Buffer.from(expected);
  const sig = Buffer.from(signature);
  // timingSafeEqual throws if the buffers differ in length, so guard first
  if (exp.length !== sig.length) return false;
  return crypto.timingSafeEqual(exp, sig);
}

Webhooks time out after 10 seconds. Return a 2xx status to acknowledge receipt. Failed deliveries are not retried.
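For Python services, the same signature check can be sketched with the standard library. `verify_webhook` is a hypothetical helper; `body` must be the raw request bytes, exactly as delivered, before any JSON parsing:

```python
import hashlib
import hmac

def verify_webhook(body: bytes, secret: str, signature: str) -> bool:
    """Check X-Webhook-Signature: HMAC-SHA256 hex digest of the raw body."""
    expected = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking the match position via timing
    return hmac.compare_digest(expected, signature)
```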

Streaming (SSE)

Set "stream": true in your chat completions request to receive Server-Sent Events. Each event contains a JSON chunk with a delta object. The stream terminates with data: [DONE].

curl -N -X POST https://api.vincony.com/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer vncy_YOUR_API_KEY" \
-d '{
"model": "openai/gpt-5",
"messages": [
{ "role": "user", "content": "Write a haiku about AI." }
],
"stream": true
}'
# Each SSE event is a JSON object prefixed with "data: "
# The stream ends with "data: [DONE]"
#
# data: {"id":"chatcmpl-abc","choices":[{"delta":{"content":"Silicon"}}]}
# data: {"id":"chatcmpl-abc","choices":[{"delta":{"content":" dreams"}}]}
# ...
# data: [DONE]
Stream chunk format: Each SSE line is prefixed with data: followed by JSON. The delta.content field contains the incremental text. Accumulate these to build the full response.
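Accumulating the delta.content fields can be sketched as a small parser over the raw SSE lines; `accumulate_sse` is a hypothetical helper, and the chunk shape is the one shown above:

```python
import json

def accumulate_sse(lines) -> str:
    """Rebuild the full completion text from "data: ..." SSE lines."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        delta = json.loads(payload)["choices"][0].get("delta", {})
        if "content" in delta:
            parts.append(delta["content"])
    return "".join(parts)
```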

Code Examples

curl -X POST https://api.vincony.com/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer vncy_YOUR_API_KEY" \
-d '{
"model": "openai/gpt-5",
"messages": [
{ "role": "user", "content": "Explain quantum computing in one sentence." }
],
"temperature": 0.7,
"max_tokens": 256
}'

SDKs & Compatibility

The /v1/chat/completions endpoint is fully compatible with the OpenAI SDK. Simply point the base URL to Vincony:

from openai import OpenAI

client = OpenAI(api_key="vncy_...", base_url="https://api.vincony.com/v1")

chat = client.chat.completions.create(
    model="openai/gpt-5",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(chat.choices[0].message.content)

Compatible with: OpenAI Python SDK, OpenAI Node.js SDK, LangChain, LlamaIndex, Vercel AI SDK, and any tool targeting the OpenAI Chat Completions schema.

The native gateway endpoints (/api-gateway) use a Vincony-specific format and are accessed directly via HTTP.

API Changelog

The Vincony API is currently v1. Breaking changes are avoided whenever possible. When they do occur, we provide at least 30 days' notice via email and a deprecation header.

2026-02-24
Added
Automatic generation webhooks

All content generations (chat, image, video, audio, 3D) now fire the generation_complete webhook event.

2026-02-24
Added
Complete API documentation

Full endpoint reference with response schemas, multi-language examples, and quickstart guide.

2026-02-20
Added
Video generation (Veo 3.1, Kling v3)

New video generation endpoint with text-to-video and image-to-video models.

2026-02-15
Added
Per-key credit budgets & budget webhooks

API keys can now have monthly credit budgets with 80% threshold webhook alerts.

2026-02-01
Launch
API v1 public release

Chat completions, image generation, TTS, AI search, and data extraction endpoints.


Versioning policy: URL-based versioning (/v1/). New version only for breaking changes. Additive changes ship without a version bump.

Deprecation notices: Deprecated features include a Sunset response header with the removal date.

Ready to integrate?

Get your API key and start building in minutes.

Vincony — Access the World's Best AI Models