AI Gateway
Unified multi-provider AI API — route to OpenAI, Anthropic, Google, Groq, Mistral, and DeepSeek with a single endpoint. Chat completions, streaming, embeddings, vector store (pgvector), model arena, and AI-powered database tools.
Bring Your Own Key (BYOK)
Add your own API keys for any supported provider. Keys are encrypted at rest using Fernet (AES-128-CBC + HMAC-SHA256). Each project manages its own provider keys independently. All requests are logged with token counts, latency, and cost for usage analytics.
Capabilities
Chat Completions
Standard + streaming (SSE) chat across all providers with normalized response format.
Embeddings + pgvector
Generate embeddings, store in managed collections, and run similarity search — all in your project's PostgreSQL.
Model Arena
Compare 2–4 models side-by-side in parallel. Evaluate quality, latency, and cost across providers.
Zy Assistant
Context-aware developer assistant that auto-gathers your project schema, auth config, and storage before answering.
AI Database Tools
Generate schemas, RLS policies, and SQL queries from natural language. Apply with approval-gated execution.
Public API
Call AI from your frontend/app via x-api-key header — no dashboard auth needed.
Supported Providers
| Provider | Models | Features |
|---|---|---|
| OpenAI | gpt-4o, gpt-4o-mini, o1, gpt-4-turbo | Chat, Stream, Embed |
| Anthropic | claude-sonnet-4, claude-3.5-sonnet, claude-3-haiku | Chat, Stream |
| Google AI | gemini-2.0-flash, gemini-1.5-pro, gemini-1.5-flash | Chat, Stream |
| Groq | llama-3.3-70b, mixtral-8x7b, gemma2-9b | Chat, Stream |
| Mistral | mistral-large, mistral-small, open-mixtral | Chat, Stream, Embed |
| DeepSeek | deepseek-chat, deepseek-coder, deepseek-reasoner | Chat, Stream |
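Routing is keyed off the model name: the gateway picks the provider that owns the requested model (responses report the resolved `provider`, as shown below). A minimal illustrative lookup built from the matrix above; the gateway's real routing logic is not shown here:

```javascript
// Illustrative model -> provider routing table, built from the matrix above.
const MODEL_PROVIDERS = {
  "gpt-4o": "openai",
  "gpt-4o-mini": "openai",
  "claude-sonnet-4": "anthropic",
  "gemini-2.0-flash": "google",
  "llama-3.3-70b": "groq",
  "mistral-large": "mistral",
  "deepseek-chat": "deepseek",
};

// Resolve which provider a request should be routed to.
function resolveProvider(model) {
  const provider = MODEL_PROVIDERS[model];
  if (!provider) throw new Error(`Unknown model: ${model}`);
  return provider;
}
```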
Manage Provider Keys
// Add or update a provider key
POST /projects/{project_id}/ai/providers
{
"provider": "openai",
"api_key": "sk-..."
}
// List configured providers (keys masked)
GET /projects/{project_id}/ai/providers
// Test provider connectivity
POST /projects/{project_id}/ai/providers/openai/test
// Remove a provider
DELETE /projects/{project_id}/ai/providers/openai
// List all available models
GET /projects/{project_id}/ai/models

Chat Completions
POST /projects/{project_id}/ai/chat
{
"model": "gpt-4o-mini",
"messages": [
{ "role": "system", "content": "You are a helpful assistant" },
{ "role": "user", "content": "What is BaaS?" }
],
"temperature": 0.7,
"max_tokens": 500
}
// Response
{
"content": "BaaS stands for Backend as a Service...",
"model": "gpt-4o-mini",
"provider": "openai",
"usage": {
"prompt_tokens": 25,
"completion_tokens": 120
},
"latency_ms": 1200
}

Streaming (SSE)
Use the streaming endpoint for real-time token delivery via Server-Sent Events:
// POST /projects/{project_id}/ai/chat/stream (same request body as /chat)
{
"model": "claude-sonnet-4",
"messages": [
{ "role": "user", "content": "Explain microservices" }
]
}
// SSE Response chunks
data: {"chunk": "Micro", "elapsed_ms": 45}
data: {"chunk": "services", "elapsed_ms": 52}
data: {"chunk": " are", "elapsed_ms": 58}
...
data: {"done": true, "ttft_ms": 45, "total_tokens": 230}

Consuming the stream from JavaScript. Note that a single network read can contain several SSE events, or a partial one, so buffer the decoded text and split on lines before parsing:

const response = await fetch("/projects/{project_id}/ai/chat/stream", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ model: "gpt-4o", messages })
});
const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = "";
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split("\n");
  buffer = lines.pop(); // keep any partial trailing line for the next read
  for (const line of lines) {
    if (!line.startsWith("data:")) continue;
    const event = JSON.parse(line.slice(5));
    if (event.chunk) console.log(event.chunk); // append to UI
  }
}

Public API (API Key Auth)
Call AI directly from your app without dashboard authentication:
curl -X POST https://api.zmesh.in/api/ai/chat \
-H "x-api-key: zb_your_api_key" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o-mini",
"messages": [
{ "role": "user", "content": "Hello!" }
]
}'

Model Arena
Compare 2–4 models side-by-side in a single request. All models run in parallel:
POST /projects/{project_id}/ai/arena
{
"messages": [
{ "role": "user", "content": "Explain REST vs GraphQL" }
],
"models": ["gpt-4o", "claude-sonnet-4", "gemini-2.0-flash"]
}
// Response — parallel results from all models
{
"results": [
{
"model": "gpt-4o",
"provider": "openai",
"content": "REST uses resource-based URLs...",
"latency_ms": 1100,
"tokens": { "prompt": 12, "completion": 200 }
},
{
"model": "claude-sonnet-4",
"provider": "anthropic",
"content": "The key differences between...",
"latency_ms": 980,
"tokens": { "prompt": 12, "completion": 185 }
},
{
"model": "gemini-2.0-flash",
"provider": "google",
"content": "REST and GraphQL are both...",
"latency_ms": 650,
"tokens": { "prompt": 12, "completion": 170 }
}
]
}

Embeddings
POST /projects/{project_id}/ai/embeddings
{
"input": "Hello world",
"model": "text-embedding-3-small"
}
// Response
{
"embeddings": [[0.0023, -0.0094, 0.0156, ...]],
"dimensions": 1536,
"model": "text-embedding-3-small"
}

Vector Store (pgvector)
Store and search vector embeddings directly in your project's PostgreSQL database using managed collections.
Managed Collections
Collections are PostgreSQL tables with a vector(N) column, auto-created and indexed.
Auto-Embedding
Upsert documents with text — embeddings are generated automatically using your configured provider.
Cosine Similarity
Search uses the <=> cosine distance operator for fast, accurate similarity matching.
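The <=> operator returns cosine distance, i.e. 1 minus cosine similarity (so 0 means identical direction and 1 means orthogonal). As a plain-JavaScript sketch of what the database computes per row:

```javascript
// Cosine distance between two equal-length vectors, matching the
// semantics of pgvector's <=> operator: 1 - cos(theta).
function cosineDistance(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

The similarity field in search responses is the complement of this distance, which is why a near-duplicate document scores close to 1.0.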
// 1. Enable pgvector extension
POST /projects/{project_id}/ai/vectors/setup
// 2. Create a collection
POST /projects/{project_id}/ai/vectors/collections
{
"name": "documents",
"dimensions": 1536
}
// 3. List collections
GET /projects/{project_id}/ai/vectors/collections
// 4. Upsert documents (auto-embeds text)
POST /projects/{project_id}/ai/vectors/collections/documents/upsert
{
"documents": [
{
"content": "zMesh is an AI-first BaaS platform",
"metadata": { "type": "docs", "page": "intro" }
},
{
"content": "Edge Functions run in sandboxed subprocesses",
"metadata": { "type": "docs", "page": "functions" }
}
],
"model": "text-embedding-3-small"
}
// 5. Semantic search
POST /projects/{project_id}/ai/vectors/collections/documents/search
{
"query": "How do serverless functions work?",
"limit": 5,
"filter": { "type": "docs" }
}
// Response
{
"results": [
{
"id": "...",
"content": "Edge Functions run in sandboxed subprocesses",
"similarity": 0.92,
"metadata": { "type": "docs", "page": "functions" }
}
]
}

Zy — Developer Assistant
Context-aware AI assistant that automatically gathers your project's database schema, auth config, and storage setup before responding. Responses are cached for 5 minutes to avoid redundant context lookups.
POST /projects/{project_id}/zy/chat
{
"message": "How do I set up RLS for my posts table?"
}
// Zy auto-gathers:
// - Your database schema (tables, columns, types)
// - Auth configuration (providers, settings)
// - Storage buckets and policies
// Then responds with project-specific guidance
// Response
{
"response": "Based on your schema, your posts table has...",
"context_used": ["schema", "auth"],
"cached": false
}

AI Database Tools
Schema Generator
Generate complete database schemas from a natural language description:
POST /projects/{project_id}/ai/schema/generate
{
"prompt": "Blog with users, posts, comments, and tags"
}
// Response — generated SQL
{
"sql": "CREATE TABLE users (\n id UUID PRIMARY KEY DEFAULT ...",
"tables": ["users", "posts", "comments", "tags", "post_tags"],
"explanation": "Created a blog schema with..."
}
// Apply (requires explicit approval)
POST /projects/{project_id}/ai/schema/apply
{
"sql": "CREATE TABLE ...",
"approved": true
}

🛡️ Schema Apply Security
- Requires approved: true in the request body
- Only allows CREATE TABLE, CREATE INDEX, and ALTER TABLE
- Blocks dangerous keywords: DROP, GRANT, COPY, TRUNCATE
- Enforces single-statement execution to prevent SQL chaining
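The checks above could look roughly like this. An illustrative sketch only, not the gateway's actual implementation (function name and return shape are invented for the example):

```javascript
// Illustrative validation mirroring the schema-apply guardrails listed above.
const ALLOWED = /^\s*(CREATE\s+TABLE|CREATE\s+INDEX|ALTER\s+TABLE)\b/i;
const BLOCKED = /\b(DROP|GRANT|COPY|TRUNCATE)\b/i;

function validateSchemaSql(sql, approved) {
  if (approved !== true) return { ok: false, reason: "approval required" };
  // Strip one trailing semicolon, then reject multi-statement input.
  const body = sql.trim().replace(/;\s*$/, "");
  if (body.includes(";")) return { ok: false, reason: "single statement only" };
  if (BLOCKED.test(body)) return { ok: false, reason: "blocked keyword" };
  if (!ALLOWED.test(body)) return { ok: false, reason: "statement type not allowed" };
  return { ok: true };
}
```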
RLS Policy Generator
POST /projects/{project_id}/ai/rls/generate
{
"prompt": "Users can only read and update their own posts"
}
// Response
{
"sql": "ALTER TABLE posts ENABLE ROW LEVEL SECURITY;\nCREATE POLICY ...",
"policies": ["posts_select_own", "posts_update_own"],
"explanation": "Created two RLS policies..."
}
// Apply
POST /projects/{project_id}/ai/rls/apply
{ "sql": "...", "approved": true }

SQL Query Builder
POST /projects/{project_id}/ai/sql/generate
{
"prompt": "Show me all users who signed up this week",
"execute": true
}
// Response
{
"sql": "SELECT * FROM users WHERE created_at >= NOW() - INTERVAL '7 days'",
"explanation": "Fetches users created in the last 7 days",
"results": [
{ "id": "...", "name": "Rahul", "created_at": "2026-03-20T..." }
]
}

API Reference
| Method | Path | Description |
|---|---|---|
| GET | /ai/providers | List configured providers |
| POST | /ai/providers | Add/update provider key |
| DELETE | /ai/providers/{provider} | Remove provider key |
| POST | /ai/providers/{provider}/test | Test provider connectivity |
| GET | /ai/models | List available models |
| POST | /ai/chat | Chat completion |
| POST | /ai/chat/stream | Streaming chat (SSE) |
| POST | /ai/arena | Compare 2–4 models in parallel |
| POST | /ai/embeddings | Generate embeddings |
| POST | /ai/vectors/setup | Enable pgvector extension |
| POST | /ai/vectors/collections | Create collection |
| GET | /ai/vectors/collections | List collections |
| POST | /ai/vectors/collections/{c}/upsert | Upsert documents (auto-embed) |
| POST | /ai/vectors/collections/{c}/search | Semantic similarity search |
| POST | /ai/schema/generate | Generate DB schema from prompt |
| POST | /ai/schema/apply | Apply generated schema |
| POST | /ai/rls/generate | Generate RLS policies from prompt |
| POST | /ai/rls/apply | Apply generated RLS policies |
| POST | /ai/sql/generate | Generate + optionally execute SQL |
| GET | /ai/usage | Usage analytics & logs |
| POST | /api/ai/chat | Public chat (API key auth) |
| POST | /zy/chat | Zy developer assistant |
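Project-scoped routes in the table are mounted under /projects/{project_id}, while the public chat route keeps its /api prefix. A tiny helper for building project-scoped URLs; the base URL here is taken from the public API example and may differ for your deployment:

```javascript
// Build an absolute URL for a project-scoped AI Gateway route.
// BASE_URL is an assumption borrowed from the public API example above.
const BASE_URL = "https://api.zmesh.in";

function aiEndpoint(projectId, path) {
  // e.g. aiEndpoint("p1", "/ai/chat") -> ".../projects/p1/ai/chat"
  return `${BASE_URL}/projects/${encodeURIComponent(projectId)}${path}`;
}
```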
Usage Analytics
Every AI request is logged with model, provider, token counts, and latency. Query usage via API:
// Usage stats (last 30 days)
GET /projects/{project_id}/ai/usage?days=30
// Response
{
"total_requests": 1250,
"total_tokens": { "prompt": 450000, "completion": 320000 },
"by_provider": {
"openai": { "requests": 800, "tokens": 500000 },
"anthropic": { "requests": 350, "tokens": 220000 },
"google": { "requests": 100, "tokens": 50000 }
},
"recent_logs": [...]
}
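For dashboards it is handy to reduce the by_provider breakdown client-side. A small helper over the response shape above (helper name is illustrative):

```javascript
// Summarize the by_provider breakdown from GET /ai/usage.
function summarizeUsage(byProvider) {
  let requests = 0, tokens = 0;
  for (const stats of Object.values(byProvider)) {
    requests += stats.requests;
    tokens += stats.tokens;
  }
  return { requests, tokens };
}

// With the figures from the response above:
const totals = summarizeUsage({
  openai: { requests: 800, tokens: 500000 },
  anthropic: { requests: 350, tokens: 220000 },
  google: { requests: 100, tokens: 50000 },
});
```

Note that the per-provider token counts sum to the prompt plus completion totals: 450,000 + 320,000 = 770,000.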