AI Gateway

Unified multi-provider AI API — route to OpenAI, Anthropic, Google, Groq, Mistral, and DeepSeek with a single endpoint. Chat completions, streaming, embeddings, vector store (pgvector), model arena, and AI-powered database tools.

🔑 Bring Your Own Key (BYOK)

Add your own API keys for any supported provider. Keys are encrypted at rest using Fernet (AES-128-CBC + HMAC-SHA256). Each project manages its own provider keys independently. All requests are logged with token counts, latency, and cost for usage analytics.

Capabilities

💬 Chat Completions

Standard + streaming (SSE) chat across all providers with normalized response format.

📐 Embeddings + pgvector

Generate embeddings, store in managed collections, and run similarity search — all in your project's PostgreSQL.

⚔️ Model Arena

Compare 2–4 models side-by-side in parallel. Evaluate quality, latency, and cost across providers.

🧠 Zy Assistant

Context-aware developer assistant that auto-gathers your project schema, auth config, and storage before answering.

🏗️ AI Database Tools

Generate schemas, RLS policies, and SQL queries from natural language. Apply with approval-gated execution.

🌐 Public API

Call AI from your frontend or app via the x-api-key header — no dashboard auth needed.

Supported Providers

| Provider  | Models                                              | Features            |
|-----------|-----------------------------------------------------|---------------------|
| OpenAI    | gpt-4o, gpt-4o-mini, o1, gpt-4-turbo                | Chat, Stream, Embed |
| Anthropic | claude-sonnet-4, claude-3.5-sonnet, claude-3-haiku  | Chat, Stream        |
| Google AI | gemini-2.0-flash, gemini-1.5-pro, gemini-1.5-flash  | Chat, Stream        |
| Groq      | llama-3.3-70b, mixtral-8x7b, gemma2-9b              | Chat, Stream        |
| Mistral   | mistral-large, mistral-small, open-mixtral          | Chat, Stream, Embed |
| DeepSeek  | deepseek-chat, deepseek-coder, deepseek-reasoner    | Chat, Stream        |

Manage Provider Keys

Provider Key Management (Encrypted)
// Add or update a provider key
POST /projects/{project_id}/ai/providers
{
  "provider": "openai",
  "api_key": "sk-..."
}

// List configured providers (keys masked)
GET /projects/{project_id}/ai/providers

// Test provider connectivity
POST /projects/{project_id}/ai/providers/openai/test

// Remove a provider
DELETE /projects/{project_id}/ai/providers/openai

// List all available models
GET /projects/{project_id}/ai/models

Chat Completions

POST /projects/{project_id}/ai/chat (Auth Required)
{
  "model": "gpt-4o-mini",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant" },
    { "role": "user", "content": "What is BaaS?" }
  ],
  "temperature": 0.7,
  "max_tokens": 500
}

// Response
{
  "content": "BaaS stands for Backend as a Service...",
  "model": "gpt-4o-mini",
  "provider": "openai",
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 120
  },
  "latency_ms": 1200
}

Streaming (SSE)

Use the streaming endpoint for real-time token delivery via Server-Sent Events:

POST /projects/{project_id}/ai/chat/stream (SSE Stream)
// Same request body as /chat
{
  "model": "claude-sonnet-4",
  "messages": [
    { "role": "user", "content": "Explain microservices" }
  ]
}

// SSE Response chunks
data: {"chunk": "Micro", "elapsed_ms": 45}
data: {"chunk": "services", "elapsed_ms": 52}
data: {"chunk": " are", "elapsed_ms": 58}
...
data: {"done": true, "ttft_ms": 45, "total_tokens": 230}
Frontend Usage (JavaScript)
const response = await fetch("/projects/{project_id}/ai/chat/stream", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ model: "gpt-4o", messages })
});

const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = "";

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split("\n");
  buffer = lines.pop(); // keep any incomplete line for the next read
  for (const line of lines) {
    if (!line.startsWith("data: ")) continue;
    const event = JSON.parse(line.slice(6));
    if (event.chunk) console.log(event.chunk); // append to UI
  }
}

Public API (API Key Auth)

Call AI directly from your app without dashboard authentication:

POST /api/ai/chat (API Key)
curl -X POST https://api.zmesh.in/api/ai/chat \
  -H "x-api-key: zb_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      { "role": "user", "content": "Hello!" }
    ]
  }'
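The same call from JavaScript might look like the sketch below; buildPublicChatRequest is a hypothetical helper (not part of the SDK), and zb_your_api_key is a placeholder:

```javascript
// Build fetch options for the public chat endpoint.
// The x-api-key header carries your project API key.
function buildPublicChatRequest(apiKey, model, messages) {
  return {
    method: "POST",
    headers: {
      "x-api-key": apiKey,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ model, messages }),
  };
}

// Usage (uncomment to actually call the API):
// const res = await fetch(
//   "https://api.zmesh.in/api/ai/chat",
//   buildPublicChatRequest("zb_your_api_key", "gpt-4o-mini", [
//     { role: "user", content: "Hello!" },
//   ])
// );
```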

Model Arena

Compare 2–4 models side-by-side in a single request. All models run in parallel:

POST /projects/{project_id}/ai/arena (Arena)
{
  "messages": [
    { "role": "user", "content": "Explain REST vs GraphQL" }
  ],
  "models": ["gpt-4o", "claude-sonnet-4", "gemini-2.0-flash"]
}

// Response — parallel results from all models
{
  "results": [
    {
      "model": "gpt-4o",
      "provider": "openai",
      "content": "REST uses resource-based URLs...",
      "latency_ms": 1100,
      "tokens": { "prompt": 12, "completion": 200 }
    },
    {
      "model": "claude-sonnet-4",
      "provider": "anthropic",
      "content": "The key differences between...",
      "latency_ms": 980,
      "tokens": { "prompt": 12, "completion": 185 }
    },
    {
      "model": "gemini-2.0-flash",
      "provider": "google",
      "content": "REST and GraphQL are both...",
      "latency_ms": 650,
      "tokens": { "prompt": 12, "completion": 170 }
    }
  ]
}
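To pick a winner programmatically, you can sort the returned results — for example by latency (a sketch against the response shape above; the ranking criterion is illustrative):

```javascript
// Rank arena results by latency, fastest first.
// Each result has { model, latency_ms, ... } as in the response above.
function rankByLatency(results) {
  return [...results].sort((a, b) => a.latency_ms - b.latency_ms);
}

const results = [
  { model: "gpt-4o", latency_ms: 1100 },
  { model: "claude-sonnet-4", latency_ms: 980 },
  { model: "gemini-2.0-flash", latency_ms: 650 },
];

console.log(rankByLatency(results).map((r) => r.model));
// → ["gemini-2.0-flash", "claude-sonnet-4", "gpt-4o"]
```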

Embeddings

POST /projects/{project_id}/ai/embeddings (Embeddings)
{
  "input": "Hello world",
  "model": "text-embedding-3-small"
}

// Response
{
  "embeddings": [[0.0023, -0.0094, 0.0156, ...]],
  "dimensions": 1536,
  "model": "text-embedding-3-small"
}
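Embeddings returned here can be compared with cosine similarity, the same metric the vector store uses for search. A minimal sketch on toy vectors (real embeddings have 1536 dimensions):

```javascript
// Cosine similarity between two embedding vectors of equal length.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Vectors pointing the same way score 1; orthogonal vectors score 0.
console.log(cosineSimilarity([1, 0], [2, 0])); // → 1
console.log(cosineSimilarity([1, 0], [0, 1])); // → 0
```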

Vector Store (pgvector)

Store and search vector embeddings directly in your project's PostgreSQL database using managed collections.

Managed Collections

Collections are PostgreSQL tables with a vector(N) column, auto-created and indexed.

Auto-Embedding

Upsert documents with text — embeddings are generated automatically using your configured provider.

Cosine Similarity

Search uses the <=> cosine distance operator for fast, accurate similarity matching.

Vector Store Workflow (pgvector)
// 1. Enable pgvector extension
POST /projects/{project_id}/ai/vectors/setup

// 2. Create a collection
POST /projects/{project_id}/ai/vectors/collections
{
  "name": "documents",
  "dimensions": 1536
}

// 3. List collections
GET /projects/{project_id}/ai/vectors/collections

// 4. Upsert documents (auto-embeds text)
POST /projects/{project_id}/ai/vectors/collections/documents/upsert
{
  "documents": [
    {
      "content": "zMesh is an AI-first BaaS platform",
      "metadata": { "type": "docs", "page": "intro" }
    },
    {
      "content": "Edge Functions run in sandboxed subprocesses",
      "metadata": { "type": "docs", "page": "functions" }
    }
  ],
  "model": "text-embedding-3-small"
}

// 5. Semantic search
POST /projects/{project_id}/ai/vectors/collections/documents/search
{
  "query": "How do serverless functions work?",
  "limit": 5,
  "filter": { "type": "docs" }
}

// Response
{
  "results": [
    {
      "id": "...",
      "content": "Edge Functions run in sandboxed subprocesses",
      "similarity": 0.92,
      "metadata": { "type": "docs", "page": "functions" }
    }
  ]
}

Zy — Developer Assistant

Context-aware AI assistant that automatically gathers your project's database schema, auth config, and storage setup before responding. Responses are cached for 5 minutes to avoid redundant context lookups.

POST /projects/{project_id}/zy/chat (Zy)
{
  "message": "How do I set up RLS for my posts table?"
}

// Zy auto-gathers:
// - Your database schema (tables, columns, types)
// - Auth configuration (providers, settings)
// - Storage buckets and policies
// Then responds with project-specific guidance

// Response
{
  "response": "Based on your schema, your posts table has...",
  "context_used": ["schema", "auth"],
  "cached": false
}

AI Database Tools

Schema Generator

Generate complete database schemas from a natural language description:

POST /projects/{project_id}/ai/schema/generate
{
  "prompt": "Blog with users, posts, comments, and tags"
}

// Response — generated SQL
{
  "sql": "CREATE TABLE users (\n  id UUID PRIMARY KEY DEFAULT ...",
  "tables": ["users", "posts", "comments", "tags", "post_tags"],
  "explanation": "Created a blog schema with..."
}

// Apply (requires explicit approval)
POST /projects/{project_id}/ai/schema/apply
{
  "sql": "CREATE TABLE ...",
  "approved": true
}

🛡️ Schema Apply Security

  • Requires approved: true in the request body
  • Only allows CREATE TABLE, CREATE INDEX, and ALTER TABLE
  • Blocks dangerous keywords: DROP, GRANT, COPY, TRUNCATE
  • Enforces single-statement execution to prevent SQL chaining
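You can mirror these rules client-side before submitting generated SQL for approval. The sketch below implements the checks listed above; it is an illustration, not the server's actual validator:

```javascript
// Sanity-check generated SQL against the schema-apply rules.
// Returns a list of violations (empty means it looks applyable).
function checkSchemaSql(sql) {
  const violations = [];
  const trimmed = sql.trim().replace(/;+\s*$/, "");

  // Single-statement execution: no embedded semicolons.
  if (trimmed.includes(";")) violations.push("multiple statements");

  // Only CREATE TABLE, CREATE INDEX, and ALTER TABLE are allowed.
  if (!/^(CREATE\s+TABLE|CREATE\s+INDEX|ALTER\s+TABLE)\b/i.test(trimmed)) {
    violations.push("disallowed statement type");
  }

  // Blocked keywords anywhere in the statement.
  for (const kw of ["DROP", "GRANT", "COPY", "TRUNCATE"]) {
    if (new RegExp(`\\b${kw}\\b`, "i").test(trimmed)) {
      violations.push(`blocked keyword: ${kw}`);
    }
  }
  return violations;
}

console.log(checkSchemaSql("CREATE TABLE users (id UUID PRIMARY KEY);"));
// → []
console.log(checkSchemaSql("DROP TABLE users;"));
// → ["disallowed statement type", "blocked keyword: DROP"]
```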

RLS Policy Generator

POST /projects/{project_id}/ai/rls/generate
{
  "prompt": "Users can only read and update their own posts"
}

// Response
{
  "sql": "ALTER TABLE posts ENABLE ROW LEVEL SECURITY;\nCREATE POLICY ...",
  "policies": ["posts_select_own", "posts_update_own"],
  "explanation": "Created two RLS policies..."
}

// Apply
POST /projects/{project_id}/ai/rls/apply
{ "sql": "...", "approved": true }

SQL Query Builder

POST /projects/{project_id}/ai/sql/generate
{
  "prompt": "Show me all users who signed up this week",
  "execute": true
}

// Response
{
  "sql": "SELECT * FROM users WHERE created_at >= NOW() - INTERVAL '7 days'",
  "explanation": "Fetches users created in the last 7 days",
  "results": [
    { "id": "...", "name": "Rahul", "created_at": "2026-03-20T..." }
  ]
}

API Reference

| Method | Path                                  | Description                       |
|--------|---------------------------------------|-----------------------------------|
| GET    | /ai/providers                         | List configured providers         |
| POST   | /ai/providers                         | Add/update provider key           |
| DELETE | /ai/providers/{provider}              | Remove provider key               |
| POST   | /ai/providers/{provider}/test         | Test provider connectivity        |
| GET    | /ai/models                            | List available models             |
| POST   | /ai/chat                              | Chat completion                   |
| POST   | /ai/chat/stream                       | Streaming chat (SSE)              |
| POST   | /ai/arena                             | Compare 2–4 models in parallel    |
| POST   | /ai/embeddings                        | Generate embeddings               |
| POST   | /ai/vectors/setup                     | Enable pgvector extension         |
| POST   | /ai/vectors/collections               | Create collection                 |
| GET    | /ai/vectors/collections               | List collections                  |
| POST   | /ai/vectors/collections/{c}/upsert    | Upsert documents (auto-embed)     |
| POST   | /ai/vectors/collections/{c}/search    | Semantic similarity search        |
| POST   | /ai/schema/generate                   | Generate DB schema from prompt    |
| POST   | /ai/schema/apply                      | Apply generated schema            |
| POST   | /ai/rls/generate                      | Generate RLS policies from prompt |
| POST   | /ai/rls/apply                         | Apply generated RLS policies      |
| POST   | /ai/sql/generate                      | Generate + optionally execute SQL |
| GET    | /ai/usage                             | Usage analytics & logs            |
| POST   | /api/ai/chat                          | Public chat (API key auth)        |
| POST   | /zy/chat                              | Zy developer assistant            |

Usage Analytics

Every AI request is logged with model, provider, token counts, and latency. Query usage via API:

GET /projects/{project_id}/ai/usage
// Usage stats (last 30 days)
GET /projects/{project_id}/ai/usage?days=30

// Response
{
  "total_requests": 1250,
  "total_tokens": { "prompt": 450000, "completion": 320000 },
  "by_provider": {
    "openai": { "requests": 800, "tokens": 500000 },
    "anthropic": { "requests": 350, "tokens": 220000 },
    "google": { "requests": 100, "tokens": 50000 }
  },
  "recent_logs": [...]
}
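A small helper can roll up per-provider counts from the response above (a sketch against that response shape; the field names match the example):

```javascript
// Sum request counts across providers from a usage response.
function totalProviderRequests(usage) {
  return Object.values(usage.by_provider)
    .reduce((sum, p) => sum + p.requests, 0);
}

const usage = {
  total_requests: 1250,
  by_provider: {
    openai: { requests: 800, tokens: 500000 },
    anthropic: { requests: 350, tokens: 220000 },
    google: { requests: 100, tokens: 50000 },
  },
};

console.log(totalProviderRequests(usage)); // → 1250
```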