Supavector — Memory Infrastructure for AI Agents

Product

Supavector gives AI apps a governed memory layer.

Put retrieval, memory retention, access control, and feedback loops behind one backend surface instead of spreading them across your app.

Supavector is open source. The source code, installer scripts, and self-hosted setup files are available in the GitHub repository.

Self-hosted, tenant-aware, policy-driven recall.

Runtime loop

One backend surface for memory.

Keep your app in front and move ingest, recall, governance, and improvement behind a single platform.

Ingest: docs and memory writes

Index docs, URLs, and runtime memories.

Recall: search and ask

Retrieve through one policy-aware path.

Govern: tenant and access rules

Apply tenancy, visibility, and auth controls.

Improve: feedback and lifecycle

Keep memory quality moving over time.

Use cases: customer support copilots, internal knowledge agents, workflow assistants, research copilots, operations agents, and specs and documentation systems.

Hosted API

Get a token, add credit, start calling.

No servers to run. Sign up, create a project in the Dashboard, add credit, and you're calling the API in minutes.

Prefer zero cost and full control? Supavector is fully open source: run it yourself, free forever. See the self-hosted section further down this page.

Sign up & create a project

Sign in with Google, GitHub, or email. Click Dashboard → New Project. Copy the token shown — it's only displayed once.

supav_••••••••••••••••••••••••••••••••

Add credit for AI generation

Dashboard → + Add Credit. Choose an amount, pay via Stripe. Indexing and search are free — only /ask and /boolean_ask deduct credit.

Indexing: free. Search: free. Ask: uses credit.

Wire up your environment

Set two env vars in your app, agent, or script. That's the entire setup — no Docker, no Postgres, no config files.

SUPAVECTOR_BASE_URL=https://your-host
SUPAVECTOR_API_KEY=supav_your_token

API examples

Index a document (curl)

curl -X POST "${SUPAVECTOR_BASE_URL}/v1/docs" \
  -H "Authorization: Bearer ${SUPAVECTOR_API_KEY}" \
  -H "Idempotency-Key: doc-001" \
  -H "Content-Type: application/json" \
  -d '{
    "docId":      "welcome",
    "collection": "default",
    "text":       "Supavector stores memory for agents."
  }'

Ask a question (curl, uses credit)

curl -X POST "${SUPAVECTOR_BASE_URL}/v1/ask" \
  -H "Authorization: Bearer ${SUPAVECTOR_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "question": "What does Supavector store?",
    "k":        5,
    "policy":   "amvl"
  }'

Index a document (Python)

import os, requests

BASE = os.environ["SUPAVECTOR_BASE_URL"]
KEY  = os.environ["SUPAVECTOR_API_KEY"]
HDR  = {
    "Authorization": f"Bearer {KEY}",
    "Content-Type":  "application/json",
}

requests.post(
    f"{BASE}/v1/docs",
    headers={**HDR, "Idempotency-Key": "doc-001"},
    json={
        "docId":      "welcome",
        "collection": "default",
        "text":       "Supavector stores memory for agents.",
    },
).raise_for_status()

Ask a question (Python, uses credit)

r = requests.post(
    f"{BASE}/v1/ask",
    headers=HDR,
    json={
        "question": "What does Supavector store?",
        "k":        5,
        "policy":   "amvl",
    },
)
r.raise_for_status()
print(r.json()["data"]["answer"])

Index a document (Node.js)

const BASE = process.env.SUPAVECTOR_BASE_URL;
const KEY  = process.env.SUPAVECTOR_API_KEY;

async function ar(path, body, extra = {}) {
  const res = await fetch(`${BASE}${path}`, {
    method:  "POST",
    headers: {
      "Authorization": `Bearer ${KEY}`,
      "Content-Type":  "application/json",
      ...extra,
    },
    body: JSON.stringify(body),
  });
  if (!res.ok) throw new Error(await res.text());
  return res.json();
}

await ar("/v1/docs",
  { docId: "welcome", collection: "default",
    text: "Supavector stores memory for agents." },
  { "Idempotency-Key": "doc-001" }
);

Ask a question (Node.js, uses credit)

const { data } = await ar("/v1/ask", {
  question: "What does Supavector store?",
  k:        5,
  policy:   "amvl",
});

console.log(data.answer);

Index a document (JavaScript, fetch)

const BASE = "https://your-host";
const KEY  = "supav_your_token"; // keep server-side in prod

await fetch(`${BASE}/v1/docs`, {
  method:  "POST",
  headers: {
    "Authorization":  `Bearer ${KEY}`,
    "Content-Type":   "application/json",
    "Idempotency-Key": "doc-001",
  },
  body: JSON.stringify({
    docId:      "welcome",
    collection: "default",
    text:       "Supavector stores memory for agents.",
  }),
});

Ask a question (JavaScript, fetch, uses credit)

const res = await fetch(`${BASE}/v1/ask`, {
  method:  "POST",
  headers: {
    "Authorization": `Bearer ${KEY}`,
    "Content-Type":  "application/json",
  },
  body: JSON.stringify({
    question: "What does Supavector store?",
    k:        5,
    policy:   "amvl",
  }),
});

const { data } = await res.json();
console.log(data.answer);

If AI generation is blocked

402 CREDIT_REQUIRED — balance is zero. Add credit from the Dashboard.

503 CREDIT_CHECK_FAILED — transient error. Retry with backoff.

Self-hosted tokens (no supav_ prefix) bypass the credit system entirely.
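The two status codes above call for different handling: a 402 needs someone to add credit, while a 503 is worth retrying. A minimal Python sketch of that split, assuming exponential backoff is acceptable for your workload. The `send` callable, the `CreditRequired` exception, and the retry settings are hypothetical names for illustration, not part of Supavector.

```python
import time


class CreditRequired(Exception):
    """Raised when the API answers 402 (CREDIT_REQUIRED)."""


def ask_with_retry(send, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call send() and retry only on 503.

    send is a hypothetical zero-argument callable that performs the
    HTTP request and returns (status_code, body_dict).
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status == 402:
            # Balance is zero: retrying cannot succeed.
            raise CreditRequired("add credit from the Dashboard")
        if status == 503 and attempt < max_attempts - 1:
            # Transient credit-check failure: wait, then try again.
            sleep(base_delay * (2 ** attempt))
            continue
        # Success, a non-retryable error, or the last 503.
        return status, body
```

In a real client, `send` would wrap the `requests.post` call to /v1/ask shown earlier and return `(r.status_code, r.json())`.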

Self-Host And Build On Top

Up and running in minutes.

Install the CLI, start the stack, and write your first memory in under five minutes. Full control — your infra, your data.

Clone and install the CLI

Clone the repo, then run the installer. Works on macOS, Linux, and Windows.

git clone https://github.com/Emmanuel-Bamidele/supavector.git
cd supavector && ./scripts/install.sh

Start the stack

One command brings up the vector core, gateway, and Postgres locally.

supavector start

Write your first memory

Use the CLI or call the API directly with your service token.

supavector write --text "Hello, memory"

Ask a question

Query the memory you just wrote. Supavector retrieves and answers grounded in your data.

supavector ask --question "What did I store?"

Add to your app env

Export the base URL and service token so your backend can reach Supavector.

export SUPAVECTOR_BASE_URL="http://localhost:3000"
export SUPAVECTOR_API_KEY="YOUR_SERVICE_TOKEN"

Call server-to-server

Your backend calls Supavector directly. No browser, no end-user login needed.

curl -X POST "$SUPAVECTOR_BASE_URL/v1/ask" \
  -H "X-API-Key: $SUPAVECTOR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"question":"What did I store?","k":3}'

What shifts

Move memory out of app glue and into infrastructure.

Teams stop stitching together retrieval logic, ACL rules, expiry behavior, and quality loops in five different places. Supavector centralizes the memory layer so product code can stay focused on the actual experience.

Keep your frontend, backend, and agent orchestration where they already belong.
Use one governed path for writes, search, ask, and memory recall.
Find setup, auth, API, and policy details in Documentation rather than on this page.

Best fit

Customer assistants, internal copilots, research agents, and controlled automation where memory quality matters after day one.

Built in

Tenant scoping, service-token auth, retrieval policies, and lifecycle controls are already part of the platform surface.

Open Source

Built in the open. MIT licensed.

Supavector is fully open source. Every line (the C++ vector core, Node.js gateway, installer scripts, and Docker Compose files) is available to inspect, fork, and self-host. You do not need the hosted service to run it, and there is no vendor lock-in and no black-box dependency.

MIT License Self-hostable No vendor lock-in No telemetry by default

View on GitHub

Fork it, deploy it, or contribute. The whole platform is yours to run.

How it fits

Keep your app. Add Supavector behind it.

Supavector becomes the memory layer behind your existing product instead of replacing your product with a new one.

Connect sources

Index docs and URLs, or write memory directly from your app, workers, and automations.

Retrieve with policy

Search, ask, and recall through one platform while choosing TTL, LRU, or AMVL for the workload.

Govern and improve

Apply tenant controls, enforce access rules, and keep memory quality moving in the right direction over time.

Explore further

See setup, API details, and the open source code.

Documentation covers setup, auth, service tokens, API endpoints, memory policies, and integration examples. If you want to inspect or fork Supavector itself, the source code is in the GitHub repository.

Setup paths Auth + service tokens API reference Memory policies

Playground

Ingest content, then search or ask with the same context — all in one place.

Ingest

Paste text, upload a file, or index a link.

Collection

Collection is your content bucket/namespace. Use default if you do not need separation. Valid names use only letters, numbers, ., -, and _ (example: team_docs, v2.alpha). Invalid: spaces or characters like /, #, ?.

Doc ID

Use letters, numbers, dot, dash, or underscore (no spaces).
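Collection names and doc IDs share the same character rule, so a client can reject bad names before sending a request. A minimal Python sketch, assuming "letters" means ASCII letters and that no length limit applies (neither is stated here). The function name `is_valid_name` is hypothetical.

```python
import re

# Letters, numbers, dot, dash, and underscore; nothing else.
_NAME_RE = re.compile(r"[A-Za-z0-9._-]+")


def is_valid_name(name: str) -> bool:
    """Return True if name uses only the allowed characters."""
    return bool(_NAME_RE.fullmatch(name))
```

With this check, `team_docs` and `v2.alpha` pass, while names containing spaces, `/`, `#`, or `?` are rejected. The server remains the authority; an INVALID_DOC_ID response can still occur.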

Indexing pipeline

We split the text into chunks, embed each chunk, store vectors in the C++ TCP service, and store chunk text in Postgres for previews and citations.
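To make the first pipeline step concrete, here is a minimal Python sketch of splitting text into overlapping chunks. It is an illustration only: the chunk size, the overlap, and the character-based strategy are assumptions, and Supavector's actual chunker may work differently. The function name `split_into_chunks` is hypothetical.

```python
def split_into_chunks(text: str, size: int = 800, overlap: int = 100) -> list[str]:
    """Split text into fixed-size character chunks that share `overlap` characters."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            # This chunk already reaches the end of the text.
            break
        start += step
    return chunks
```

Overlap keeps a sentence that straddles a boundary retrievable from either neighbour, at the cost of storing some text twice.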

Document Text

You can also load a local file or index directly from a URL.

Upload file (text, PDF, Word .docx)

Text files are read directly. PDF and .docx are extracted in your browser into Document Text.

Index from URL

If a URL is provided, the server fetches and indexes it.

Search

Semantic search across your indexed documents.

Query

Top K

Higher K returns more chunks (slower + more context).

Collection scope

Scope retrieval to one collection, or keep All collections.

Mode

Retrieval policy mode. AMVL is the default.

Ask

RAG answer grounded in your indexed sources.

Question

Top K sources

We retrieve top-K chunks, then ask the model to answer using them.

Collection scope

Scope answer retrieval to one collection, or keep All collections.

Answer length

Choose response depth. Auto adapts to available source detail.

Mode

Answer retrieval policy mode. AMVL is the default.

Provider override

Optional per-request provider override for ask or boolean_ask.

Model override

Optional per-request model override. The list updates when you change provider, and Supavector also exposes the live preset catalog at GET /v1/models.

Use True / False only when the response must be exactly true, false, or invalid.

Backend metrics

Health, storage, and vector index stats for operators.

Organization usage

Admin view of requests, tokens, storage, and latency for your tenant.

Requires admin privileges.

Job ID (optional)

Leave empty to load all in-progress jobs for your tenant.

Collections

All collections for your tenant, with document titles and delete controls.

Deleting a collection removes stored chunk text and memory items. Vector deletion is not yet supported.

Documentation

Pick the right tab for where you are.

Each tab is self-contained. You do not need to read them in order.

Overview

Auth model, API reference, error codes, security controls, SDK setup, and request examples with curl, Python, and Node.js.

Read this when you need API or security details.

CLI Setup

Install the CLI, onboard the local Docker stack, connect your app, and use the full command reference.

Start here if you are setting up for the first time.

Setup Modes

Self-hosted, external Postgres, shared deployment, backend-as-caller, and human admin paths with credential rules.

Read this when choosing how to deploy.

Memory Policies

TTL, LRU, and AMVL retention policies explained with trade-offs, use cases, and API request examples.

Read this when choosing a memory retention policy.

Quickstart

Recommended quickstart: use the CLI. On the self-hosted path, supavector onboard handles the first local bootstrap for you, including the first admin and the first service token. You do not need to run a separate human bootstrap command first unless you intentionally want the raw manual Docker path.

Recommended path: CLI

./scripts/install.sh
supavector onboard
supavector write --doc-id welcome --collection local-demo --text "Supavector stores memory for agents."
supavector ask --question "What does Supavector store?" --collection local-demo
supavector boolean_ask --question "Does Supavector store memory for agents?" --collection local-demo

Use the CLI Setup tab if you want the full install matrix, Windows command, one-line remote install, and the full CLI command reference.

What supavector onboard does

Prompts for admin username, admin password, tenant id, provider/model defaults, and whichever provider API keys are needed for those defaults.
Writes the local env file and stages the local CLI config.
Starts the Supavector Docker stack and runs the bootstrap helper.
Creates the first admin user and the first service token.
Updates ~/.supavector/config.json with the saved base URL and service token for later CLI use.

How the first service token is obtained

On the recommended CLI path, the first service token is created automatically during supavector onboard. The CLI runs the same bootstrap logic for you and stores the returned token locally.

If you self-host manually instead of using the CLI, the same first token comes from scripts/bootstrap_instance.js. If someone else runs Supavector for you, ask that admin for SUPAVECTOR_BASE_URL and a tenant-scoped service token.

Open source repository

Supavector is open source. The source code, installer scripts, Docker files, and docs source are available in the GitHub repository.

Use Supavector on this same computer

You do not need to copy the token anywhere just to use the CLI locally. After onboarding, the CLI reads the saved config and sends the saved service token automatically.

supavector write --doc-id welcome --collection local-demo --text "Supavector stores memory for agents."
supavector search --q "memory for agents" --collection local-demo --k 5
supavector ask --question "What does Supavector store?" --collection local-demo
supavector boolean_ask --question "Does Supavector store memory for agents?" --collection local-demo

Use the same collection name on both write and read operations when you want tighter control over what the CLI searches. Later, refresh the installed CLI checkout with supavector update.

Ingest a folder and use the folder name as the collection

If you point the CLI at a folder, it ingests supported files inside it. Plain text files are read directly, while .pdf and .docx files are extracted to text locally before indexing. By default, the folder name becomes the collection name.

supavector write --folder ./customer-support
supavector search --q "refund policy" --collection customer-support --k 5
supavector ask --question "What is the refund policy?" --collection customer-support
supavector boolean_ask --question "Does the refund policy mention store credit?" --collection customer-support

You can override the derived collection with --collection. Unsupported files are skipped. Legacy .doc files still need conversion to .docx first.

Use Supavector in your app or script on this machine

If you are building your own app, backend, worker, or agent, export the saved base URL and service token into your runtime environment.

export SUPAVECTOR_BASE_URL="http://localhost:3000"
export SUPAVECTOR_API_KEY="YOUR_SERVICE_TOKEN"

The token is shown during onboarding. If you need to inspect the saved local values again on your own machine, run supavector config show --show-secrets and copy the values into your app env or secret manager. Do not commit those values to git.

First app request example

curl -X POST "${SUPAVECTOR_BASE_URL}/v1/docs" \
  -H "X-API-Key: ${SUPAVECTOR_API_KEY}" \
  -H "Idempotency-Key: idx-001" \
  -H "Content-Type: application/json" \
  -d '{
    "docId": "welcome",
    "collection": "default",
    "text": "Supavector stores memory for agents."
  }'

curl -X POST "${SUPAVECTOR_BASE_URL}/v1/ask" \
  -H "X-API-Key: ${SUPAVECTOR_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{"question":"What does Supavector store?","k":3}'

curl -X POST "${SUPAVECTOR_BASE_URL}/v1/boolean_ask" \
  -H "X-API-Key: ${SUPAVECTOR_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{"question":"Does Supavector store memory for agents?","k":3}'

Key concepts

Human admin: a person who can sign in to Supavector with username/password or SSO and manage the tenant.
Service token: the machine credential used by apps, backends, workers, and agents. In env vars, this is usually SUPAVECTOR_API_KEY.
Self-hosted: you run Supavector yourself and create the first admin and token yourself, either through the CLI or the manual setup path.
Shared deployment: Supavector is already running somewhere else. An existing admin gives you a base URL plus a token, or signs in and creates one for you.

Repository guides for fork users

docs/self-hosting.md for the full self-hosting path
docs/bring-your-own-postgres.md for external Postgres
docs/agents.md for app and agent runtime integration
docker-compose.external-postgres.yml and .env.external-postgres.example for the built-in BYO Postgres path

Per-request provider key

X-API-Key: YOUR_SERVICE_TOKEN
X-OpenAI-API-Key: YOUR_OPENAI_KEY
X-Gemini-API-Key: YOUR_GEMINI_KEY
X-Anthropic-API-Key: YOUR_ANTHROPIC_KEY

Use this when you want Supavector to keep its own Postgres/auth/runtime but use your provider key for a specific request. This is request-scoped; Supavector does not persist the key for the tenant.

Looking for manual Docker setup?

Open the Setup Modes tab. That tab has the bundled-stack, BYO Postgres, shared-deployment, backend-as-caller, and human-admin paths, including the raw bootstrap_instance.js commands.

Authentication

Recommended auth model: JWTs for human admins and browser sessions, service tokens for apps, agents, workers, CI, and server-to-server integrations.

Protected endpoints require either a JWT or a service token. Send Authorization: Bearer <jwt> for a JWT, or X-API-Key: <token> for a service token (Authorization: ApiKey <token> is also accepted).

Multi-tenant isolation is enforced by the auth token. JWTs must include a tenant identifier using one of these claims: tenant, tid, or sub. Service tokens inherit the tenant from the admin who created them.
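The claim lookup can be pictured with a short Python sketch. It assumes the claims are checked in the order listed (tenant, then tid, then sub) and that the first non-empty string wins; that precedence is an assumption, not documented behaviour. The function name `tenant_from_claims` is hypothetical, and the input is an already-verified claims dictionary.

```python
def tenant_from_claims(claims: dict):
    """Return the tenant identifier from decoded JWT claims, or None."""
    for name in ("tenant", "tid", "sub"):
        value = claims.get(name)
        if isinstance(value, str) and value:
            return value
    # No usable tenant claim: the token cannot be scoped to a tenant.
    return None
```

For example, `{"tid": "acme", "sub": "user-1"}` resolves to `acme` under this ordering.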

Admins can issue service tokens via POST /v1/admin/service-tokens. The bootstrap helper also creates the first token. Store returned tokens securely; they are shown only once.

If your application already has its own user auth, keep it there. Let your backend call Supavector with a service token instead of making every end user or agent log in directly to Supavector.

Optional third mode: keep Supavector auth the same, but send the matching provider-key header (X-OpenAI-API-Key, X-Gemini-API-Key, or X-Anthropic-API-Key) on supported sync requests when you want your own model-provider key to be used instead of the server default.
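The header combinations described in this section can be assembled in one place. A minimal Python sketch, using only the header names given above; the helper name `build_headers` and its parameters are hypothetical.

```python
# Provider name -> request-scoped provider-key header.
PROVIDER_HEADERS = {
    "openai": "X-OpenAI-API-Key",
    "gemini": "X-Gemini-API-Key",
    "anthropic": "X-Anthropic-API-Key",
}


def build_headers(service_token=None, jwt=None, provider_keys=None):
    """Build request headers for exactly one credential type."""
    if bool(service_token) == bool(jwt):
        raise ValueError("pass exactly one of service_token or jwt")
    headers = {"Content-Type": "application/json"}
    if jwt:
        headers["Authorization"] = f"Bearer {jwt}"
    else:
        headers["X-API-Key"] = service_token
    # Optional third mode: bring your own model-provider key per request.
    for provider, key in (provider_keys or {}).items():
        headers[PROVIDER_HEADERS[provider]] = key
    return headers
```

A backend would typically call `build_headers(service_token=os.environ["SUPAVECTOR_API_KEY"])` once and reuse the result.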

Create API key (admin)

curl -X POST http://localhost:3000/v1/admin/service-tokens \
  -H "Authorization: Bearer YOUR_ADMIN_JWT" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "ci-pipeline",
    "principalId": "svc:ci",
    "roles": ["indexer"],
    "expiresAt": "2026-12-31T00:00:00Z"
  }'

Requires role admin (or a token whose principal matches the tenant).

Tenant settings

Admins can set per-tenant login policy: sso_only, sso_plus_password, or password_only. sso_only disables password login; password_only disables SSO.

You can also restrict which SSO providers are allowed by supplying ssoProviders (google, azure, okta).

The same admin endpoint also stores tenant-level generation defaults for ask, boolean_ask, reflection, and compaction under models. The embedding model is instance-wide and is changed in the self-hosted env or with supavector changemodel.

Get tenant settings (admin)

curl -X GET http://localhost:3000/v1/admin/tenant \
  -H "Authorization: Bearer YOUR_ADMIN_JWT"

Set auth + model defaults (admin)

curl -X PATCH http://localhost:3000/v1/admin/tenant \
  -H "Authorization: Bearer YOUR_ADMIN_JWT" \
  -H "Content-Type: application/json" \
  -d '{
    "authMode": "sso_only",
    "ssoProviders": ["google"],
    "models": {
      "answerProvider": "openai",
      "answerModel": "gpt-4.1",
      "booleanAskProvider": null,
      "booleanAskModel": null,
      "reflectProvider": "openai",
      "reflectModel": "gpt-4o-mini",
      "compactProvider": null,
      "compactModel": null
    }
  }'

Valid authMode values: sso_only, sso_plus_password, password_only. Set a provider/model field to null to clear the tenant override and fall back to the instance default. embedProvider and embedModel remain instance-wide.
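When building this PATCH body from Python, the distinction between omitting a field and clearing it matters: `None` serializes to JSON null, which clears the tenant override. A minimal sketch, with the helper name `tenant_patch_body` and its parameters chosen for illustration only.

```python
import json

VALID_AUTH_MODES = {"sso_only", "sso_plus_password", "password_only"}


def tenant_patch_body(auth_mode=None, set_models=None, clear_models=()):
    """Return the JSON body for PATCH /v1/admin/tenant as a string."""
    body = {}
    if auth_mode is not None:
        if auth_mode not in VALID_AUTH_MODES:
            raise ValueError(f"unknown authMode: {auth_mode}")
        body["authMode"] = auth_mode
    models = dict(set_models or {})
    for field in clear_models:
        # None becomes JSON null: fall back to the instance default.
        models[field] = None
    if models:
        body["models"] = models
    return json.dumps(body, sort_keys=True)
```

Fields that are neither set nor cleared are left out of the body entirely, so existing tenant values stay untouched.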

Security

Built-in controls

Auth required: JWTs or service tokens protect all sensitive endpoints.
Tenant isolation: data is partitioned by tenant and enforced in every query.
Role-based access: reader/indexer/admin control access to privileged actions.
Tenant auth mode: enforce sso_only or password_only per tenant.
Tenant SSO allowlist: restrict SSO to google, azure, or okta.
Visibility + ACL: tenant, private, acl rules apply to search/recall.
Idempotency: write/index/reflect accept Idempotency-Key to prevent double writes.
Audit logs: sensitive actions (auth policy changes, key revokes, deletes) are recorded with actor + request metadata.
Rate limits + lockout: login throttling + failed-login lockout are enabled.
URL ingestion safety: private IP ranges are blocked (SSRF protection).
Prompt injection guard: source sanitization is enabled by default.
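As an illustration of the URL ingestion safety control in the list above, a client can pre-check a URL before calling POST /v1/docs/url. This Python sketch is not Supavector's server-side implementation; it only inspects literal IP addresses and the name localhost, and it does not resolve DNS. The function name `looks_private` is hypothetical.

```python
import ipaddress
from urllib.parse import urlsplit


def looks_private(url: str) -> bool:
    """Return True if the URL's host is clearly a private or local address."""
    host = urlsplit(url).hostname
    if host is None or host == "localhost":
        return True
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # A DNS name: this sketch does not resolve it.
        return False
    return ip.is_private or ip.is_loopback or ip.is_link_local
```

A pre-check like this only saves a round trip; the server-side block remains the actual protection.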

RBAC by endpoint

Endpoint	Role required
GET /v1/health	Public
POST /v1/login	Public
/v1/auth/*	Public (SSO)
GET /v1/stats	Reader+
GET /v1/metrics	Reader+
GET /v1/docs	Reader+
GET /v1/collections	Reader+
GET /v1/search	Reader+
POST /v1/ask	Reader+
POST /v1/boolean_ask	Reader+
POST /v1/memory/recall	Reader+
POST /v1/feedback	Reader+
POST /v1/memory/event	Reader+
POST /v1/docs	Indexer+
POST /v1/docs/url	Indexer+
DELETE /v1/docs/:docId	Indexer+
POST /v1/memory/write	Indexer+
POST /v1/memory/reflect	Indexer+
GET /v1/jobs	Reader+
GET /v1/jobs/:id	Reader+
POST /v1/memory/cleanup	Admin
POST /v1/memory/compact	Admin
DELETE /v1/collections/:collection	Admin
/v1/admin/service-tokens	Admin
/v1/admin/tenant	Admin
/v1/admin/usage	Admin

Reader+ means reader, indexer, or admin. Indexer+ means indexer or admin.
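The "+" notation amounts to an ordered role hierarchy. A minimal Python sketch of that check, assuming exactly the three roles named above; the names `ROLE_RANK` and `is_allowed` are hypothetical.

```python
# Higher rank includes every permission of the lower ranks.
ROLE_RANK = {"reader": 1, "indexer": 2, "admin": 3}


def is_allowed(role: str, required: str) -> bool:
    """Return True if `role` satisfies a `required`+ endpoint."""
    return ROLE_RANK.get(role, 0) >= ROLE_RANK[required]
```

So an indexer token can call POST /v1/ask (Reader+) and POST /v1/docs (Indexer+), but not POST /v1/memory/compact (Admin).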

API Reference

Full OpenAPI schema is available at /openapi.json, with Swagger UI at /docs. All versioned routes live under /v1; legacy routes remain as aliases.

Public ingest APIs expect docId + text JSON payloads; the Playground UI and CLI can extract supported files like .pdf and .docx to text locally first. Use POST /v1/docs/url for URL-based ingestion.

Request-scoped provider-key headers such as X-OpenAI-API-Key, X-Gemini-API-Key, and X-Anthropic-API-Key are supported on sync routes such as docs, search, ask, boolean_ask, memory write, and memory recall. They are currently rejected by async /v1/memory/reflect and /v1/memory/compact.

/v1/docs index + list documents
/v1/docs/url index content from a URL
/v1/docs/:docId delete a document
/v1/search retrieve top-K chunks
/v1/ask RAG answers with citations and controllable response length
/v1/boolean_ask grounded responses constrained to true, false, or invalid
/v1/memory/write durable memory items
/v1/memory/recall filtered recall
/v1/memory/reflect async reflection jobs
/v1/feedback user feedback signals
/v1/memory/event task outcome signals
/v1/metrics Prometheus metrics (per tenant)
/v1/jobs/:id job status
/v1/admin/tenant tenant auth + generation model settings (admin)
/v1/admin/service-tokens issue API keys (admin)

POST /v1/ask request fields

Field	Type	Required	Notes
question	string	Yes	User question to answer from retrieved sources.
k	integer	No	Top-K chunks retrieved before answer generation (default: 5).
docIds	string[]	No	Optional doc filter; only these doc IDs are searched.
collection	string	No	Limit retrieval to one collection. Use collectionScope=all to search all collections.
policy	enum	No	Memory/retrieval policy: amvl (default), ttl, or lru.
answerLength	enum	No	auto (default), short, medium, long.
model	string	No	Optional per-request generation override. If omitted, Supavector uses the tenant default or instance default.

auto adapts answer size to available evidence. The response includes data.answerLength with the effective mode used.
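The field table can be turned into a small payload builder that validates enum values before the request is sent. A minimal Python sketch using the field names from the table; the helper name `build_ask_payload` is hypothetical, and the rule that k must be at least 1 is an assumption.

```python
POLICIES = {"amvl", "ttl", "lru"}
ANSWER_LENGTHS = {"auto", "short", "medium", "long"}


def build_ask_payload(question, k=None, collection=None, doc_ids=None,
                      policy=None, answer_length=None, model=None):
    """Build the JSON body for POST /v1/ask, omitting unset optional fields."""
    if not isinstance(question, str) or not question.strip():
        raise ValueError("question is required")
    if k is not None and (not isinstance(k, int) or k < 1):
        raise ValueError("k must be a positive integer")
    if policy is not None and policy not in POLICIES:
        raise ValueError(f"unknown policy: {policy}")
    if answer_length is not None and answer_length not in ANSWER_LENGTHS:
        raise ValueError(f"unknown answerLength: {answer_length}")
    optional = {
        "k": k, "collection": collection, "docIds": doc_ids,
        "policy": policy, "answerLength": answer_length, "model": model,
    }
    payload = {"question": question}
    # Leave out unset fields so the server applies its own defaults.
    payload.update({name: value for name, value in optional.items() if value is not None})
    return payload
```

The result can be passed directly as `json=` to `requests.post`.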

POST /v1/boolean_ask request fields

Field	Type	Required	Notes
question	string	Yes	Ask a clear true/false question when you expect a grounded binary response.
k	integer	No	Top-K chunks retrieved before classification (default: 5).
docIds	string[]	No	Optional doc filter; only these doc IDs are searched.
collection	string	No	Limit retrieval to one collection. Use collectionScope=all to search all collections.
policy	enum	No	Memory/retrieval policy: amvl (default), ttl, or lru.
model	string	No	Optional per-request generation override for the true/false/invalid classifier.

The response data.answer is always one of true, false, or invalid. The response also includes data.supportingChunks with the chunk text used for the decision.
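Because the answer has three possible values, mapping it straight to a Python bool would silently treat invalid as False. A minimal sketch that keeps the third state as None; it accepts either a string or a JSON boolean, since the wire type of data.answer is not stated here. The function name `to_optional_bool` is hypothetical.

```python
def to_optional_bool(answer):
    """Map data.answer to True, False, or None (for invalid)."""
    if isinstance(answer, bool):
        return answer
    text = str(answer).strip().lower()
    if text == "true":
        return True
    if text == "false":
        return False
    if text == "invalid":
        # The sources did not support a grounded true/false decision.
        return None
    raise ValueError(f"unexpected boolean_ask answer: {answer!r}")
```

Callers should test the result with `is None` rather than truthiness so that invalid and false stay distinct.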

Response format (v1): all /v1 endpoints return a consistent envelope.

Success

{
  "ok": true,
  "data": { ... },
  "meta": {
    "tenantId": "acme",
    "collection": "default",
    "timestamp": "2026-02-12T12:00:00.000Z"
  }
}

Error

{
  "ok": false,
  "error": { "message": "Invalid input", "code": "INVALID_INPUT" },
  "meta": { "tenantId": "acme", "collection": "default", "timestamp": "..." }
}
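Since every /v1 response shares this envelope, a client can unwrap it in one place. A minimal Python sketch based on the two shapes shown above; the names `SupavectorError` and `unwrap` are hypothetical and are not taken from the official SDK.

```python
class SupavectorError(Exception):
    """Carries the error code and message from a failed envelope."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def unwrap(envelope: dict):
    """Return envelope['data'] on success, raise SupavectorError otherwise."""
    if envelope.get("ok"):
        return envelope.get("data")
    error = envelope.get("error") or {}
    raise SupavectorError(error.get("code", "UNKNOWN"), error.get("message", ""))
```

Typical use is `data = unwrap(response.json())`, with callers branching on `exc.code` for cases such as RATE_LIMITED.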

Error codes (common)

Code	Meaning / cause
AUTH_REQUIRED	Missing or malformed auth header.
AUTH_INVALID	Invalid JWT/API key or invalid login credentials.
AUTH_EXPIRED	API key expired.
AUTH_REVOKED	API key revoked.
AUTH_CONFIG	Server auth misconfiguration.
AUTH_LOOKUP_FAILED	Auth lookup failed in the database.
RATE_LIMITED	Too many requests in the current window.
FORBIDDEN	Insufficient role/permissions for the endpoint.
NOT_FOUND	Requested resource does not exist.
INVALID_INPUT	Missing or invalid parameters in the request.
INVALID_DOC_ID	Doc ID failed validation (format/characters).
ACCOUNT_LOCKED	Account locked after repeated failed logins.
ACCOUNT_DISABLED	User account disabled by admin.
SSO_ONLY	Account requires SSO login.
IDEMPOTENCY_KEY_REQUIRED	Missing Idempotency-Key header.
IDEMPOTENCY_KEY_INVALID	Invalid or oversized idempotency key.
IDEMPOTENCY_KEY_REUSED	Idempotency key reused with different payload.
IDEMPOTENCY_IN_PROGRESS	Request with the same key is already running.

Some endpoints also return operation-specific codes like *_FAILED for internal failures.
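One way to avoid IDEMPOTENCY_KEY_REUSED is to derive the key from the payload, so the same body always produces the same key and a changed body produces a new one. A minimal Python sketch of that approach; the key format, the 32-character digest prefix, and the function name `idempotency_key` are assumptions, and the server's limits on key length and characters are not stated here.

```python
import hashlib
import json


def idempotency_key(prefix: str, payload: dict) -> str:
    """Derive a stable Idempotency-Key from a JSON-serializable payload."""
    # Canonical form: sorted keys and no extra whitespace, so dict
    # ordering does not change the resulting key.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"{prefix}-{digest[:32]}"
```

A retry of the same write then reuses its key automatically, while an edited document gets a fresh one.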

AI assistant connections

Connect Supavector documentation to your assistants so they can fetch current integration guidance directly from your docs stack. This includes llms.txt, a built-in MCP endpoint, and quick access patterns for ChatGPT and Claude.

Runtime auth for agents

Use a service token in your runtime agent environment. Let humans use JWTs in the browser UI. If your app already has auth, keep that layer and have the backend call Supavector with SUPAVECTOR_BASE_URL and SUPAVECTOR_API_KEY.

Instruction for local coding agents

If an assistant is operating on a local computer that already has this repo, it should treat the Supavector CLI as the default local control surface.

1. Run supavector doctor
2. If Supavector is not configured locally, run supavector onboard
3. Prefer supavector write/search/ask/boolean_ask over raw bootstrap commands
4. Use --collection explicitly when scope matters
5. For folder ingest, supavector write --folder ./name uses the folder name as the default collection
6. Do not ask for token re-entry when local CLI config already exists

The same instruction is also exposed through llms.txt and a repo-level AGENTS.md so local assistants have a machine-readable path to follow.

Quick access URLs

Quick access links are provided for the documentation page, the API reference, llms.txt, and the MCP server endpoint.

Use our MCP server

The Supavector docs MCP server is available at (loading).

Once connected, your assistant can search Supavector documentation in real time for API usage, authentication setup, memory policy modes (amvl, ttl, lru), lifecycle behavior, and implementation patterns.

Connect with Claude Code

(loading command)

Project (local) scope: adds the MCP server only for the current working directory.

(loading command)

Connect with Claude Desktop

Open Claude Desktop.
Go to Settings > Connectors.
Add this MCP server URL: (loading).

Other clients

Connection commands and configs are also provided for Codex CLI, Cursor, VS Code, and Antigravity, along with a quick prompt for ChatGPT and Claude.

SDKs

Official SDK: sdk/node (Node.js). It supports JWT or API key auth, idempotency headers, and memory APIs. For apps and agents, the recommended input is SUPAVECTOR_BASE_URL plus SUPAVECTOR_API_KEY. Additional SDKs can be added using the OpenAPI schema.

Examples

Examples below assume you already ran bootstrap and have a service token.

cURL:

Playground upload (UI)

In the Playground Ingest tab, choose a local file to auto-fill Document Text. Supported: .txt, .md, .json, .csv, .log, .pdf, .docx. Legacy .doc is not supported.

Index text

curl -X POST http://localhost:3000/v1/docs \
  -H "X-API-Key: YOUR_SERVICE_TOKEN" \
  -H "Idempotency-Key: idx-001" \
  -H "Content-Type: application/json" \
  -d '{
    "docId": "swe_notes",
    "collection": "default",
    "text": "Your document text..."
  }'

Search

curl "http://localhost:3000/v1/search?q=write-ahead%20logging&k=5&collection=default" \
  -H "X-API-Key: YOUR_SERVICE_TOKEN"

Ask

curl -X POST http://localhost:3000/v1/ask \
  -H "X-API-Key: YOUR_SERVICE_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{ "question": "How is durability handled?", "k": 5, "answerLength": "medium", "policy": "amvl" }'

Boolean ask

curl -X POST http://localhost:3000/v1/boolean_ask \
  -H "X-API-Key: YOUR_SERVICE_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{ "question": "Is write-ahead logging enabled?", "k": 5, "policy": "amvl" }'

Delete a document

curl -X DELETE "http://localhost:3000/v1/docs/swe_notes?collection=default" \
  -H "X-API-Key: YOUR_SERVICE_TOKEN"

Memory write + recall

curl -X POST http://localhost:3000/v1/memory/write \
  -H "X-API-Key: YOUR_SERVICE_TOKEN" \
  -H "Idempotency-Key: mem-001" \
  -H "Content-Type: application/json" \
  -d '{ "text": "Release shipped on Friday", "type": "semantic", "policy": "amvl", "collection": "default", "agentId": "agent:release-bot", "tags": ["release", "ops"], "importanceHint": 0.6, "pinned": false }'

curl -X POST http://localhost:3000/v1/memory/recall \
  -H "X-API-Key: YOUR_SERVICE_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{ "query": "release day", "policy": "amvl", "types": ["semantic"], "tags": ["release"], "agentId": "agent:release-bot", "k": 5 }'

Memory feedback

curl -X POST http://localhost:3000/v1/feedback \
  -H "X-API-Key: YOUR_SERVICE_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{ "memoryId": "mem_123", "feedback": "positive", "eventValue": 0.8 }'

Task outcome event

curl -X POST http://localhost:3000/v1/memory/event \
  -H "X-API-Key: YOUR_SERVICE_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{ "memoryId": "mem_123", "eventType": "task_success", "eventValue": 0.9 }'

Recall filters support types, time range (since/until), tags, agentId, and collection.

Supported memory types: artifact, semantic, procedural, episodic, conversation, summary.

Optional memory policy values are amvl, ttl, and lru. If omitted, Supavector uses amvl.

Memory types

Type	Use case
artifact	Source documents or raw inputs.
semantic	Facts or knowledge extracted from artifacts.
procedural	How-to steps or workflows.
episodic	Events with time/context.
conversation	Dialogue snippets or chat history.
summary	Compressed rollups or compactions.

Client examples (Python + Node.js)

Install + env

pip install requests
export SUPAVECTOR_BASE_URL="http://localhost:3000"
export SUPAVECTOR_API_KEY="YOUR_SERVICE_TOKEN"

End-to-end flow

import os
import time
import requests

BASE = os.getenv("SUPAVECTOR_BASE_URL") or os.getenv("SUPAVECTOR_URL", "http://localhost:3000")
API_KEY = os.getenv("SUPAVECTOR_API_KEY")

headers = {
  "Content-Type": "application/json",
  "X-API-Key": API_KEY
}

# 1) Index
doc = {
  "docId": "swe_notes",
  "collection": "default",
  "text": "WAL keeps vectors durable across restarts."
}
res = requests.post(f"{BASE}/v1/docs", headers={**headers, "Idempotency-Key": "idx-001"}, json=doc)
res.raise_for_status()
print(res.json())

# 2) Search
res = requests.get(f"{BASE}/v1/search", headers=headers, params={
  "q": "durability",
  "k": 5,
  "collection": "default"
})
res.raise_for_status()
print(res.json())

# 3) Ask
res = requests.post(f"{BASE}/v1/ask", headers=headers, json={
  "question": "How do we persist vectors?",
  "k": 5,
  "answerLength": "short",
  "policy": "amvl"
})
res.raise_for_status()
print(res.json())

# 4) Boolean ask
res = requests.post(f"{BASE}/v1/boolean_ask", headers=headers, json={
  "question": "Does WAL keep vectors durable across restarts?",
  "k": 5,
  "policy": "amvl"
})
res.raise_for_status()
print(res.json())

# 5) Memory write
res = requests.post(f"{BASE}/v1/memory/write", headers={**headers, "Idempotency-Key": "mem-001"}, json={
  "type": "semantic",
  "policy": "amvl",
  "collection": "default",
  "text": "Vector WAL is enabled in production.",
  "agentId": "agent:ops-bot",
  "tags": ["infra", "wal"],
  "importanceHint": 0.6,
  "pinned": False
})
res.raise_for_status()

# 6) Recall
res = requests.post(f"{BASE}/v1/memory/recall", headers=headers, json={
  "query": "WAL enabled",
  "policy": "amvl",
  "types": ["semantic"],
  "tags": ["infra"],
  "k": 5
})
res.raise_for_status()
print(res.json())

# 6b) Feedback
res = requests.post(f"{BASE}/v1/feedback", headers=headers, json={
  "memoryId": "mem_123",
  "feedback": "positive",
  "eventValue": 0.8
})
res.raise_for_status()

# 7) Reflect (async job)
# You can also reflect from a conversation memory item via "conversationId".
res = requests.post(f"{BASE}/v1/memory/reflect", headers={**headers, "Idempotency-Key": "reflect-001"}, json={
  "docId": "swe_notes",
  "policy": "amvl",
  "types": ["semantic", "summary"],
  "collection": "default"
})
res.raise_for_status()
job_id = res.json()["data"]["job"]["id"]

# 8) Poll job until it reaches a terminal state
while True:
  res = requests.get(f"{BASE}/v1/jobs/{job_id}", headers=headers)
  res.raise_for_status()  # surface HTTP errors instead of failing on a missing key
  job = res.json()
  status = job["data"]["job"]["status"]
  if status in ("succeeded", "failed"):
    print(job)
    break
  time.sleep(2)

Collections + ACL visibility

# Restrict a document to an ACL list (inside the tenant)
res = requests.post(f"{BASE}/v1/docs", headers={**headers, "Idempotency-Key": "idx-002"}, json={
  "docId": "private_notes",
  "collection": "finance",
  "text": "Confidential budget details...",
  "visibility": "acl",
  "acl": ["user:alice", "user:bob"]
})
res.raise_for_status()

Server-side privileges (no end-user login)

# Requires ALLOW_PRINCIPAL_OVERRIDE=1 and an admin service token
res = requests.post(f"{BASE}/v1/memory/recall", headers=headers, json={
  "query": "policy details",
  "policy": "amvl",
  "collection": "internal",
  "principalId": "user:alice",
  "privileges": ["role:employee", "dept:hr"],
  "types": ["semantic"],
  "k": 5
})
res.raise_for_status()

Install + env (Node 18+)

export SUPAVECTOR_BASE_URL="http://localhost:3000"
export SUPAVECTOR_API_KEY="YOUR_SERVICE_TOKEN"
node app.mjs

End-to-end flow

const BASE = process.env.SUPAVECTOR_BASE_URL || process.env.SUPAVECTOR_URL || "http://localhost:3000";
const API_KEY = process.env.SUPAVECTOR_API_KEY;
const sleep = (ms) => new Promise((r) => setTimeout(r, ms));
const headers = {
  "Content-Type": "application/json",
  "X-API-Key": API_KEY
};

const post = async (path, body, extra = {}) => {
  const res = await fetch(`${BASE}${path}`, {
    method: "POST",
    headers: { ...headers, ...extra },
    body: JSON.stringify(body)
  });
  const data = await res.json();
  if (!res.ok) throw new Error(JSON.stringify(data));
  return data;
};

const get = async (path, params = {}) => {
  const url = new URL(`${BASE}${path}`);
  Object.entries(params).forEach(([k, v]) => url.searchParams.set(k, v));
  const res = await fetch(url, { headers });
  const data = await res.json();
  if (!res.ok) throw new Error(JSON.stringify(data));
  return data;
};

// 1) Index
await post("/v1/docs", {
  docId: "swe_notes",
  collection: "default",
  text: "WAL keeps vectors durable across restarts."
}, { "Idempotency-Key": "idx-001" });

// 2) Search
console.log(await get("/v1/search", { q: "durability", k: "5", collection: "default" }));

// 3) Ask
console.log(await post("/v1/ask", { question: "How do we persist vectors?", k: 5, answerLength: "long", policy: "amvl" }));

// 4) Boolean ask
console.log(await post("/v1/boolean_ask", { question: "Do we persist vectors with WAL?", k: 5, policy: "amvl" }));

// 5) Memory write
await post("/v1/memory/write", {
  type: "semantic",
  policy: "amvl",
  collection: "default",
  text: "Vector WAL is enabled in production.",
  agentId: "agent:ops-bot",
  tags: ["infra", "wal"],
  importanceHint: 0.6,
  pinned: false
}, { "Idempotency-Key": "mem-001" });

// 6) Recall
console.log(await post("/v1/memory/recall", { query: "WAL enabled", policy: "amvl", types: ["semantic"], tags: ["infra"], k: 5 }));

// 6b) Feedback
await post("/v1/feedback", { memoryId: "mem_123", feedback: "positive", eventValue: 0.8 });

// 7) Reflect + job polling
const reflect = await post("/v1/memory/reflect", {
  docId: "swe_notes",
  policy: "amvl",
  types: ["semantic", "summary"],
  collection: "default"
}, { "Idempotency-Key": "reflect-001" });
// You can also pass conversationId to reflect from a conversation memory item.

const jobId = reflect.data.job.id;
while (true) {
  const job = await get(`/v1/jobs/${jobId}`);
  const status = job.data.job.status;
  if (status === "succeeded" || status === "failed") {
    console.log(job);
    break;
  }
  await sleep(2000);
}

Collections + ACL visibility

await post("/v1/docs", {
  docId: "private_notes",
  collection: "finance",
  text: "Confidential budget details...",
  visibility: "acl",
  acl: ["user:alice", "user:bob"]
}, { "Idempotency-Key": "idx-002" });

Server-side privileges (no end-user login)

await post("/v1/memory/recall", {
  query: "policy details",
  policy: "amvl",
  collection: "internal",
  principalId: "user:alice",
  privileges: ["role:employee", "dept:hr"],
  types: ["semantic"],
  k: 5
});

Visibility can be tenant, private, or acl. For ACL, include a list of allowed principals.

Visibility without end-user login

If your app does not want users to log in directly, you can still enforce visibility by having your backend call Supavector with a service token and pass principalId and/or privileges in the payload. The server matches these against the item's visibility and ACL list. Enable this by setting ALLOW_PRINCIPAL_OVERRIDE=1 and using an admin service token. Never expose this token to the browser.

This is the recommended pattern when your product already has its own user auth. End users and runtime agents do not need separate Supavector logins if your backend is the caller of record.
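
To make the matching rule concrete, here is a minimal sketch of how a visibility check of this kind could work. It is an illustrative model, not Supavector's actual server code. The can_read function, the ownerId field, the treatment of private as owner-only, and the default of tenant when no visibility is set are all assumptions made for the example.

```python
from typing import Iterable, Optional


def can_read(item: dict, principal_id: Optional[str], privileges: Iterable[str] = ()) -> bool:
    """Illustrative check for a caller that is already inside the item's tenant."""
    visibility = item.get("visibility", "tenant")  # assumed default
    if visibility == "tenant":
        return True  # every caller in the tenant can read it
    if visibility == "private":
        # Assumption: private items are readable only by the principal that wrote them.
        return principal_id is not None and principal_id == item.get("ownerId")
    if visibility == "acl":
        # The caller matches if its principal or any of its privileges is on the ACL.
        identities = set(privileges)
        if principal_id is not None:
            identities.add(principal_id)
        return bool(identities & set(item.get("acl", [])))
    raise ValueError(f"unknown visibility: {visibility!r}")


hr_policy = {"visibility": "acl", "acl": ["user:alice", "dept:hr"]}
print(can_read(hr_policy, "user:bob", ["role:employee", "dept:hr"]))  # matched through dept:hr
print(can_read(hr_policy, "user:carol", ["role:employee"]))           # no ACL entry matches
```

In this model a backend only has to pass the right principalId and privileges for each request; the decision itself stays on the server.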

Write with ACL (server-side principal)

curl -X POST http://localhost:3000/v1/docs \
  -H "X-API-Key: ADMIN_SERVICE_TOKEN" \
  -H "Idempotency-Key: idx-003" \
  -H "Content-Type: application/json" \
  -d '{
    "docId": "hr_policy",
    "collection": "internal",
    "text": "Confidential HR policy...",
    "principalId": "user:alice",
    "privileges": ["role:employee", "dept:hr"],
    "visibility": "acl",
    "acl": ["user:alice", "dept:hr"]
  }'

Recall as a principal (server-side)

curl -X POST http://localhost:3000/v1/memory/recall \
  -H "X-API-Key: ADMIN_SERVICE_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "policy details",
    "policy": "amvl",
    "collection": "internal",
    "principalId": "user:alice",
    "privileges": ["role:employee", "dept:hr"],
    "types": ["semantic"],
    "k": 5
  }'

If you use JWTs, principalId is derived from the token and should not be provided in the payload. privileges is only honored for admin service tokens when ALLOW_PRINCIPAL_OVERRIDE=1.
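
A backend that supports both auth styles can apply this rule in one place before sending a request. The sketch below is a hypothetical helper, not a Supavector API: the auth_mode labels and the with_principal name are invented for illustration, and the helper only mirrors the rule stated above.

```python
from typing import Optional, Sequence


def with_principal(
    body: dict,
    auth_mode: str,  # "jwt" or "service_token" (illustrative labels)
    principal_id: Optional[str] = None,
    privileges: Optional[Sequence[str]] = None,
    is_admin_token: bool = False,
    allow_principal_override: bool = False,  # mirrors ALLOW_PRINCIPAL_OVERRIDE=1 on the server
) -> dict:
    """Return a copy of body with principal fields added only when they would be honored."""
    result = dict(body)
    if auth_mode == "jwt":
        # principalId is derived from the token, so it is never sent in the payload.
        return result
    if auth_mode != "service_token":
        raise ValueError(f"unknown auth mode: {auth_mode!r}")
    if not (is_admin_token and allow_principal_override):
        if principal_id or privileges:
            raise PermissionError(
                "principal override needs an admin service token and ALLOW_PRINCIPAL_OVERRIDE=1"
            )
        return result
    if principal_id:
        result["principalId"] = principal_id
    if privileges:
        result["privileges"] = list(privileges)
    return result
```

Failing loudly when the override cannot be honored is a design choice for this sketch; it avoids a recall that silently runs with broader or narrower visibility than the caller intended.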

Architecture

Gateway (Node) handles auth, routing, and orchestration.
Vector store (C++ TCP service) stores embeddings and serves similarity search.
Postgres stores chunks, memory items, links, jobs, and idempotency keys.
Provider adapters handle generation for OpenAI, Gemini, and Anthropic. Embeddings are currently supported through OpenAI and Gemini.
Background jobs handle reflection, summarization, redundancy scoring, and lifecycle tasks.
Expired items are automatically swept and removed (vectors + DB rows), preventing orphan vectors.
Jobs retry with exponential backoff before transitioning to failed.
Job reruns are idempotent (derived memories are replaced, not duplicated).
Structured logs include request_id, tenant_id, and collection.
Prometheus metrics are exposed at /metrics (scoped to the tenant; admin sees all tenants).
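
The retry behavior for background jobs can be pictured with a short sketch. This is a generic exponential-backoff loop, not Supavector's job runner: the base delay, the cap, the attempt count, and both function names are assumptions, since the actual values are not documented here.

```python
import time
from typing import Callable, List, Tuple


def backoff_delays(max_attempts: int, base_seconds: float = 1.0, cap_seconds: float = 60.0) -> List[float]:
    """Wait before each retry: the delay doubles every time and is capped at cap_seconds."""
    return [min(cap_seconds, base_seconds * 2 ** retry) for retry in range(max_attempts - 1)]


def run_with_retries(
    task: Callable[[], object],
    max_attempts: int = 4,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[str, object]:
    """Run task until it succeeds or attempts run out, then report a terminal status."""
    delays = backoff_delays(max_attempts)
    for attempt in range(max_attempts):
        try:
            return "succeeded", task()
        except Exception as exc:  # a real runner would likely retry only transient errors
            if attempt == max_attempts - 1:
                return "failed", exc  # retries exhausted: transition to failed
            sleep(delays[attempt])
    raise ValueError("max_attempts must be at least 1")
```

With the defaults above, a job that keeps failing waits 1, 2, and 4 seconds between its four attempts before it is marked failed.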

Install

The Supavector CLI is the fastest self-hosted path in this repo. It installs a local supavector command, guides the first bootstrap, saves runtime config, and keeps installation and upgrade steps in one place.

What the CLI covers

Install: adds a local supavector wrapper for the current checkout or a cloned checkout.
Onboard: prompts for admin username, admin password, tenant id, provider/model defaults, and whichever provider API keys are needed for the chosen providers.
Operate: controls the bundled Docker stack after install.
Diagnose: shows health, compose status, and local configuration.
Use the API: lets you run write, search, and ask directly from the terminal.

Prerequisites

Node.js 18+ on the host machine
Docker with the Compose plugin for self-hosted use
git if you want the installer to clone or refresh the repo for you
At least one provider API key for normal retrieval and answer quality. OpenAI remains the default quickstart path.

Install from the current checkout

./scripts/install.sh

powershell -ExecutionPolicy Bypass -File .\scripts\install.ps1

One-line remote install

curl -fsSL https://raw.githubusercontent.com/Emmanuel-Bamidele/supavector/main/scripts/install.sh | bash

irm https://raw.githubusercontent.com/Emmanuel-Bamidele/supavector/main/scripts/install.ps1 | iex

Those commands install the CLI from the current main branch. If you want to inspect the installer first, use the local-checkout path instead.

What the installer creates

macOS/Linux wrapper: ~/.supavector/bin/supavector
Windows wrappers: %USERPROFILE%\.supavector\bin\supavector.ps1 and supavector.cmd
A PATH update so new terminals can find supavector

If your current terminal still says supavector: command not found, open a new shell or add the install bin directory to PATH manually.

What it saves

After onboarding, the CLI stores local config in ~/.supavector/config.json. That file includes the project root, base URL, tenant id, and the saved service token so later CLI commands can work without repeating flags every time.

Repository and source code

Supavector is open source. If you want to inspect the implementation, fork it, or follow the installer and deployment files directly, use the GitHub repository.

Container

Onboarding is the command that turns an installed CLI into a working local Supavector container stack. It writes the env file, stages the local CLI config, starts Docker, runs the bootstrap helper, and then verifies readiness before printing the final ready summary.

Default container behavior

The default path uses the bundled Postgres service from docker-compose.yml. If you already manage your own Postgres, use supavector onboard --external-postgres and the CLI will write .env.external-postgres and use docker-compose.external-postgres.yml instead.

That choice only changes which Postgres database this self-hosted Supavector instance uses. It does not turn the instance into a shared Supavector deployment, and its service tokens still belong only to that self-hosted instance.

Bundled Postgres (default)

supavector onboard

This prompts for admin username, admin password, tenant id, gateway port, a numbered provider choice, a numbered model choice for that provider, and whichever provider API keys are needed. It writes .env and uses docker-compose.yml.

For a normal first run, press Enter at Gateway port [3000]: to keep port 3000, and press Enter at Tenant id [default]: to keep tenant default.

External Postgres

supavector onboard --external-postgres

This adds prompts for PGHOST, PGPORT, PGDATABASE, PGUSER, and PGPASSWORD. It writes .env.external-postgres and uses docker-compose.external-postgres.yml.

Non-interactive example

supavector onboard \
  --non-interactive \
  --admin-user admin \
  --admin-password change_me \
  --tenant default \
  --gateway-port 3000 \
  --answer-provider openai \
  --answer-model 1 \
  --openai-key "$OPENAI_API_KEY"

Non-interactive external Postgres example

supavector onboard \
  --external-postgres \
  --non-interactive \
  --admin-user admin \
  --admin-password change_me \
  --tenant default \
  --gateway-port 3000 \
  --openai-key "$OPENAI_API_KEY" \
  --pg-host 127.0.0.1 \
  --pg-port 5432 \
  --pg-database supavector \
  --pg-user supavector \
  --pg-password change_me

Change local self-hosted models later

supavector changemodel

supavector changemodel \
  --answer-provider openai \
  --answer-model 2 \
  --boolean-ask-model inherit \
  --embed-provider openai \
  --embed-model text-embedding-3-large \
  --reflect-provider openai \
  --reflect-model gpt-4o-mini \
  --compact-model inherit \
  --restart

The CLI now asks for a provider first, then shows the numbered model list for that provider. Generation providers are 1 = openai, 2 = gemini, and 3 = anthropic. Embedding providers are 1 = openai and 2 = gemini. When the provider is OpenAI, the familiar numbered model shortcuts still map to gpt-4o, gpt-4.1, gpt-4o-mini, gpt-4.1-mini, gpt-4.1-nano, gpt-5.2, gpt-5-mini, gpt-5-nano, o1, o3, o3-mini, o4-mini, and custom. Use inherit to clear boolean_ask or compaction back to their fallback behavior. The same numbered shortcuts also work with supavector ask --provider ... --model ... and supavector boolean_ask --provider ... --model ....
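
The shortcut behavior can be sketched as a small lookup. This is an illustration only: resolve_openai_model is a hypothetical function, and the assumption that shortcut numbers start at 1 and follow the order of the list above is inferred from the text, not confirmed. GET /v1/models remains the authoritative catalog.

```python
from typing import Optional

# Order copied from the paragraph above; the numbering itself is an assumption.
OPENAI_SHORTCUTS = [
    "gpt-4o", "gpt-4.1", "gpt-4o-mini", "gpt-4.1-mini", "gpt-4.1-nano",
    "gpt-5.2", "gpt-5-mini", "gpt-5-nano", "o1", "o3", "o3-mini", "o4-mini", "custom",
]


def resolve_openai_model(value: str) -> Optional[str]:
    """Map a --model style argument to a model name; None means fall back to the default."""
    value = value.strip()
    if value == "inherit":
        return None  # clear the override and use the fallback behavior
    if value.isdigit():
        index = int(value)
        if not 1 <= index <= len(OPENAI_SHORTCUTS):
            raise ValueError(f"shortcut out of range: {index}")
        return OPENAI_SHORTCUTS[index - 1]
    return value  # already a model name, pass it through unchanged
```

Under these assumptions, --answer-model 1 resolves to gpt-4o and --model o1 is passed through as a name.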

Reasoning-style presets such as o1, o3, o4-mini, and the GPT-5 family are supported. Supavector omits unsupported temperature parameters automatically for those models.
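
If you build provider requests yourself, the same rule can be sketched as follows. The name-based detection here is a heuristic invented for this example and both function names are hypothetical; how Supavector identifies these models internally is not documented in this section.

```python
import re


def is_reasoning_model(model: str) -> bool:
    """Heuristic: the o-series (o1, o3, o3-mini, o4-mini) and GPT-5 family names."""
    name = model.lower()
    return name.startswith("gpt-5") or re.fullmatch(r"o\d+(-mini)?", name) is not None


def generation_params(model: str, temperature: float = 0.2) -> dict:
    """Build request parameters, leaving temperature out where it is unsupported."""
    params = {"model": model}
    if not is_reasoning_model(model):
        params["temperature"] = temperature
    return params
```

Omitting the key entirely, rather than sending a default value, is what keeps the request valid for models that reject the parameter.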

What you get at the end

A running local Supavector stack
A bootstrapped admin user for browser login
A first service token saved for CLI and app use
A saved base URL, usually http://localhost:3000

Local container lifecycle

Command	What it does	Example
supavector start	Starts the configured Docker stack	supavector start --build
supavector stop	Stops the stack	supavector stop --down
supavector status	Shows compose state plus health	supavector status --json
supavector logs	Follows service logs	supavector logs --service gateway --tail 200
supavector doctor	Checks local prerequisites and saved config	supavector doctor --json

Finish setup if the stack is already running

If Docker services are already up but the saved config still shows onboarding as incomplete, finish setup on the running stack with:

supavector bootstrap --username your-username --tenant default

Use the same admin username you entered during onboarding. If you pressed Enter at Tenant id [default]:, keep default here too. This completes setup and stores the first service token for later write, search, ask, and boolean_ask commands. supavector doctor now reports this as a pending bootstrap state instead of just showing a missing config.

Hosting

Once CLI setup is complete, Supavector becomes a running local service your own app, backend, worker, or agent can call. The normal runtime inputs are the saved base URL and service token.

1. Confirm setup is complete

supavector status
supavector config show --show-secrets

If the Docker stack is already up but the saved token is still missing, finish setup on that running stack with supavector bootstrap --username YOUR_ADMIN --tenant default.

2. Put the saved runtime values into your app env

export SUPAVECTOR_BASE_URL="http://localhost:3000"
export SUPAVECTOR_API_KEY="YOUR_SERVICE_TOKEN"
export SUPAVECTOR_COLLECTION="default"

# optional if you want request-scoped provider usage
export OPENAI_API_KEY="YOUR_OPENAI_KEY"
export GEMINI_API_KEY="YOUR_GEMINI_KEY"
export ANTHROPIC_API_KEY="YOUR_ANTHROPIC_KEY"

For a deployed Supavector instead of a local stack, keep the same pattern but set SUPAVECTOR_BASE_URL to your public domain such as https://supavector.example.com.

3. Make the first request from your app or script

curl -X POST "${SUPAVECTOR_BASE_URL}/v1/docs" \
  -H "X-API-Key: ${SUPAVECTOR_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "docId": "welcome",
    "collection": "default",
    "text": "Supavector stores memory for agents."
  }'

curl -X POST "${SUPAVECTOR_BASE_URL}/v1/ask" \
  -H "X-API-Key: ${SUPAVECTOR_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "question": "What does Supavector store?",
    "collection": "default",
    "k": 3
  }'

curl -X POST "${SUPAVECTOR_BASE_URL}/v1/boolean_ask" \
  -H "X-API-Key: ${SUPAVECTOR_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "question": "Does Supavector store memory for agents?",
    "collection": "default",
    "k": 3
  }'

Keep write and read operations on the same collection. If you write to default, run search, ask, and boolean_ask against default too, unless you intentionally want a different scope.

4. Use the Node SDK in a real backend or worker

const { SupavectorClient } = require("@supavector/sdk");

const client = new SupavectorClient({
  baseUrl: process.env.SUPAVECTOR_BASE_URL,
  apiKey: process.env.SUPAVECTOR_API_KEY,
  openAiApiKey: process.env.OPENAI_API_KEY,
  geminiApiKey: process.env.GEMINI_API_KEY || process.env.GEMINI_API,
  anthropicApiKey: process.env.ANTHROPIC_API_KEY
});

async function main() {
  await client.indexText("welcome", "Supavector stores memory for agents.", {
    collection: process.env.SUPAVECTOR_COLLECTION || "default"
  });

  const answer = await client.ask("What does Supavector store?", {
    collection: process.env.SUPAVECTOR_COLLECTION || "default",
    k: 3,
    provider: "openai"
  });

  console.log(answer.data.answer);

  const booleanAsk = await client.booleanAsk("Does Supavector store memory for agents?", {
    collection: process.env.SUPAVECTOR_COLLECTION || "default",
    k: 3,
    provider: "openai"
  });

  console.log(booleanAsk.data.answer);
  console.log(booleanAsk.data.supportingChunks);
}

main().catch(console.error);

For a fuller SDK walkthrough, open sdk/node/README.md.

5. Choose the integration shape

Direct server-to-server: your backend, worker, or agent runtime calls Supavector with SUPAVECTOR_BASE_URL and SUPAVECTOR_API_KEY.
Backend-as-caller: your frontend talks to your backend, and your backend holds the Supavector token privately.
Remote deployment: the same code works against a public Supavector deployment by swapping the base URL.

For production apps, keep the Supavector service token on the server side. Do not ship admin credentials or long-lived service tokens into browser code.
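
The backend-as-caller shape can be sketched as a function that turns a browser request into the upstream Supavector call. This is a minimal sketch under stated assumptions: build_upstream_ask and ALLOWED_FIELDS are hypothetical names, and the choice to let the browser control only question and k is an example policy, not a Supavector requirement.

```python
import json
from typing import Mapping

ALLOWED_FIELDS = ("question", "k")  # the only fields the browser may control in this sketch


def build_upstream_ask(frontend_body: Mapping, env: Mapping) -> dict:
    """Describe the server-to-server POST /v1/ask call for one frontend request."""
    body = {name: frontend_body[name] for name in ALLOWED_FIELDS if name in frontend_body}
    if not isinstance(body.get("question"), str) or not body["question"].strip():
        raise ValueError("question is required")
    # Scope is decided on the server, never taken from the browser.
    body["collection"] = env.get("SUPAVECTOR_COLLECTION", "default")
    return {
        "method": "POST",
        "url": env["SUPAVECTOR_BASE_URL"].rstrip("/") + "/v1/ask",
        "headers": {
            "Content-Type": "application/json",
            "X-API-Key": env["SUPAVECTOR_API_KEY"],  # read from server env, never from the client
        },
        "body": json.dumps(body),
    }
```

In a real handler you would pass os.environ as env and send the result with your HTTP client of choice. The token only ever appears in the outgoing headers, so nothing the browser sends can replace or read it.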

Remote hosting mode

If Supavector is already deployed behind nginx or another public reverse proxy, use the CLI as a remote API client. In this mode, you do not run supavector onboard from the client machine unless that machine is also the self-hosted Supavector server.

Needed on the client: the CLI, a public base URL, and a service token
Not needed on the client: Docker, Compose, local bootstrap, or local Postgres
Main remote commands: write, search, ask, and boolean_ask

1. Verify the live deployment

curl -fsS https://YOUR_DOMAIN/v1/health

2. Export the live runtime values on your own machine

export SUPAVECTOR_BASE_URL="https://YOUR_DOMAIN"
export SUPAVECTOR_API_KEY="YOUR_SERVICE_TOKEN"

If you want Supavector to use your own provider key for supported sync requests, also export OPENAI_API_KEY, GEMINI_API_KEY, or ANTHROPIC_API_KEY.

3. Smoke test the live deployment from the CLI

supavector write \
  --doc-id cli-test \
  --collection cli-smoke \
  --text "Supavector CLI remote test."

supavector search \
  --q "remote test" \
  --collection cli-smoke \
  --k 3

supavector ask \
  --question "What does the CLI test document say?" \
  --collection cli-smoke

supavector boolean_ask \
  --question "Does the CLI test document mention Supavector?" \
  --collection cli-smoke

How you get the service token

Use an existing Supavector service token for the target tenant. If the deployment already exists, either ask the admin for a tenant-scoped token or have the admin create one with POST /v1/admin/service-tokens. If you are the operator, the initial token also comes from the bootstrap step on the server.

Service tokens are deployment-scoped. A token created by your local self-hosted Supavector instance will not work against a different shared or hosted Supavector deployment, and vice versa.

Important rule

Think of the CLI in remote mode as a thin client for the live API. Use onboard, start, stop, and logs for local self-hosted control. Use write, search, ask, and boolean_ask to test or use a live deployment.

Commands

Stack and admin commands

Command	What it does	Example
supavector onboard	First-time setup and bootstrap	supavector onboard --external-postgres
supavector start	Starts the configured Docker stack	supavector start --build
supavector stop	Stops the stack	supavector stop --down
supavector status	Shows compose state plus health	supavector status --json
supavector logs	Follows service logs	supavector logs --service gateway --tail 200
supavector doctor	Checks local prerequisites and saved config	supavector doctor --json
supavector bootstrap	Runs bootstrap again on an already running stack	supavector bootstrap --username your-username --tenant default
supavector uninstall	Removes the managed CLI install, PATH hook, and managed local Docker state	supavector uninstall --yes
supavector config show	Shows the saved local config	supavector config show --show-secrets

Content management commands

Command	What it does	Example
supavector collections list	Lists collections visible to the current token	supavector collections list
supavector collections delete	Deletes a whole collection and all docs inside it	supavector collections delete --collection support --yes
supavector docs list	Lists docs in a collection	supavector docs list --collection support
supavector docs delete	Deletes one doc and its vectors	supavector docs delete --doc-id handbook --collection support --yes
supavector docs replace	Deletes and re-indexes one doc under the same doc ID	supavector docs replace --doc-id handbook --file ./support/handbook.md --collection support --yes
supavector write --replace	Single-doc update flow using the normal write command	supavector write --doc-id handbook --file ./support/handbook.md --collection support --replace --yes
supavector write --folder --sync	Reconciles a collection to exactly match a local folder	supavector write --folder ./support --sync --yes

collections delete requires an admin-capable credential. docs delete, docs replace, and write require an indexer-capable credential.

Inspect collections and docs

supavector collections list
supavector docs list --collection customer-support

Write text directly

supavector write \
  --doc-id welcome \
  --collection local-demo \
  --text "Supavector stores memory for agents."

Write from a file

supavector write \
  --doc-id handbook \
  --collection handbook \
  --file ./docs/handbook.md

Write from stdin

cat ./notes.txt | supavector write --doc-id notes-stdin --collection scratchpad

Write from a URL

supavector write \
  --doc-id site-home \
  --collection website \
  --url https://example.com/knowledge-base

Write a whole folder

supavector write --folder ./customer-support

# The folder name becomes the collection:
supavector search --q "refund policy" --collection customer-support --k 5

Folder ingest accepts text-like files and skips unsupported or binary files. If you want a different collection name, add --collection support-v2.
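
The naming rule and the text-only filter can be approximated as shown below. This is a rough sketch, not the CLI's implementation: both function names are hypothetical, and treating a file as text when it has no NUL bytes and decodes as UTF-8 is an assumed heuristic, since the exact rule the CLI uses is not stated here.

```python
from pathlib import PurePosixPath
from typing import Optional


def collection_for_folder(folder: str, override: Optional[str] = None) -> str:
    """The folder name becomes the collection unless an explicit name is given."""
    if override:
        return override
    name = PurePosixPath(folder.rstrip("/")).name
    if not name:
        raise ValueError(f"cannot derive a collection name from {folder!r}")
    return name


def looks_like_text(data: bytes) -> bool:
    """Assumed heuristic: no NUL bytes and valid UTF-8 means the file is text-like."""
    if b"\x00" in data:
        return False
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```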

Replace a doc after the source content changes

supavector docs replace \
  --doc-id handbook \
  --collection customer-support \
  --file ./customer-support/handbook.md \
  --yes

# Equivalent update path with the normal write command
supavector write \
  --doc-id handbook \
  --collection customer-support \
  --file ./customer-support/handbook.md \
  --replace \
  --yes

Use this when the same logical document changed and you want the existing docId to stay stable.

Sync a folder-backed collection after files are added, edited, or removed

supavector write --folder ./customer-support --sync --yes

--sync makes the collection match the folder exactly. It replaces matching docs and deletes docs that no longer exist in that folder.
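
The reconciliation can be reasoned about as a set comparison between the folder and the collection. The sketch below assumes docs are matched by doc ID; plan_sync is a hypothetical helper, and how the CLI derives doc IDs from file names is not covered here.

```python
from typing import Dict, Iterable, List


def plan_sync(local_doc_ids: Iterable[str], remote_doc_ids: Iterable[str]) -> Dict[str, List[str]]:
    """Work out what a sync must do so the collection matches the folder exactly."""
    local, remote = set(local_doc_ids), set(remote_doc_ids)
    return {
        "write": sorted(local - remote),    # new files that are not indexed yet
        "replace": sorted(local & remote),  # docs present in both places
        "delete": sorted(remote - local),   # docs whose file no longer exists
    }


plan = plan_sync(["handbook", "refunds", "faq"], ["handbook", "refunds", "old-pricing"])
print(plan)
```

Because the delete set is everything the folder no longer contains, running a sync against the wrong folder can remove docs you meant to keep; computing a plan like this first is one way to review the impact.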

Delete docs or collections

supavector docs delete --doc-id handbook --collection customer-support --yes
supavector collections delete --collection customer-support --yes

Search examples

supavector search --q "memory engine" --collection local-demo --k 5
supavector search --q "tenant auth mode" --collection default --json

Ask and boolean_ask examples

supavector ask --question "What does Supavector store?" --collection local-demo --k 3
supavector ask --question "Summarize the auth model." --collection default --policy amvl --answer-length short --model 2
supavector boolean_ask --question "Does Supavector store memory for agents?" --collection local-demo --k 3
supavector boolean_ask --question "Is password login enabled?" --collection default --policy amvl --model o1

--model is optional. If you omit it, Supavector uses the tenant or instance default model. On the CLI you can use the same numbered shortcuts shown in supavector changemodel, and the live preset catalog is available from GET /v1/models.

Useful write flags

supavector write \
  --doc-id roadmap \
  --text "..." \
  --collection product \
  --policy ttl \
  --expires-at 2026-12-31T00:00:00Z \
  --tags planning,q4 \
  --agent-id planner
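
When the expiry should be relative to now, the timestamp can be generated instead of typed by hand. A minimal sketch, assuming the flag accepts UTC timestamps in the same format as the example above; expires_at is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def expires_at(days: int, now: Optional[datetime] = None) -> str:
    """Return a UTC timestamp `days` from now, formatted like 2026-12-31T00:00:00Z."""
    current = now if now is not None else datetime.now(timezone.utc)
    moment = current.astimezone(timezone.utc) + timedelta(days=days)
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")
```

Pass the now argument only in tests or when the expiry should be anchored to a fixed date rather than the current time.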

Maintenance

Update, change models, or remove later

supavector update
supavector changemodel
supavector uninstall

supavector update updates the managed checkout under ~/.supavector. If the remote main branch was force-pushed, the CLI can still recover as long as that managed checkout is clean.

supavector changemodel updates the local self-hosted env file and supports numbered generation-model choices. If the embedding model changes, it also sets REINDEX_ON_START=force for the next restart.

supavector uninstall removes the local wrapper, the saved CLI config, the installer PATH hook, and the managed checkout under ~/.supavector when that checkout exists. For managed local self-hosted installs, it also clears the local Supavector Docker containers and volumes.

Saved config

supavector config show
supavector config show --show-secrets

Common overrides

--project-root to point at a specific Supavector checkout
--base-url to target a different deployment
--api-key to override the saved service token
--tenant and --collection to override saved scope
--openai-key, --gemini-key, and --anthropic-key to send request-scoped provider keys on supported sync routes

Environment variables the CLI understands

SUPAVECTOR_BASE_URL
SUPAVECTOR_URL
SUPAVECTOR_API_KEY
SUPAVECTOR_TOKEN
SUPAVECTOR_OPENAI_API_KEY
OPENAI_API_KEY
SUPAVECTOR_GEMINI_API_KEY
GEMINI_API_KEY
GEMINI_API
SUPAVECTOR_ANTHROPIC_API_KEY
ANTHROPIC_API_KEY
SUPAVECTOR_TENANT_ID
SUPAVECTOR_COLLECTION
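
Your own scripts can honor the same variable names. The sketch below is not the CLI's code: first_env and resolve_settings are hypothetical helpers, and the precedence between aliases (the order they are listed above) is an assumption. Only the SUPAVECTOR_BASE_URL before SUPAVECTOR_URL order is confirmed by the client examples earlier on this page.

```python
import os
from typing import Mapping, Optional, Sequence


def first_env(names: Sequence[str], env: Optional[Mapping[str, str]] = None,
              default: Optional[str] = None) -> Optional[str]:
    """Return the first non-empty variable among names, else the default."""
    source = os.environ if env is None else env
    for name in names:
        value = source.get(name)
        if value:
            return value
    return default


def resolve_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect runtime settings from the environment using the alias groups above."""
    return {
        "base_url": first_env(["SUPAVECTOR_BASE_URL", "SUPAVECTOR_URL"], env, "http://localhost:3000"),
        "api_key": first_env(["SUPAVECTOR_API_KEY", "SUPAVECTOR_TOKEN"], env),
        "tenant": first_env(["SUPAVECTOR_TENANT_ID"], env),
        "collection": first_env(["SUPAVECTOR_COLLECTION"], env, "default"),
        "openai_key": first_env(["SUPAVECTOR_OPENAI_API_KEY", "OPENAI_API_KEY"], env),
        "gemini_key": first_env(["SUPAVECTOR_GEMINI_API_KEY", "GEMINI_API_KEY", "GEMINI_API"], env),
        "anthropic_key": first_env(["SUPAVECTOR_ANTHROPIC_API_KEY", "ANTHROPIC_API_KEY"], env),
    }
```

Calling resolve_settings() with no argument reads os.environ; passing a plain dict makes the lookup easy to test.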

Self-hosted model env keys

OPENAI_API_KEY=
GEMINI_API_KEY=
ANTHROPIC_API_KEY=
ANSWER_PROVIDER=openai
ANSWER_MODEL=gpt-4o
BOOLEAN_ASK_PROVIDER=
BOOLEAN_ASK_MODEL=
EMBED_PROVIDER=openai
EMBED_MODEL=text-embedding-3-large
REFLECT_PROVIDER=openai
REFLECT_MODEL=gpt-4o-mini
COMPACT_PROVIDER=
COMPACT_MODEL=gpt-4o-mini

EMBED_PROVIDER and EMBED_MODEL are instance-wide. Changing either requires a reindex because Supavector stores all vectors in one embedding space. Fresh CLI-managed installs and these example env files pin text-embedding-3-large; older installs should pin the value explicitly before changing it. On startup, Supavector also rebuilds vectors automatically if it detects a vector-count or vector-dimension mismatch for the current embedding model.
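
The startup decision described above can be sketched as a single predicate. This is an illustrative model rather than the gateway's code: needs_rebuild is a hypothetical function, and comparing the stored vector count with the number of chunks expected from Postgres is an assumption about what "vector-count mismatch" means.

```python
def needs_rebuild(
    stored_count: int,
    stored_dim: int,
    expected_count: int,
    expected_dim: int,
    reindex_on_start: str = "",
) -> bool:
    """Decide whether vectors should be rebuilt for the current embedding model."""
    if reindex_on_start.strip().lower() == "force":
        return True  # set by supavector changemodel when the embedding model changes
    # All vectors live in one embedding space, so any mismatch invalidates the index.
    return stored_count != expected_count or stored_dim != expected_dim


# Vectors written with a 1536-dimension model, instance now configured for
# text-embedding-3-large, which produces 3072 dimensions by default.
print(needs_rebuild(stored_count=1200, stored_dim=1536, expected_count=1200, expected_dim=3072))
```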

Generation providers can be openai, gemini, or anthropic. Embedding providers can be openai or gemini. Anthropic is generation-only today. The live provider-aware preset catalog is available from GET /v1/models.

curl http://localhost:3000/v1/models

Common problems

supavector not found: open a new terminal or add ~/.supavector/bin to PATH manually.
Docker missing: install Docker and make sure docker compose version works in the same shell.
No saved config: run supavector onboard first or pass --base-url and --api-key explicitly.
Stack is up but setup is not saved yet: run supavector bootstrap --username your-username --tenant default to finish setup on the running stack.
Gateway health timeout: run supavector logs --service gateway and check the written env file.
Need a fresh token: use supavector bootstrap against the running stack.
Need to update a changed file cleanly: use supavector docs replace or supavector write --replace instead of writing a shorter replacement over the old doc without cleanup.

Examples

Example 1: local bundled stack

./scripts/install.sh
supavector onboard
supavector status
supavector write --doc-id welcome --collection local-demo --text "Supavector stores memory for agents."
supavector ask --question "What does Supavector store?" --collection local-demo
supavector boolean_ask --question "Does Supavector store memory for agents?" --collection local-demo

Example 2: local external Postgres

supavector onboard --external-postgres
supavector doctor
supavector write --doc-id policies --collection compliance --file ./docs/policies.md
supavector search --q "policies" --collection compliance --json

Example 3: folder ingest

supavector write --folder ./customer-support
supavector write --folder ./customer-support --sync --yes
supavector search --q "refund policy" --collection customer-support --k 5
supavector ask --question "What is the refund policy?" --collection customer-support
supavector boolean_ask --question "Does the refund policy mention store credit?" --collection customer-support

Example 4: update one doc after the file changes

supavector docs replace \
  --doc-id handbook \
  --collection customer-support \
  --file ./customer-support/handbook.md \
  --yes

supavector ask \
  --question "Summarize the handbook." \
  --collection customer-support

supavector boolean_ask \
  --question "Does the handbook mention onboarding?" \
  --collection customer-support

Example 5: inspect and debug

supavector status --json
supavector logs --service gateway
supavector doctor --json
supavector config show

Example 6: remote live deployment

export SUPAVECTOR_BASE_URL="https://YOUR_DOMAIN"
export SUPAVECTOR_API_KEY="YOUR_SERVICE_TOKEN"
supavector write --doc-id cli-test --collection cli-smoke --text "Supavector CLI remote test."
supavector search --q "remote test" --collection cli-smoke --k 3
supavector ask --question "What does the CLI test document say?" --collection cli-smoke
supavector boolean_ask --question "Does the CLI test document mention Supavector?" --collection cli-smoke

Choose a setup mode

This tab is the operational setup guide. “Usage mode” means two things together: who runs Supavector, and where your first credentials come from. Pick the mode that matches both.

Before you use this tab

Use CLI Setup if you want Supavector installed, onboarded, and operated from one command-line workflow. Stay on this tab when you want the manual steps by setup mode, including what to clone, what env file to edit, when to run Docker, when not to clone at all, and where the first service token actually comes from.

The same explanation also lives in docs/setup-modes.md in the repository for people reading the markdown docs directly.

What the mode names mean

Supavector Hosted: Supavector runs the infrastructure. You sign up, get a supav_ token from the Dashboard, and call the API. AI generation requires a credit balance topped up from the Dashboard.
Self-host: you clone the repository, run Supavector yourself, and create the first admin and service token yourself.
Shared deployment: Supavector already exists as a service. You do not need to clone the repo just to consume the API.
Backend-as-caller: your own backend holds the Supavector token and makes server-to-server requests on behalf of users.
Human admin: this is the operator path for the UI, tenant settings, SSO setup, and service-token management.

Usage mode name	What it means	Choose this when	First credential step
Supavector Hosted	Supavector runs the infrastructure. You sign up, create a project in the Dashboard, and call the API with a supav_ token.	You do not want to run any server, Docker, or Postgres. You want a working token in under five minutes.	Sign up → Dashboard → New Project → copy your token. Add credit for AI generation.
Self-host: bundled stack	You run Supavector yourself with the bundled Compose stack, including the bundled Postgres service.	You want the fastest path from clone to a working local or single-node Supavector instance.	Clone the repo, start Docker, then run bootstrap_instance.js.
Self-host: BYO Postgres + AI	You run Supavector yourself, but connect it to your own Postgres and secrets setup.	You already have database and infra conventions you want to keep, but still want Supavector inside your environment.	Clone the repo, point Supavector at your Postgres, then run bootstrap_instance.js.
Use an existing Supavector deployment	Supavector is already running somewhere else and you are consuming it as a service.	You only need a base URL and a service token and do not need to operate Supavector yourself.	Ask the admin for credentials, or if you are the admin, sign in and create a token in shared deployment setup.
Existing deployment + your provider key	You still use an existing Supavector deployment, but send your own provider key per request.	You want Supavector to keep its own runtime and data plane, but want your own provider billing/key on supported sync routes.	Get the same base URL and service token as shared mode, then follow shared + your AI key.
Backend-as-caller	Your backend is the only system that talks directly to Supavector; browsers and end users do not hold Supavector credentials.	Your own product already has auth and you want Supavector behind your API/server layer.	First get a service token from a self-host or shared deployment, then store it only on your backend as shown in backend-as-caller setup.
Human admin / browser UI	A person signs in interactively to use the UI, manage tenant settings, or mint service tokens.	You are operating Supavector itself, not wiring a headless app or agent runtime yet.	Log in with /v1/login or SSO, then use human admin setup to mint machine credentials.

Common rule

For apps and agents, the normal runtime credential is a service token. That token comes from either the initial bootstrap_instance.js run or a later POST /v1/admin/service-tokens call by an admin. Username/password is mainly for a human admin or for the one-time bootstrap step.

What changes by mode

The fastest way to avoid setup mistakes is to answer four questions up front:

Are you running Supavector yourself, or is it already running somewhere else?
Will you use the bundled Postgres container, or your own Postgres?
Do you need Supavector to keep its own provider key, or should each request use your provider key?
Will browsers/users talk to Supavector directly, or only through your backend?
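
The four answers map to a mode fairly mechanically. The sketch below is illustrative only: chooseSetupMode and its field names are hypothetical and not part of Supavector, and the returned strings are simply the mode names used in this guide. It does not cover Supavector Hosted or the human admin path, which the four questions do not distinguish.

```javascript
// Hypothetical helper: maps the four setup questions to a mode name from this guide.
// Not part of Supavector; the field names are assumptions made for this sketch.
function chooseSetupMode({ selfHosted, ownPostgres, ownProviderKey, viaBackendOnly }) {
  let mode;
  if (selfHosted) {
    // Questions 1 and 2: who runs Supavector, and which Postgres it uses.
    mode = ownPostgres ? "Self-host: BYO Postgres" : "Self-host: bundled stack";
  } else {
    // In this sketch, question 3 only selects between the two existing-deployment modes.
    mode = ownProviderKey
      ? "Existing deployment + your provider key"
      : "Use an existing deployment";
  }
  // Question 4: backend-as-caller layers on top of whichever deployment you use.
  return { mode, backendAsCaller: Boolean(viaBackendOnly) };
}

console.log(chooseSetupMode({ selfHosted: true, ownPostgres: false, viaBackendOnly: true }));
```

Returning the backend-as-caller answer as a separate flag reflects that it is a credential-placement decision rather than a different deployment.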

Mode boundary table

Mode	Clone repo?	Run Docker?	Edit env file?	Where the first service token comes from
Supavector Hosted	No	No	No	Dashboard sign-up → create project → token shown once
Self-host: bundled stack	Yes	Yes	.env	Your own bootstrap_instance.js run
Self-host: BYO Postgres	Yes	Yes	.env.external-postgres	Your own bootstrap_instance.js run
Use an existing deployment	No, unless you are also the operator	No on the client machine	No Supavector server env on the client machine	A Supavector admin gives it to you, or creates it in the UI/API
Existing deployment + your provider key	No, unless you are also the operator	No on the client machine	No Supavector server env on the client machine	Same service token path as shared deployment
Backend-as-caller	Only if you self-host	Only on the Supavector server side	Only on the backend / Supavector server side	From whichever self-hosted or shared deployment your backend uses
Human admin	Only if you self-host	Only if you self-host	Only if you self-host	The admin later creates machine tokens for apps and agents

What not to mix

Do not run supavector onboard on a client machine that is only consuming an already running shared deployment.
Do not copy a token from one Supavector deployment and expect it to work against another deployment.
Do not assume --external-postgres means “hosted by Supavector.” It is still your own self-hosted Supavector deployment.
Do not give browsers long-lived admin or service tokens if your backend can hold them instead.

Mode-by-mode setup

Each guide below is written as a concrete setup flow. Self-hosted modes start from repository checkout. Shared modes start from existing Supavector credentials and do not require cloning Supavector just to consume the API.

Self-host: bundled stack

Best for

The fastest self-hosted path when you want to go from repository checkout to a working local or single-node Supavector instance with the least infra setup.

You are setting up

Your own Supavector server
The bundled Supavector Postgres container
Your own first admin and your own first service token

You are not setting up

A shared Supavector deployment run by someone else
Your own external Postgres server
A client-only integration with an already running Supavector service

Before you start

Git installed so you can clone the repository locally
Docker and the Compose plugin installed
One or more provider keys such as OPENAI_API_KEY, GEMINI_API_KEY, or ANTHROPIC_API_KEY
No existing Postgres required

1. Decide whether you need a fork

Fork the repo on GitHub if you want your own copy to modify and push to. If you only need a local checkout to run Supavector, you can skip the GitHub fork and clone directly.

2. Clone to your machine

# If you forked the repo on GitHub, clone your fork:
git clone https://github.com/<your-org>/supavector.git
cd supavector

# If you did not fork and just want the current repo directly:
git clone https://github.com/Emmanuel-Bamidele/supavector.git
cd supavector

3. Create your env file

cp .env.example .env
# edit .env and set:
# OPENAI_API_KEY and/or GEMINI_API_KEY and/or ANTHROPIC_API_KEY
# POSTGRES_PASSWORD
# JWT_SECRET
# COOKIE_SECRET

4. Start the stack

docker compose up -d --build
curl -fsS http://localhost:3000/health

5. Bootstrap the first admin and service token

docker compose exec gateway node scripts/bootstrap_instance.js \
  --username admin \
  --password change_me \
  --tenant default \
  --service-token-name app-bootstrap

This is the exact moment the first service token is created. The bootstrap helper creates the admin if needed and prints the base URL plus the token to store in your app or agent env.

6. Save the runtime env for your app or agent

SUPAVECTOR_BASE_URL=http://localhost:3000
SUPAVECTOR_API_KEY=YOUR_SERVICE_TOKEN
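
On the app side, it can help to read these two variables once at startup and fail fast if either is missing. This is a minimal sketch, not a Supavector SDK call: loadSupavectorConfig is a hypothetical helper, and trimming a trailing slash from the base URL is a convenience chosen for this example.

```javascript
// Hypothetical helper: reads the two runtime variables and fails fast if either is missing.
function loadSupavectorConfig(env) {
  const baseUrl = (env.SUPAVECTOR_BASE_URL || "").trim();
  const apiKey = (env.SUPAVECTOR_API_KEY || "").trim();
  if (!baseUrl) throw new Error("SUPAVECTOR_BASE_URL is not set");
  if (!apiKey) throw new Error("SUPAVECTOR_API_KEY is not set");
  // Drop trailing slashes so `${baseUrl}/v1/docs` never contains a double slash.
  return { baseUrl: baseUrl.replace(/\/+$/, ""), apiKey };
}

// In Node you would normally pass process.env; a literal object is used here for illustration.
const config = loadSupavectorConfig({
  SUPAVECTOR_BASE_URL: "http://localhost:3000/",
  SUPAVECTOR_API_KEY: "YOUR_SERVICE_TOKEN",
});
console.log(config.baseUrl);
```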

7. Smoke test

curl -X POST "${SUPAVECTOR_BASE_URL}/v1/docs" \
  -H "X-API-Key: ${SUPAVECTOR_API_KEY}" \
  -H "Idempotency-Key: bundled-001" \
  -H "Content-Type: application/json" \
  -d '{"docId":"hello","text":"Supavector stores memory for agents."}'

curl -X POST "${SUPAVECTOR_BASE_URL}/v1/ask" \
  -H "X-API-Key: ${SUPAVECTOR_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{"question":"What does Supavector store?","k":3}'

curl -X POST "${SUPAVECTOR_BASE_URL}/v1/boolean_ask" \
  -H "X-API-Key: ${SUPAVECTOR_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{"question":"Does Supavector store memory for agents?","k":3}'
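
The same index request can be issued from application code. The sketch below mirrors the first curl call above (same path, headers, and body fields); buildIndexRequest is a hypothetical helper, and it returns plain data so the request can be inspected before it is sent.

```javascript
// Builds the same POST /v1/docs request as the curl smoke test above.
// Returns plain data that can be inspected or passed to fetch(url, init).
function buildIndexRequest(baseUrl, apiKey, doc, idempotencyKey) {
  return {
    url: `${baseUrl}/v1/docs`,
    init: {
      method: "POST",
      headers: {
        "X-API-Key": apiKey,
        "Idempotency-Key": idempotencyKey,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(doc),
    },
  };
}

const indexRequest = buildIndexRequest(
  "http://localhost:3000",
  "YOUR_SERVICE_TOKEN",
  { docId: "hello", text: "Supavector stores memory for agents." },
  "bundled-001"
);
console.log(indexRequest.url);
// To send it (Node 18+ or a browser): const res = await fetch(indexRequest.url, indexRequest.init);
```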

Self-host: BYO Postgres

Best for

Teams that already have Postgres, secrets management, backups, and deployment standards, but still want to self-host Supavector.

You are setting up

Your own Supavector server
Your own Postgres database for Supavector
Your own first admin and your own first service token

You are not setting up

The bundled Supavector Postgres container as your data store
A shared Supavector deployment run by another team
A browser-only login path without server-side service tokens

Before you start

Git installed so you can clone the repository locally
A reachable Postgres instance and credentials for a dedicated Supavector database
One or more provider keys such as OPENAI_API_KEY, GEMINI_API_KEY, or ANTHROPIC_API_KEY
JWT_SECRET and COOKIE_SECRET

1. Decide whether you need a fork

Fork the repo on GitHub if you want your own copy to modify and push to. If you only need to deploy Supavector in your environment, you can clone directly without forking first.

2. Clone to your machine

# If you forked the repo on GitHub, clone your fork:
git clone https://github.com/<your-org>/supavector.git
cd supavector

# If you did not fork and just want the current repo directly:
git clone https://github.com/Emmanuel-Bamidele/supavector.git
cd supavector

3. Prepare a Postgres database

-- Example only. Use your own Postgres workflow if you already manage this elsewhere.
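CREATE USER supavector WITH PASSWORD 'change_me';
-- Making the role the database owner also lets it create tables in the public schema on PostgreSQL 15+.
CREATE DATABASE supavector OWNER supavector;
GRANT ALL PRIVILEGES ON DATABASE supavector TO supavector;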

4. Create the external-Postgres env file

cp .env.external-postgres.example .env.external-postgres
# edit .env.external-postgres and set:
# PGHOST, PGPORT, PGDATABASE, PGUSER, PGPASSWORD
# OPENAI_API_KEY and/or GEMINI_API_KEY and/or ANTHROPIC_API_KEY
# JWT_SECRET
# COOKIE_SECRET

5. Start Supavector without the bundled Postgres service

docker compose -f docker-compose.external-postgres.yml \
  --env-file .env.external-postgres up -d --build

curl -fsS http://localhost:3000/health

6. Bootstrap the first admin and service token

docker compose -f docker-compose.external-postgres.yml \
  --env-file .env.external-postgres exec gateway \
  node scripts/bootstrap_instance.js \
  --username admin \
  --password change_me \
  --tenant default \
  --service-token-name app-bootstrap

As in the bundled flow, this step creates the first service token for the external-Postgres deployment.

7. Save the runtime env for your app or agent

SUPAVECTOR_BASE_URL=http://localhost:3000
SUPAVECTOR_API_KEY=YOUR_SERVICE_TOKEN

8. Smoke test the external-Postgres path

curl -X GET "${SUPAVECTOR_BASE_URL}/v1/search?q=supavector&k=3" \
  -H "X-API-Key: ${SUPAVECTOR_API_KEY}"
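
When the query comes from user input, build the search URL with proper encoding rather than string concatenation. A minimal sketch, using only the q and k parameters shown in the curl call above; buildSearchUrl is a hypothetical helper name.

```javascript
// Builds the GET /v1/search URL; URLSearchParams handles encoding of the query text.
function buildSearchUrl(baseUrl, q, k) {
  const params = new URLSearchParams({ q, k: String(k) });
  return `${baseUrl}/v1/search?${params.toString()}`;
}

console.log(buildSearchUrl("http://localhost:3000", "supavector", 3));
// To send it: fetch(url, { headers: { "X-API-Key": apiKey } })
```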

Supavector Hosted

Best for

Developers who want a working Supavector API token in under five minutes without running any infrastructure. Supavector manages the server, Postgres, and AI provider — you sign up, create a project, and call the API.

You are setting up

A Dashboard account (Google, GitHub, or email sign-in)
One or more projects — each is an isolated Supavector tenant
A credit balance for AI generation

You are not setting up

Any server, Docker container, or Compose file
.env files or bootstrap scripts
Your own Postgres database for Supavector

1. Sign up and create a project

Go to the hosted instance and sign in with Google, GitHub, or email.
Click Dashboard in the nav.
Click + New Project, enter a name, and click Create.
Copy the token shown — it is only displayed once.

Tokens from the hosted service start with supav_. If you close the dialog without copying the token, create a new one from the project's token list and revoke the old one.

2. Add credits

AI generation (/ask, /boolean_ask) requires a positive credit balance. Indexing, search, and memory operations are not credit-gated. Write operations (/v1/docs, /v1/memory/write) do update the storage meter, which is tracked separately and shown in the Dashboard.

In the Dashboard, click + Add Credit in the Credits & Billing card.
Choose a preset amount or enter a custom USD amount.
Complete payment in Stripe Checkout. You are redirected back to the Dashboard with your balance updated.

In test mode, use Stripe card 4242 4242 4242 4242 with any future expiry, any CVC, and any ZIP.

3. Store your runtime env

SUPAVECTOR_BASE_URL=https://YOUR_HOSTED_DOMAIN
SUPAVECTOR_API_KEY=supav_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

4. Make your first requests

# Index a document (no credit needed)
curl -X POST "${SUPAVECTOR_BASE_URL}/v1/docs" \
  -H "Authorization: Bearer ${SUPAVECTOR_API_KEY}" \
  -H "Idempotency-Key: hosted-001" \
  -H "Content-Type: application/json" \
  -d '{"docId":"welcome","collection":"default","text":"Supavector stores memory for agents."}'

# Ask a question (deducts credit)
curl -X POST "${SUPAVECTOR_BASE_URL}/v1/ask" \
  -H "Authorization: Bearer ${SUPAVECTOR_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{"question":"What does Supavector store?","k":5,"policy":"amvl"}'

The hosted instance supplies the AI provider. You do not need to send X-OpenAI-API-Key unless you want to override the server default.

Error codes specific to hosted tokens

HTTP status	code field	Meaning	Fix
402	CREDIT_REQUIRED	Account credit balance is zero. Generation is paused.	Add credit from the Dashboard, then retry.
503	CREDIT_CHECK_FAILED	Transient error while verifying the credit balance. The server fails closed: it blocks the request rather than letting it through unverified.	Retry with exponential backoff. This error does not mean the balance is zero.
401	—	Token is missing, malformed, revoked, or from a different deployment.	Verify the token starts with supav_ and is active in the Dashboard.

Handling CREDIT_REQUIRED in code:

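// BASE, headers, and body are defined by your app. BillingError is an app-level error class.
class BillingError extends Error {}

const res = await fetch(`${BASE}/v1/ask`, { method: "POST", headers, body });
const data = await res.json().catch(() => ({})); // tolerate non-JSON error bodies

if (res.status === 402 && data.code === "CREDIT_REQUIRED") {
  // Prompt the user to top up, or queue the request for retry after top-up
  throw new BillingError("No credits remaining.");
}
if (!res.ok) throw new Error(data.error || "Supavector error");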
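
The 503 CREDIT_CHECK_FAILED case calls for a retry with exponential backoff rather than a top-up prompt. The sketch below is one way to structure that, not a prescribed client: the function names, the 500 ms base delay, the 8 s cap, and the four-attempt limit are all assumptions chosen for illustration.

```javascript
// Decide what to do with an error response, following the error table above.
function classifyHostedError(status, code) {
  if (status === 402 && code === "CREDIT_REQUIRED") return "top_up"; // retrying will not help
  if (status === 503 && code === "CREDIT_CHECK_FAILED") return "retry"; // transient
  if (status === 401) return "check_token";
  return "fail";
}

// Exponential backoff: base * 2^attempt, capped. The base and cap values are assumptions.
function backoffDelayMs(attempt, baseMs = 500, capMs = 8000) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Retries only the transient case. `send` is any async function returning { status, data }.
async function askWithRetry(send, maxAttempts = 4, sleep = defaultSleep) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const { status, data } = await send();
    if (status >= 200 && status < 300) return data;
    const action = classifyHostedError(status, data && data.code);
    if (action !== "retry" || attempt === maxAttempts - 1) {
      throw new Error(`Supavector request failed: ${status} (${action})`);
    }
    await sleep(backoffDelayMs(attempt));
  }
}

console.log([0, 1, 2, 3].map((n) => backoffDelayMs(n)));
```

Keeping the classification separate from the retry loop makes it straightforward to route the "top_up" case to a billing prompt instead of retrying.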

Managing tokens

From the Dashboard, on each project's token list you can:

Create additional tokens — useful for separating production, staging, and CI
Revoke a token — immediately invalidates it with no grace period
See last used time — per-token visibility into which are active

The plain token value is shown only at creation time. Within the same browser session you can still copy it again by clicking Copy on the token row. Once the session has expired, create a new token and revoke the old one.

Hosted vs self-hosted

	Supavector Hosted	Self-Hosted
Run Docker	No	Yes
Manage Postgres	No	Yes
Token comes from	Dashboard sign-up	bootstrap_instance.js
Token prefix	supav_	No fixed prefix
AI generation billing	Credit balance in Dashboard	Your own provider key billed directly
Storage billing	Measured per write, shown in Dashboard	Not tracked
402 / 503 credit errors	Yes — for supav_ tokens with a project	Never

Use an existing Supavector deployment

Best for

Using a Supavector deployment that already exists. You are not hosting Supavector yourself in this mode, and you should think of Supavector like any other internal service you call over HTTP.

You are setting up

A runtime integration that talks to an existing Supavector deployment
Your local app or backend env with SUPAVECTOR_BASE_URL and SUPAVECTOR_API_KEY

You are not setting up

The Supavector server itself
Docker, Compose, Postgres, or Supavector server env files on the client machine
The first service token from scratch unless you are also the admin

Important distinction

You do not fork or clone the Supavector repository for this mode unless you are also the person operating the deployment. In this mode, the only things you need are credentials plus the service URL.

What you need from the Supavector admin

The shared SUPAVECTOR_BASE_URL
A service token scoped to the tenant you should use

How the service token is created in this mode

There are only two normal paths:

An existing Supavector admin creates a token and gives it to you.
You are that admin, so you sign in and create the token yourself.

# If you are the admin of the shared deployment:
curl -X POST "${SUPAVECTOR_BASE_URL}/v1/login" \
  -H "Content-Type: application/json" \
  -d '{"username":"admin","password":"change_me"}'

curl -X POST "${SUPAVECTOR_BASE_URL}/v1/admin/service-tokens" \
  -H "Authorization: Bearer YOUR_ADMIN_JWT" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "shared-app",
    "principalId": "svc:shared-app",
    "roles": ["indexer"]
  }'

If you are not the admin, skip those commands and ask the admin for the base URL plus the token.

1. Store the runtime env

SUPAVECTOR_BASE_URL=https://supavector.example.com
SUPAVECTOR_API_KEY=YOUR_SERVICE_TOKEN

2. Verify the deployment is reachable

curl -fsS "${SUPAVECTOR_BASE_URL}/v1/health"

3. Make your first authenticated request

curl -X POST "${SUPAVECTOR_BASE_URL}/v1/docs" \
  -H "X-API-Key: ${SUPAVECTOR_API_KEY}" \
  -H "Idempotency-Key: shared-001" \
  -H "Content-Type: application/json" \
  -d '{"docId":"welcome","text":"Supavector stores memory for agents."}'

4. Use the same token from your app or agent

Your runtime only needs SUPAVECTOR_BASE_URL plus SUPAVECTOR_API_KEY. Do not make every end user log in directly unless you intentionally want human sessions in Supavector.

Existing deployment + your provider key

Best for

Keeping Supavector on the shared deployment while choosing your own provider key on a request-by-request basis.

What changes in this mode

You still use the same SUPAVECTOR_BASE_URL and Supavector service token as shared mode.
You add a request-scoped provider-key header on supported sync requests.
You may also set provider in ask or boolean_ask request bodies.

What does not change

Supavector still owns the shared deployment, Postgres, and auth state.
Your provider key is not persisted into Supavector server-side config just by sending the header.
The service token still comes from the shared deployment admin path.

Credential path

This mode does not change how you get Supavector credentials. First complete the shared deployment path to get SUPAVECTOR_BASE_URL plus a service token. This mode only adds a request-scoped provider-key header on top of that.

What you need

SUPAVECTOR_BASE_URL for the shared deployment
A Supavector service token or JWT
Your own provider key: OPENAI_API_KEY, GEMINI_API_KEY, or ANTHROPIC_API_KEY

1. Store your runtime env

SUPAVECTOR_BASE_URL=https://supavector.example.com
SUPAVECTOR_API_KEY=YOUR_SERVICE_TOKEN
OPENAI_API_KEY=YOUR_OPENAI_KEY
GEMINI_API_KEY=YOUR_GEMINI_KEY
ANTHROPIC_API_KEY=YOUR_ANTHROPIC_KEY

2. Send both headers on supported sync requests

curl -X POST "${SUPAVECTOR_BASE_URL}/v1/ask" \
  -H "X-API-Key: ${SUPAVECTOR_API_KEY}" \
  -H "X-Gemini-API-Key: ${GEMINI_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{"question":"What does Supavector store?","k":3,"policy":"amvl","provider":"gemini","model":"gemini-2.5-flash"}'

curl -X POST "${SUPAVECTOR_BASE_URL}/v1/boolean_ask" \
  -H "X-API-Key: ${SUPAVECTOR_API_KEY}" \
  -H "X-Anthropic-API-Key: ${ANTHROPIC_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{"question":"Does Supavector store memory for agents?","k":3,"policy":"amvl","provider":"anthropic","model":"claude-sonnet-4-20250514"}'

Supported today

POST /v1/docs
POST /v1/docs/url
GET /v1/search
POST /v1/ask
POST /v1/boolean_ask
POST /v1/memory/write
POST /v1/memory/recall

Important note

For ask and boolean_ask, the JSON body can override the generation provider with provider. For docs, search, memory write, and memory recall, the embedding provider remains instance-wide today, so request-scoped provider-key headers only override credentials for the provider that the instance is already configured to use.

Current limitation

/v1/memory/reflect and /v1/memory/compact reject request-scoped provider-key headers because those jobs continue asynchronously after the request returns.
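
A client can encode these rules so that a provider-key header is never sent to a route that rejects it. The sketch below uses the three header names from this guide; buildHeaders and the two lookup tables are hypothetical client-side helpers, not part of Supavector.

```javascript
// Header names as documented for request-scoped provider keys.
const PROVIDER_KEY_HEADERS = {
  openai: "X-OpenAI-API-Key",
  gemini: "X-Gemini-API-Key",
  anthropic: "X-Anthropic-API-Key",
};

// Routes documented as rejecting request-scoped provider-key headers.
const ASYNC_ROUTES = ["/v1/memory/reflect", "/v1/memory/compact"];

// Hypothetical helper: returns headers for one request, optionally with a provider key.
function buildHeaders(route, apiKey, provider, providerKey) {
  const headers = { "X-API-Key": apiKey, "Content-Type": "application/json" };
  if (!provider) return headers; // no override: the deployment's own key is used
  if (ASYNC_ROUTES.includes(route)) {
    throw new Error(`${route} does not accept request-scoped provider keys`);
  }
  if (!Object.hasOwn(PROVIDER_KEY_HEADERS, provider)) {
    throw new Error(`Unknown provider: ${provider}`);
  }
  headers[PROVIDER_KEY_HEADERS[provider]] = providerKey;
  return headers;
}

console.log(buildHeaders("/v1/ask", "YOUR_SERVICE_TOKEN", "gemini", "YOUR_GEMINI_KEY"));
```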

Backend-as-caller

Best for

Products that already have their own auth and want Supavector to stay behind their server or worker layer.

You are setting up

Your backend as the only machine that talks directly to Supavector
Server-side storage for Supavector base URL, service token, and optional provider keys

You are not setting up

Supavector credentials in browser code
A separate credential type beyond the normal Supavector service token

Architecture rule

The browser or end-user app talks to your backend. Your backend talks to Supavector. The Supavector service token stays server-side.

1. Get or create Supavector credentials

This mode does not create a new credential type. You still need the same Supavector service token as every other machine client. Get it in one of these ways:

If you self-host Supavector, run bootstrap_instance.js during setup.
If Supavector already exists, ask the admin for a token.
If you are the admin, sign in and create a token with POST /v1/admin/service-tokens.

Once you have that token, this backend pattern becomes the steady-state integration model.

2. Store Supavector env only on the server

SUPAVECTOR_BASE_URL=https://supavector.example.com
SUPAVECTOR_API_KEY=YOUR_SERVICE_TOKEN

3. Forward only approved operations

User authenticates to your app.
Your backend decides what Supavector action is allowed.
Your backend calls Supavector with the service token.
Your backend returns the result to the client.
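
The four steps above can be sketched as an allowlist check that runs before anything is forwarded. Everything here is illustrative: planForward, the user.authenticated field, and the choice of allowed operations are assumptions about your backend, not Supavector behavior.

```javascript
// Operations this backend is willing to forward. The contents are an example only.
const ALLOWED_OPERATIONS = new Map([
  ["search", { method: "GET", path: "/v1/search" }],
  ["ask", { method: "POST", path: "/v1/ask" }],
]);

// Hypothetical helper: decides whether a request from an app user may be forwarded.
function planForward(user, operation) {
  // Step 1: the user must already be authenticated to your app.
  if (!user || !user.authenticated) return { ok: false, status: 401 };
  // Step 2: your backend decides which Supavector actions are allowed.
  const target = ALLOWED_OPERATIONS.get(operation);
  if (!target) return { ok: false, status: 403 };
  // Steps 3 and 4 happen in the caller: send the request with the server-side
  // service token, then return the result to the client.
  return { ok: true, method: target.method, path: target.path };
}

console.log(planForward({ authenticated: true }, "ask"));
```

Because the service token is attached only after this check passes, it never needs to leave the server.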

4. Optional user-level visibility

Only use principal override if you intentionally want your backend to project end-user identity into Supavector. Keep that logic server-side and do not expose admin tokens to the browser.

Human admin / browser UI

Best for

Human operators using the browser UI, tenant settings, service-token management, or interactive Playground flows.

What this mode is for

Signing in interactively
Managing tenant settings, SSO, and service tokens
Handing machine credentials to apps and agents after the admin work is done

1. Reach a Supavector deployment

If you are self-hosting, complete either the bundled-stack or BYO-Postgres flow first. If your team already runs Supavector, use the shared deployment URL you were given.

2. Sign in

curl -X POST http://localhost:3000/v1/login \
  -H "Content-Type: application/json" \
  -d '{"username":"admin","password":"change_me"}'

The response contains the admin JWT. Use that JWT for browser/admin operations and for minting additional service tokens.

3. Open the UI

Use the browser UI for Playground, Settings, docs, and tenant operations. A human admin can also mint long-lived service tokens for apps and agents.

4. Create machine credentials for the real runtime

curl -X POST http://localhost:3000/v1/admin/service-tokens \
  -H "Authorization: Bearer YOUR_ADMIN_JWT" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "app-runtime",
    "principalId": "svc:app-runtime",
    "roles": ["indexer"]
  }'

Use the minimum role your runtime actually needs. For many ingestion/search app flows, indexer is enough. Reserve admin for true admin automation.
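
If you mint tokens from a script, validating roles before the call catches typos early. The sketch below builds the same request as the curl command above and checks roles against the three valid values (admin, indexer, reader); buildServiceTokenRequest is a hypothetical helper.

```javascript
// Valid role names for service tokens.
const VALID_ROLES = ["admin", "indexer", "reader"];

// Hypothetical helper: builds the POST /v1/admin/service-tokens request.
function buildServiceTokenRequest(baseUrl, adminJwt, { name, principalId, roles }) {
  const unknown = roles.filter((role) => !VALID_ROLES.includes(role));
  if (unknown.length > 0) throw new Error(`Unknown role(s): ${unknown.join(", ")}`);
  return {
    url: `${baseUrl}/v1/admin/service-tokens`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${adminJwt}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ name, principalId, roles }),
    },
  };
}

const tokenRequest = buildServiceTokenRequest("http://localhost:3000", "YOUR_ADMIN_JWT", {
  name: "app-runtime",
  principalId: "svc:app-runtime",
  roles: ["indexer"],
});
console.log(tokenRequest.url);
```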

5. Hand the service token to the app or agent

Humans usually log in once for admin work. Apps and agents should normally run on service tokens after that.

Credential rules and boundaries

Service token rules

Service tokens are deployment-scoped. A token minted by one Supavector deployment does not authenticate against another deployment.
For apps, backends, workers, and agents, the normal runtime inputs are SUPAVECTOR_BASE_URL and SUPAVECTOR_API_KEY.
Username/password is mainly for the one-time bootstrap path or for human admin login.

Provider-key rules

Self-hosted Supavector can keep provider keys in its own env file.
Shared deployments can also accept request-scoped provider-key headers on supported sync routes.
ask and boolean_ask can override generation provider/model per request.
Embedding provider selection stays instance-wide today. Request-scoped provider headers do not change the embedding provider itself.

Database rules

Bundled Postgres and external Postgres are both still self-hosted Supavector.
--external-postgres changes where Supavector stores relational state. It does not turn the instance into a shared Supavector platform deployment.
Shared deployment users normally do not edit Supavector server env files or Compose files at all.

Common mistakes

Avoid these setup mistakes

Running local bootstrap on the wrong machine: if you are only consuming an existing shared deployment, do not clone the repo just to get credentials unless you are also the operator.
Treating external Postgres as hosted Supavector: using your own Postgres still means you are self-hosting Supavector yourself.
Mixing human login and machine runtime: the browser/admin path is not the normal long-running runtime path for apps and agents.
Expecting provider headers to persist server-side config: provider-key headers apply only to the request that carries them; they do not reconfigure the server default provider.
Putting long-lived tokens in the browser: if you have your own backend, keep the Supavector service token there instead.
Forgetting where the first token comes from: self-hosted flows create it during bootstrap; shared flows receive it from an existing admin path.

Memory Policy Guide

TTL, LRU, and AMVL

Supavector supports three memory policies. They use the same API surface, but they make different tradeoffs about retention and retrieval.

Default

amvl is the default when policy is omitted.

Where To Set It

Pass policy on ask and memory APIs. Use ttlSeconds or expiresAt on writes when you want explicit expiry.

Rule Of Thumb

Use TTL for fixed freshness windows, LRU for cache-like recency, and AMVL for Supavector’s main long-term memory behavior.
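
As a sketch of how these fields might be assembled on the client, the helper below validates the policy name and the expiry fields before they are merged into a request body. retentionFields is hypothetical, and rejecting a write that sets both ttlSeconds and expiresAt is a choice made for this example, not documented server behavior.

```javascript
// The three policy names described in this guide.
const POLICIES = ["ttl", "lru", "amvl"];

// Hypothetical helper: returns the retention-related fields for a request body.
function retentionFields({ policy, ttlSeconds, expiresAt } = {}) {
  const fields = {};
  if (policy !== undefined) {
    if (!POLICIES.includes(policy)) throw new Error(`Unknown policy: ${policy}`);
    fields.policy = policy;
  }
  // When policy is omitted, no field is sent and the default (amvl) applies.
  if (ttlSeconds !== undefined && expiresAt !== undefined) {
    throw new Error("Set ttlSeconds or expiresAt, not both"); // assumption of this sketch
  }
  if (ttlSeconds !== undefined) fields.ttlSeconds = ttlSeconds;
  if (expiresAt !== undefined) fields.expiresAt = expiresAt;
  return fields;
}

console.log(retentionFields({ policy: "ttl", ttlSeconds: 3600 }));
```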

TTL

ttl is the simplest policy: retention is driven by time. Items remain available until their expiry and are removed when the TTL window ends.

Use TTL when

Your data has a clear freshness window.
You want predictable, time-based retention for operational or compliance reasons.
You do not need value-based promotion or recency heuristics.

Important behavior

Set expiry on writes with ttlSeconds or expiresAt.
Expired items are cleaned up automatically.
TTL is easy to reason about, but useful memories can still disappear simply because the clock ran out.

Good default if

You know the retention window in advance, such as session memory, short-lived incident notes, or content that should age out on a fixed schedule.

LRU

lru is the cache-like policy. It favors recent use and is best when you want behavior closer to traditional least-recently-used memory.

Use LRU when

You want a familiar recency-based baseline.
You are benchmarking Supavector against simpler cache behavior.
Your workload is mostly about the current working set, not long-term usefulness.

Important behavior

Recent access matters more than contribution or long-term value.
This is useful for short-horizon workloads, but older high-value memories can lose priority if they stop getting touched.
Choose this when you want simpler recency behavior, not Supavector’s value-driven memory policy.
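
For readers unfamiliar with least-recently-used behavior, the textbook version looks like the sketch below. This is a generic in-memory illustration of the idea, not Supavector's implementation; the class name and capacity parameter are invented for the example.

```javascript
// Generic LRU cache: a Map keeps insertion order, so the first key is the least recently used.
class LruCache {
  constructor(capacity) {
    this.capacity = capacity;
    this.items = new Map();
  }

  get(key) {
    if (!this.items.has(key)) return undefined;
    const value = this.items.get(key);
    // Re-insert to mark this key as the most recently used.
    this.items.delete(key);
    this.items.set(key, value);
    return value;
  }

  set(key, value) {
    this.items.delete(key);
    this.items.set(key, value);
    if (this.items.size > this.capacity) {
      // Evict the least recently used entry, regardless of how valuable it was.
      const oldestKey = this.items.keys().next().value;
      this.items.delete(oldestKey);
    }
  }
}

const cache = new LruCache(2);
cache.set("a", 1);
cache.set("b", 2);
cache.get("a"); // touching "a" makes "b" the eviction candidate
cache.set("c", 3);
console.log([...cache.items.keys()]);
```

The eviction step shows the tradeoff described above: an entry is dropped purely because it was not touched recently.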

Good default if

You want a straightforward cache-style policy for ephemeral or highly interactive workloads where most value comes from what was used recently.

AMVL

amvl is Supavector’s default and recommended policy. It is designed for longer-lived memory where retention and retrieval should reflect usefulness, not just age or recent access.

Use AMVL when

You want Supavector’s main intended behavior.
You want retention and retrieval influenced by access, contribution, and lifecycle signals.
You want better long-term usefulness than plain TTL or LRU can offer.

Important behavior

amvl is the default when you omit policy.
It uses value and lifecycle signals to keep useful memory easier to retrieve over time.
It is the best starting point unless you have a specific reason to force fixed expiry or cache-style recency.

Good default if

You are building a long-running agent, assistant, or knowledge workflow and want Supavector to optimize for useful memory rather than only age or last-touch time.

Service token

Stored in localStorage — never sent to any server except your own Supavector instance. Token is masked.

Provider key overrides (optional)

OpenAI

Sent as X-OpenAI-API-Key.

Gemini

Sent as X-Gemini-API-Key.

Anthropic

Sent as X-Anthropic-API-Key.

Admin login (username + password)

Username

Password

Create API key (admin)

Admins can issue service tokens for apps and agents. The bootstrap helper can mint the first one for you.

Advanced options

Principal ID (optional)

Defaults to your token subject if omitted.

Roles (comma separated)

Valid roles: admin, indexer, reader.

Expires at

New API key (masked)

(not created)

Tenant ID

Tenant name

Auth mode

Admins can enforce SSO-only or password-only login per tenant.

Allowed SSO providers

Uncheck all to disable SSO for this tenant.

Google Azure Okta

Ask provider

Optional tenant override for the ask provider.

Ask model

Optional tenant override for ask. Leave blank to inherit the instance default. The live preset catalog is also available from GET /v1/models.

Boolean ask provider

Optional tenant override for the boolean_ask provider.

Boolean ask model

Optional tenant override for boolean_ask. Leave blank to follow the tenant ask model.

Reflect provider

Optional tenant override for reflection jobs.

Reflect model

Optional tenant override for the reflection model.

Compact provider

Optional tenant override for compaction jobs.

Compact model

Optional tenant override for compaction jobs. Leave blank to follow the tenant reflect model.

Embedding provider

Instance-wide setting. Change it in the self-hosted env or with supavector changemodel.

Embedding model

Instance-wide setting. Change it in the self-hosted env or with supavector changemodel. Changing the embedding model requires a reindex on restart.