Documentation

Everything you need to integrate SpendexAI into your application.

Quickstart

SpendexAI is an OpenAI-compatible proxy. Change two lines and start saving immediately.

Python (OpenAI SDK)

from openai import OpenAI

client = OpenAI(
    api_key="spx_sk_live_...",
    base_url="https://api.spendexai.com/v1"
)

response = client.chat.completions.create(
    model="auto",  # Smart routing picks the optimal model
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)

Node.js (OpenAI SDK)

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "spx_sk_live_...",
  baseURL: "https://api.spendexai.com/v1"
});

const response = await client.chat.completions.create({
  model: "auto",
  messages: [{ role: "user", content: "Hello!" }]
});
console.log(response.choices[0].message.content);

Swift

let config = OpenAI.Configuration(
    apiKey: "spx_sk_live_...",
    baseURL: "https://api.spendexai.com/v1"
)

let client = OpenAI(configuration: config)
let response = try await client.chat.completions.create(
    model: "auto",
    messages: [.user("Hello!")]
)
print(response.choices[0].message.content)

cURL

curl https://api.spendexai.com/v1/chat/completions \
  -H "Authorization: Bearer spx_sk_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Authentication

All API requests require an API key prefixed with spx_. Generate keys from your dashboard.

Pass your key via the Authorization header:

Authorization: Bearer spx_sk_live_...

You can create multiple API keys for different environments and teams.
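As a sketch, the header can be assembled in plain Python before making any HTTP call. Note that the SPENDEX_API_KEY environment variable name is our own convention here, not something the API requires:

```python
import os

# Read the key from the environment rather than hardcoding it.
# (SPENDEX_API_KEY is an illustrative variable name, not required by the API.)
api_key = os.environ.get("SPENDEX_API_KEY", "spx_sk_live_...")

headers = {
    "Authorization": f"Bearer {api_key}",
    "Content-Type": "application/json",
}
```

Keeping the key in the environment makes it easy to use different keys per environment without touching code.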

Chat Completions

POST /v1/chat/completions

Fully compatible with the OpenAI Chat Completions API. Accepts the same request body and returns the same response format.

Base URL: https://api.spendexai.com

Request body

| Parameter   | Type           | Description |
|-------------|----------------|-------------|
| model       | string         | "auto" for smart routing, or a specific model name |
| messages    | array          | Array of message objects with role and content |
| temperature | number         | Sampling temperature (0–2). Default: 1 |
| max_tokens  | integer        | Maximum tokens to generate |
| stream      | boolean        | Enable streaming responses. Default: false |
| top_p       | number         | Nucleus sampling parameter (0–1) |
| stop        | string \| array | Stop sequences |
| tools       | array          | Tool/function definitions for function calling |
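Putting several of the parameters above together, a full request body might look like this (the values are illustrative, not recommendations):

```python
import json

# Illustrative request body using the parameters documented above.
payload = {
    "model": "auto",  # smart routing
    "messages": [{"role": "user", "content": "Summarize this in one line."}],
    "temperature": 0.7,
    "max_tokens": 256,
    "top_p": 0.9,
    "stop": ["\n\n"],
    "stream": False,
}

body = json.dumps(payload)  # the JSON string sent over the wire
```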

Response

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1709000000,
  "model": "gpt-5-nano",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Hello! How can I help you today?"
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 12,
    "total_tokens": 22
  },
  "spendex": {
    "routed_model": "gpt-5-nano",
    "tier": "simple",
    "cost_usd": 0.00003,
    "saved_usd": 0.00042
  }
}

Responses may include Spendex routing metadata such as the selected model and estimated savings.
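As a sketch, the metadata block shown above can be used to compute a per-request savings rate when working with the raw JSON (field names as in the example response; an SDK may expose them differently):

```python
import json

# Raw JSON fragment matching the example response above.
raw = '{"spendex": {"routed_model": "gpt-5-nano", "tier": "simple", "cost_usd": 0.00003, "saved_usd": 0.00042}}'

meta = json.loads(raw)["spendex"]

# Share of the would-be cost that was saved by routing.
savings_pct = 100 * meta["saved_usd"] / (meta["cost_usd"] + meta["saved_usd"])
```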

Model Routing

Smart routing (model: "auto")

When you set model: "auto", our AI classifier analyzes each request and routes it to the lowest-cost model that still meets the task's quality, capability, and reliability requirements.

Specific model

Pass any supported model name to bypass smart routing:

model: "gpt-5"              // Routes directly to OpenAI GPT-5
model: "claude-sonnet-4-6"  // Routes directly to Anthropic Claude Sonnet 4.6
model: "gemini-2.5-pro"     // Routes directly to Google Gemini 2.5 Pro
model: "auto"               // Smart routing (recommended)

Cost tiers

| Tier    | When it's used                         | Example models |
|---------|----------------------------------------|----------------|
| Simple  | Short, factual, single-turn queries    | GPT-5-nano, Haiku 4.5, Flash Lite |
| Medium  | Multi-turn, moderate reasoning, code   | GPT-5-mini, Sonnet 4.6, Flash |
| Complex | Long context, deep reasoning, critical | GPT-5, Opus 4.6, Gemini 2.5 Pro |

Streaming

Set stream: true to receive Server-Sent Events (SSE) as the response is generated.

Python

stream = client.chat.completions.create(
    model="auto",
    messages=[{"role": "user", "content": "Explain quantum computing"}],
    stream=True
)

for chunk in stream:
    content = chunk.choices[0].delta.content
    if content:
        print(content, end="", flush=True)

Node.js

const stream = await client.chat.completions.create({
  model: "auto",
  messages: [{ role: "user", content: "Explain quantum computing" }],
  stream: true
});

for await (const chunk of stream) {
  const content = chunk.choices[0]?.delta?.content;
  if (content) process.stdout.write(content);
}

cURL

curl https://api.spendexai.com/v1/chat/completions \
  -H "Authorization: Bearer spx_sk_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [{"role": "user", "content": "Explain quantum computing"}],
    "stream": true
  }'
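If you consume the raw SSE stream (for example, from the cURL call above) without an SDK, the chunks can be reassembled roughly like this. This is a minimal sketch assuming OpenAI-style "data: {...}" lines terminated by "data: [DONE]":

```python
import json

def join_sse_content(lines):
    """Reassemble assistant text from OpenAI-style SSE 'data:' lines."""
    parts = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alives and comment lines
        data = line[len("data: "):]
        if data.strip() == "[DONE]":
            break
        delta = json.loads(data)["choices"][0]["delta"]
        if delta.get("content"):
            parts.append(delta["content"])
    return "".join(parts)
```

A production client should also handle multi-line events and partial reads; the OpenAI SDKs above do this for you.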

Supported Models

We currently optimize across the providers connected in your workspace. Core routing today is centered on OpenAI, Anthropic, Google, and Mistral, with broader provider support expanding across the platform.

| Model                 | Provider  | Price (input / 1M) |
|-----------------------|-----------|--------------------|
| GPT-5-nano            | OpenAI    | $0.05 |
| Claude Haiku 4.5      | Anthropic | $1.00 |
| Gemini 2.5 Flash Lite | Google    | $0.10 |
| Mistral Small         | Mistral   | $0.10 |
| GPT-5-mini            | OpenAI    | $0.25 |
| Claude Sonnet 4.6     | Anthropic | $3.00 |
| Gemini 2.5 Flash      | Google    | $0.30 |
| Mistral Large         | Mistral   | $0.50 |
| GPT-5                 | OpenAI    | $1.25 |
| Claude Opus 4.6       | Anthropic | $5.00 |
| Gemini 2.5 Pro        | Google    | $1.25 |
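As a quick sanity check on the numbers above, input-side cost follows directly from the per-1M rates. This sketch covers input tokens only, since output pricing is not shown in this table:

```python
# Input prices in USD per 1M tokens, taken from the table above.
PRICE_PER_1M_INPUT = {
    "gpt-5-nano": 0.05,
    "gpt-5": 1.25,
    "claude-sonnet-4-6": 3.00,
}

def input_cost_usd(model: str, prompt_tokens: int) -> float:
    """Input-side cost only; output tokens are priced separately."""
    return PRICE_PER_1M_INPUT[model] * prompt_tokens / 1_000_000

# Routing a 10k-token prompt to GPT-5-nano instead of Claude Sonnet 4.6:
saved = input_cost_usd("claude-sonnet-4-6", 10_000) - input_cost_usd("gpt-5-nano", 10_000)
```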

BYOK live now. Credits coming soon.

Available Now / Coming Next

BYOK routing through the OpenAI-compatible endpoint is available now. Budget controls, alerts, and agent-level spend policies are the next layer on top of the BYOK router and are being rolled out progressively.

Error Handling

SpendexAI returns standard HTTP status codes and OpenAI-compatible error objects.

| Code | Meaning |
|------|---------|
| 400  | Bad request — invalid parameters |
| 401  | Unauthorized — invalid or missing API key |
| 402  | Payment required — managed credits unavailable or billing limit reached |
| 429  | Rate limit exceeded — too many requests |
| 500  | Internal error — retry with exponential backoff |
| 503  | Provider unavailable — automatic failover in progress |
Example error response:

{
  "error": {
    "message": "Payment required for this request.",
    "type": "billing_error",
    "code": "payment_required"
  }
}
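The retry guidance above for 429 and 500 responses can be sketched as a small wrapper. Here request_fn is a placeholder for your actual HTTP call and is assumed to return a (status, body) pair:

```python
import random
import time

def with_retries(request_fn, max_attempts=5, base_delay=0.5):
    """Retry on 429 and 5xx with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        status, body = request_fn()
        if status != 429 and status < 500:
            return status, body  # success or a non-retryable client error
        if attempt < max_attempts - 1:
            # 0.5s, 1s, 2s, ... plus a little jitter to avoid thundering herds
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
    return status, body
```

In practice most HTTP clients and the OpenAI SDKs offer built-in retry options; this just makes the backoff pattern explicit.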

Automatic failover

If a provider returns a 5xx error, SpendexAI automatically retries with an equivalent model from another provider. This happens transparently — you get a successful response with the spendex.routed_model field showing which model was used.

Migration Guide

Switching to SpendexAI takes under a minute. You only need to change two things:

  1. API key: Replace your provider key with your SpendexAI key (spx_sk_live_...)
  2. Base URL: Point to https://api.spendexai.com/v1

In most OpenAI-compatible integrations, your existing code, prompts, streaming, and tool definitions continue to work with minimal changes. To revert, undo those two changes and you're back to your original provider in seconds.

Before

client = OpenAI(api_key="sk-...")

After

client = OpenAI(
    api_key="spx_sk_live_...",
    base_url="https://api.spendexai.com/v1"
)

What stays the same

Your OpenAI-compatible SDK, request and response formats, prompts, streaming, and tool/function definitions all continue to work through the same interface.

FAQ

How is SpendexAI different from OpenRouter?

OpenRouter is a strong option if you want a hosted model access layer across many providers. SpendexAI is different: it is built BYOK-first. You connect your own OpenAI, Anthropic, Google, Mistral, and other provider accounts, keep direct billing with those providers, and use SpendexAI as the routing, retry, and failover layer in front.

How is SpendexAI different from open-source routers like LiteLLM or Portkey Gateway?

Tools like LiteLLM and Portkey can be a good fit if you want to assemble, host, and operate your own routing stack. SpendexAI is for teams that want the outcome without the operational overhead: one endpoint, automatic routing, fallback handling, and a simpler path to production.

Why use SpendexAI instead of building routing in-house?

Because most in-house routers start simple, then grow into provider-specific logic, retries, failover rules, and model mapping spread across the codebase. SpendexAI keeps that logic in one place behind one OpenAI-compatible endpoint.

Do I lose control over model choice?

No. If you need a specific provider or model, you can pin it directly. Smart routing only applies when you choose automatic routing.

What changes in my code?

Usually just two things: the base URL and the API key. Your existing OpenAI-compatible SDK flow stays nearly the same.

Why BYOK first?

Because many teams want better routing without giving up direct provider relationships, billing visibility, and account-level control. BYOK lets you keep ownership of the underlying provider accounts while SpendexAI handles the routing layer.

Who pays the providers in BYOK mode?

You do. In BYOK mode, usage is billed directly to your OpenAI, Anthropic, Google, Mistral, and other connected provider accounts. SpendexAI sits in front as the intelligence and reliability layer.

Is SpendexAI only about cost savings?

No. Cost savings matter, but the bigger value is operational: better model selection, cleaner multi-provider architecture, retries, failover, and one consistent interface for your app.

When should I use SpendexAI instead of OpenRouter?

Use SpendexAI when you want BYOK, direct provider billing, and routing on top of your own accounts. If you want a hosted aggregation layer where the platform sits between you and the providers commercially, OpenRouter may be a better fit.

When should I use an open-source router instead of SpendexAI?

Use an open-source router if your team wants full infrastructure ownership and is comfortable maintaining the routing stack itself. Use SpendexAI if you want the same class of routing outcome with less engineering overhead.

Can I still force one provider only?

Yes. If you only want OpenAI, or only Anthropic, you can keep routing constrained to the providers you connect and choose.

What happens if a provider goes down?

SpendexAI can retry or reroute to another suitable connected provider or model. Your application still talks to one endpoint.

Can I switch back instantly?

Yes. Remove the SpendexAI base URL and point your SDK back to your original provider. There is no lock-in.

Do you store prompts?

Prompt content is not used for product analytics. Routing decisions rely on request signals and the metadata needed to operate the router.

Need Help?

Email us at contact@spendexai.com or book a call.