Routing failover

Configure ordered alternative routes for chat completions and understand how route switching is surfaced in responses.

Routing failover happens when the Tresor router moves a single request from one route to another before it returns a response. A retry happens when your client or SDK sends a new request after an error response or a connection failure.

Two different mechanisms

| Mechanism | Controlled by | When it happens | How you observe it |
| --- | --- | --- | --- |
| Routing failover | Tresor router | Inside one request, before routing gives up | tresor.requested_route, tresor.routed_model, tresor.failover |
| Retry | Client / SDK | After a request fails or a connection drops | HTTP status, retry logs, client metrics |

If you need backoff, retry budgets, or timeout guidance, see Retries and transient errors.

How routing failover works

At a high level, the router follows a simple sequence:

  1. It determines the preferred route from your request.
  2. It builds the set of routes it is allowed to use. For chat completions, that means the primary model plus any explicit failover entries you provided. For transcriptions, that can mean the automatically resolved routes that match your requested model and constraints.
  3. It tries the preferred route first.
  4. If that route cannot be used or fails before the request completes, the router moves to the next eligible route.
  5. If one route succeeds, the response reports the requested route in tresor.requested_route and the route that actually served the request in tresor.routed_model. tresor.failover is true if the router had to leave the preferred route.
  6. If no eligible route succeeds, the request fails.

That is the stable contract clients should rely on. The exact health checks and internal routing heuristics can evolve over time.
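
The sequence above can be sketched as a small simulation. This is an illustration of the documented contract for an explicit primary route with ordered alternatives, not the router's actual implementation. The names route_request, RoutingResult, NoRouteAvailable, and the try_route callback are hypothetical, and automatic resolution is not modeled.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class RoutingResult:
    requested_route: str  # mirrors tresor.requested_route
    routed_model: str     # mirrors tresor.routed_model
    failover: bool        # mirrors tresor.failover


class NoRouteAvailable(Exception):
    """Raised when every eligible route was tried without success."""


def route_request(
    primary: str,
    alternatives: Sequence[str],
    try_route: Callable[[str], bool],
) -> RoutingResult:
    """Simulate the ordered-route contract (hypothetical helper).

    try_route stands in for "send the request over this route" and
    returns True when the route completes the request.
    """
    # Steps 1-2: the preferred route followed by the allowed alternatives.
    candidates = [primary, *alternatives]
    # Steps 3-4: try each candidate in order, moving on after a failure.
    for index, route in enumerate(candidates):
        if try_route(route):
            # Step 5: failover is True only when the serving route
            # is not the preferred one.
            return RoutingResult(primary, route, failover=index > 0)
    # Step 6: no eligible route succeeded.
    raise NoRouteAvailable(f"no eligible route succeeded for {primary}")
```

For example, if only global/chutes/kimi-k2.6 is healthy, calling route_request("eu/privatemode/kimi-k2.6", ["global/chutes/kimi-k2.6"], lambda r: r == "global/chutes/kimi-k2.6") returns a result whose routed_model is the alternative and whose failover flag is True.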

Chat completions: explicit routing failover

POST /v1/chat/completions accepts a Tresor-only failover field. It is an ordered list of alternative compound model IDs that the router may try if the primary route is unavailable.

curl https://api.tresor.co/v1/chat/completions \
  -H "Authorization: Bearer $TRESOR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "eu/privatemode/kimi-k2.6",
    "messages": [{"role": "user", "content": "Summarize this contract."}],
    "failover": [
      "global/chutes/kimi-k2.6",
      "global/tinfoil/deepseek-v4-pro"
    ]
  }'

In OpenAI SDKs, pass Tresor-only request fields through extra_body:

from openai import OpenAI

client = OpenAI(
    api_key="tr-...",
    base_url="https://api.tresor.co/v1",
)

resp = client.chat.completions.create(
    model="lux/tresor/kimi-k2.5",
    messages=[{"role": "user", "content": "Summarize this contract."}],
    extra_body={
        "failover": [
            "global/chutes/kimi-k2.6",
            "global/tinfoil/deepseek-v4-pro",
        ]
    },
)

Chat failover rules

  • The router tries the primary model first, then each failover entry in order.
  • Each failover entry must be a valid compound model ID, just like the primary route.
  • Maximum 5 failover entries.
  • The selected route can have different pricing from the primary route. By listing a route, you accept that route's pricing if the router uses it.
  • If all listed routes are unavailable, the request fails with 503.
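
A client can check a failover list against these rules before sending the request. The sketch below is a hypothetical pre-flight helper, not part of any Tresor SDK. It assumes that a compound model ID consists of three non-empty, slash-separated segments, which is inferred from the examples on this page and not from a formal grammar. The server remains the authority on what is valid.

```python
MAX_FAILOVER_ENTRIES = 5  # limit stated in the chat failover rules


def is_compound_model_id(model_id: str) -> bool:
    # Assumption: three non-empty segments separated by "/",
    # as in "global/chutes/kimi-k2.6".
    parts = model_id.split("/")
    return len(parts) == 3 and all(part.strip() for part in parts)


def validate_failover(failover: list[str]) -> list[str]:
    """Return a list of problems; an empty list means no problem was found."""
    problems = []
    if len(failover) > MAX_FAILOVER_ENTRIES:
        problems.append(
            f"{len(failover)} failover entries given, "
            f"at most {MAX_FAILOVER_ENTRIES} allowed"
        )
    for entry in failover:
        if not is_compound_model_id(entry):
            problems.append(f"{entry!r} does not look like a compound model ID")
    return problems
```

Running the check locally only catches obvious mistakes such as a bare model key in the list. It does not tell you whether a route exists or is currently available.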

Use GET /v1/models?detail=true to inspect available providers, regions, and pricing before building a failover list.

What the response tells you

Tresor reports the route that actually served the request:

  • tresor.requested_route: the normalized primary route you asked for
  • tresor.routed_model: the final compound model ID used
  • tresor.failover: true when the router had to leave the preferred route and use a secondary route instead

When you use automatic routing such as a bare model key or auto/auto/..., tresor.requested_route and tresor.routed_model can differ even when tresor.failover is false. That is automatic resolution, not failover.

That metadata appears on the non-streaming response object, or on the final SSE chunk for streaming responses.
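
The three fields combine into three distinguishable outcomes, which can be useful for logging or metrics. The sketch below is a hypothetical helper that works on the parsed JSON response body. It assumes the metadata sits under a top-level "tresor" key with the field names listed above. Adjust the lookup if your client exposes the metadata differently.

```python
def classify_routing(response: dict) -> str:
    """Classify how a request was routed (hypothetical helper).

    Returns "failover", "auto-resolved", "direct", or "unknown".
    """
    # Assumption: routing metadata lives under a top-level "tresor" key.
    meta = response.get("tresor")
    if not isinstance(meta, dict):
        return "unknown"

    requested = meta.get("requested_route")
    routed = meta.get("routed_model")
    if requested is None or routed is None:
        return "unknown"

    if meta.get("failover"):
        # The router had to leave the preferred route.
        return "failover"
    if requested != routed:
        # Routes differ without failover: automatic resolution.
        return "auto-resolved"
    return "direct"
```

For streaming responses, apply the same check to the final SSE chunk, since that is where the metadata appears.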

Audio transcriptions: router-managed route switching

POST /v1/audio/transcriptions does not accept a client-provided failover array.

Instead, transcription routing depends on how specific your model is:

  • If you send an explicit compound route such as eu/privatemode/whisper-large-v3, the router either uses that route or fails.
  • If you send a bare model key or an automatic route such as auto/auto/whisper-large-v3, the router may choose another eligible route before giving up.

In the automatic case, the response exposes both tresor.requested_route and tresor.routed_model. tresor.failover is true only if the router had to switch away from the initially preferred route after resolution began.

If you need predictable pricing, provider choice, or residency, pin an explicit compound route instead of relying on automatic resolution.
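
To enforce pinning in client code, you can check whether a model string leaves the router room to choose. The helper below is a hypothetical sketch. It assumes a pinned route has exactly three slash-separated segments and that the literal segment "auto" marks automatic resolution. This page only shows the auto/auto/... form, so treating a single auto segment as automatic is an assumption.

```python
def router_may_choose_route(model: str) -> bool:
    """Return True if the router may pick the route itself (hypothetical helper)."""
    parts = model.split("/")
    if len(parts) != 3:
        # Assumption: anything that is not a three-part compound route
        # is a bare model key such as "whisper-large-v3".
        return True
    region, provider, _model_key = parts
    # Assumption: an "auto" region or provider means automatic resolution.
    return "auto" in (region, provider)


def require_pinned_route(model: str) -> str:
    """Raise ValueError unless the model names an explicit compound route."""
    if router_may_choose_route(model):
        raise ValueError(f"{model!r} is not pinned to an explicit route")
    return model
```

Calling require_pinned_route("eu/privatemode/whisper-large-v3") returns the string unchanged, while a bare key or an auto/auto/... route raises ValueError before any request is sent.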

Choosing routes deliberately

  • Prefer the same model_key across your primary and alternative chat routes unless you explicitly want different behavior.
  • Only include alternative routes whose provider, region, and price you are willing to accept.
  • Reserve retries for transient transport or HTTP failures. Routing failover does not replace client-side retry logic.
