POST /v1/audio/transcriptions

Upload an audio file and receive a normalized transcript. The endpoint follows the OpenAI audio transcription shape closely, with Tresor receipt metadata added on JSON responses.

| Header | Notes |
|---|---|
| Authorization | Bearer tr-… API key. Required. |
| X-Tresor-Receipt | Set to false to opt out of signed receipts. Default true. |

Send multipart/form-data.

| Parameter | Type | Required | Notes |
|---|---|---|---|
| file | file | Yes | Audio upload. |
| model | string | Yes | Bare model key (for example whisper-large-v3, whisper-large-v3-turbo, or voxtral-small-24b) or a compound route such as eu/privatemode/whisper-large-v3. |
| region | string | No | Optional routing hint when model is a bare key. |
| provider | string | No | Optional routing hint when model is a bare key. |
| response_format | string | No | json (default), text, or verbose_json. |
| language | string | No | ISO-639-1 hint such as en. Required for verbose_json. |
| prompt | string | No | Optional upstream prompt hint. |
| temperature | number | No | Optional provider hint. |

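The parameter rules above can be checked client-side before uploading. The following is a minimal sketch, not part of the Tresor API: check_form_fields is a hypothetical helper, and the strings it returns simply mirror the documented missing_field and invalid_response_format error codes.

```python
from typing import Dict, List

# Values documented for response_format.
ALLOWED_FORMATS = {"json", "text", "verbose_json"}


def check_form_fields(fields: Dict[str, str], has_file: bool) -> List[str]:
    """Return the documented error codes this request would likely trigger.

    Hypothetical client-side helper; the router remains the source of truth.
    """
    problems: List[str] = []
    if not has_file:
        problems.append("missing_field: file")
    if not fields.get("model"):
        problems.append("missing_field: model")
    response_format = fields.get("response_format", "json")  # json is the default
    if response_format not in ALLOWED_FORMATS:
        problems.append("invalid_response_format")
    elif response_format == "verbose_json" and not fields.get("language"):
        # language is optional in general but required for verbose_json
        problems.append("missing_field: language")
    return problems
```

An empty list means the form fields look consistent with the table; it does not guarantee that the selected route accepts the file.
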
Audio transcriptions do not accept a client-provided failover array.
- Use eu/privatemode/whisper-large-v3 when you need a fixed provider and region.
- Use auto/auto/... when you want the router to choose an eligible route automatically.
- tresor.requested_route reports the normalized route you asked for.
- tresor.routed_model reports the route that actually served the request.
- tresor.failover is true when the router had to switch away from the initially preferred route during automatic routing.
- tresor.requested_route and tresor.routed_model can differ even when tresor.failover is false.

For the distinction between automatic route switching and client retries, see Routing failover and Retries and transient errors.

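The relationship between the three routing fields can be illustrated with a small sketch. Both helpers are hypothetical, and split_route assumes that a compound route always has exactly three slash-separated segments (region/provider/model), which is inferred from the examples on this page rather than stated by it.

```python
from typing import Dict, Optional, Tuple


def split_route(model: str) -> Optional[Tuple[str, str, str]]:
    """Return (region, provider, model_key) for a compound route, else None.

    Assumption: compound routes have exactly three non-empty segments.
    """
    parts = model.split("/")
    if len(parts) == 3 and all(parts):
        return parts[0], parts[1], parts[2]
    return None  # treated as a bare model key


def describe_routing(tresor: Dict[str, object]) -> str:
    """Summarize the tresor routing metadata from a JSON response."""
    requested = tresor.get("requested_route")
    routed = tresor.get("routed_model")
    if tresor.get("failover"):
        return f"failover: asked for {requested}, served by {routed}"
    if requested != routed:
        # Differing routes without failover, e.g. auto/auto/... being resolved.
        return f"resolved: {requested} -> {routed} (no failover)"
    return f"exact: served by requested route {routed}"
```

The middle branch covers the case called out in the last bullet: an automatic route resolves to a concrete one without any failover taking place.
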
- audio/flac
- audio/m4a
- audio/mp4
- audio/mpeg
- audio/ogg
- audio/wav
- audio/webm
- audio/x-m4a

These MIME types are accepted by the Tresor router. Actual compatibility is still provider- and model-specific, so an upstream route can reject a file that passed router validation.

For the broadest compatibility across transcription routes, prefer audio/wav or audio/mpeg. In current validation, Tinfoil whisper-large-v3-turbo accepted WAV input while rejecting an M4A upload upstream.
Known route-specific notes:
- global/tinfoil/whisper-large-v3-turbo: prefer WAV or MP3; current validation accepted WAV while rejecting M4A upstream.
- global/tinfoil/voxtral-small-24b: provider docs advertise MP3 or WAV and up to 30 minutes for transcription, but provider-side size checks can still reject compressed uploads well below the router cap.
- eu/privatemode/whisper-large-v3: current provider docs advertise up to 50 MB per request.

Default upload limit: 25 MiB per request.

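A client can run a rough preflight against the router-level rules before uploading. This sketch uses only the MIME list and the default 25 MiB limit from this page. The preflight function is hypothetical, it assumes a file of exactly 25 MiB is still allowed, and it cannot predict the stricter provider-side checks described above.

```python
from typing import List

# MIME types the router is documented to accept.
ACCEPTED_MIME_TYPES = frozenset({
    "audio/flac", "audio/m4a", "audio/mp4", "audio/mpeg",
    "audio/ogg", "audio/wav", "audio/webm", "audio/x-m4a",
})

# Default router upload limit: 25 MiB.
DEFAULT_UPLOAD_LIMIT_BYTES = 25 * 1024 * 1024


def preflight(mime_type: str, size_bytes: int) -> List[str]:
    """Return router-level problems for an upload; an empty list means none found."""
    problems: List[str] = []
    if mime_type.lower() not in ACCEPTED_MIME_TYPES:
        problems.append(f"unsupported MIME type: {mime_type}")
    if size_bytes > DEFAULT_UPLOAD_LIMIT_BYTES:  # assumption: the limit is inclusive
        problems.append(
            f"file is {size_bytes} bytes, above the default limit of "
            f"{DEFAULT_UPLOAD_LIMIT_BYTES} bytes"
        )
    return problems
```

Passing this check only means the router is unlikely to reject the upload; a route can still return unsupported_file_type, invalid_file, or file_too_large.
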
```bash
curl https://api.tresor.co/v1/audio/transcriptions \
  -H "Authorization: Bearer $TRESOR_API_KEY" \
  -F "model=auto/auto/whisper-large-v3" \
  -F "file=@./meeting.mp3;type=audio/mpeg" \
  -F "response_format=json"
```

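For callers who prefer Python, the same multipart request can be assembled with the standard library alone. This is a sketch under stated assumptions rather than an official client: encode_multipart and build_request are hypothetical helpers, and the code only builds the request object without sending it.

```python
import urllib.request
import uuid
from typing import Dict, Tuple

API_URL = "https://api.tresor.co/v1/audio/transcriptions"


def encode_multipart(fields: Dict[str, str], filename: str,
                     file_bytes: bytes, mime_type: str) -> Tuple[str, bytes]:
    """Encode text fields plus one file part as multipart/form-data."""
    boundary = uuid.uuid4().hex
    chunks = []
    for name, value in fields.items():
        chunks.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"'
            f"\r\n\r\n{value}\r\n".encode("utf-8")
        )
    # The upload goes in the form field named "file", as in the curl example.
    chunks.append(
        f'--{boundary}\r\nContent-Disposition: form-data; name="file"; '
        f'filename="{filename}"\r\nContent-Type: {mime_type}\r\n\r\n'.encode("utf-8")
    )
    chunks.append(file_bytes)
    chunks.append(f"\r\n--{boundary}--\r\n".encode("utf-8"))
    return f"multipart/form-data; boundary={boundary}", b"".join(chunks)


def build_request(api_key: str, fields: Dict[str, str], filename: str,
                  file_bytes: bytes, mime_type: str,
                  receipts: bool = True) -> urllib.request.Request:
    """Build (but do not send) a transcription request."""
    content_type, body = encode_multipart(fields, filename, file_bytes, mime_type)
    headers = {"Authorization": f"Bearer {api_key}", "Content-Type": content_type}
    if not receipts:
        headers["X-Tresor-Receipt"] = "false"  # opt out of signed receipts
    return urllib.request.Request(API_URL, data=body, headers=headers, method="POST")
```

Sending is then a matter of passing the returned object to urllib.request.urlopen and decoding the body according to the chosen response_format.
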
```json
{
  "text": "Hello and welcome to the meeting.",
  "language": "en",
  "duration": 12.5,
  "tresor": {
    "receipt_id": "7c9e6679-7425-40de-944b-e07fc1f90ae7",
    "requested_route": "auto/auto/whisper-large-v3",
    "routed_model": "eu/privatemode/whisper-large-v3",
    "failover": false,
    "usage": {
      "billing_unit": "audio_minute",
      "audio_seconds": 12.5
    }
  }
}
```

| Field | Type | Description |
|---|---|---|
| text | string | Normalized transcript text. |
| language | string | Language hint, when available. |
| duration | number | Duration in seconds, when available. |
| segments | array | Present on verbose_json responses. |
| tresor.receipt_id | string | Receipt identifier, omitted when receipts are disabled. |
| tresor.requested_route | string | Normalized route the caller asked the router to use. |
| tresor.routed_model | string | The actual route used by the router. |
| tresor.failover | boolean | Whether routing failover switched the request away from the initially preferred route. |
| tresor.usage.billing_unit | string | Billing unit for this route, such as audio_minute, audio_second, request, or token. |
| tresor.usage.audio_seconds | number | Duration metadata when available. |
| tresor.usage.prompt_tokens | integer | Present when the route settles transcription on token usage. |
| tresor.usage.completion_tokens | integer | Present when the route settles transcription on token usage. |

Minute-priced transcription routes return billing_unit = audio_minute while still reporting clip duration in audio_seconds. Request-priced transcription routes return billing_unit = request. Token-priced transcription routes return billing_unit = token and may include prompt and completion token counts in tresor.usage.
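Because the fields in tresor.usage vary with the billing unit, client code should treat most of them as optional. The sketch below shows one way to render a usage object for logging. summarize_usage is a hypothetical helper, and the minutes figure is a plain seconds-to-minutes conversion for display only; this page does not specify how billed minutes are rounded.

```python
from typing import Any, Dict


def summarize_usage(usage: Dict[str, Any]) -> str:
    """Render tresor.usage as a one-line, human-readable summary."""
    unit = usage.get("billing_unit", "unknown")
    parts = [f"billed per {unit}"]

    seconds = usage.get("audio_seconds")
    if seconds is not None:
        # Display conversion only; not a statement about billing rounding.
        parts.append(f"{seconds} s of audio ({seconds / 60:.2f} min)")

    prompt = usage.get("prompt_tokens")
    completion = usage.get("completion_tokens")
    if prompt is not None and completion is not None:
        # Token counts may be present on token-priced routes.
        parts.append(f"{prompt} prompt + {completion} completion tokens")

    return "; ".join(parts)
```
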
text

When response_format=text, the response body is plain text. Tresor metadata moves to response headers:

- X-Tresor-Routed-Model
- X-Tresor-Receipt-Id when receipts are enabled

If you need requested_route for a text response, keep the original request or fetch the signed receipt; there is no X-Tresor-Requested-Route header.

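For text responses, the metadata has to be read from headers instead of the JSON body. A minimal sketch, assuming the headers are available as a plain mapping (tresor_metadata_from_headers is a hypothetical helper); it lowercases names first because HTTP header names are case-insensitive.

```python
from typing import Dict, Mapping, Optional


def tresor_metadata_from_headers(headers: Mapping[str, str]) -> Dict[str, Optional[str]]:
    """Extract Tresor metadata from the headers of a response_format=text reply."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        "routed_model": lowered.get("x-tresor-routed-model"),
        # Absent when receipts are disabled via X-Tresor-Receipt: false.
        "receipt_id": lowered.get("x-tresor-receipt-id"),
    }
```

The requested route is deliberately not part of the result, since no header carries it.
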
verbose_json

Adds normalized segments[] timing information. language is currently required.

| Status | code | Meaning |
|---|---|---|
| 400 | invalid_content_type | Request was not multipart/form-data. |
| 400 | missing_field | Required field such as file, model, or language was missing. |
| 400 | invalid_model | Model is unknown or not a transcription model. |
| 400 | unsupported_file_type | Router validation or the selected route rejected the audio format. |
| 400 | invalid_file | The selected route rejected the uploaded file after router validation. |
| 400 | file_too_large | The selected route enforced a stricter size limit than the router. |
| 400 | invalid_response_format | Unsupported response_format. |
| 401 | unauthorized | Missing or invalid API key. |
| 402 | insufficient_balance | Balance too low for reservation. |
| 502 | upstream_error | Upstream provider failed for a non-file-specific reason. |
| 503 | route_not_priced | The route exists but pricing is not available. |

Error bodies follow the standard error envelope.
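Client code often wants to group these codes by who can fix the problem. The sketch below is one possible grouping, not an official taxonomy: the category names are invented for illustration, and the code value is assumed to have been extracted from the error envelope already, since the envelope shape is documented elsewhere.

```python
from typing import Dict

# Hypothetical grouping of the documented error codes.
ERROR_CATEGORIES: Dict[str, str] = {
    # The request itself is malformed; fix the form fields and resend.
    "invalid_content_type": "request",
    "missing_field": "request",
    "invalid_model": "request",
    "invalid_response_format": "request",
    # The audio file was rejected; re-encoding as WAV or MP3 may help.
    "unsupported_file_type": "file",
    "invalid_file": "file",
    "file_too_large": "file",
    # Account-level problems.
    "unauthorized": "auth",
    "insufficient_balance": "billing",
    # Problems on the route or provider side.
    "upstream_error": "upstream",
    "route_not_priced": "route",
}


def categorize_error(code: str) -> str:
    """Map a documented error code to a coarse category, or "unknown"."""
    return ERROR_CATEGORIES.get(code, "unknown")
```

Falling back to "unknown" keeps the client tolerant of codes added after this table was written.
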