Create batch job (OpenAI format)
POST /openai/v1/batches
Creates a batch processing job.
Note: This endpoint also works without the /v1 prefix (e.g., /openai/batches).
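As a rough illustration of the two path variants mentioned in the note, the sketch below builds (but does not send) the POST request using Python's standard library. The host and port come from the curl example on this page; the helper name `build_batch_request` and the `use_v1_prefix` flag are hypothetical, and the body contains only the `model` field shown in that example, since the full request schema is not reproduced here.

```python
import json
import urllib.request

# Host and port as used in the curl example on this page; adjust for your deployment.
BASE_URL = "http://localhost:8080"


def build_batch_request(body: dict, use_v1_prefix: bool = True) -> urllib.request.Request:
    """Build a POST request for the batch endpoint (hypothetical helper).

    Both path variants are taken from this page: /openai/v1/batches and
    /openai/batches.
    """
    path = "/openai/v1/batches" if use_v1_prefix else "/openai/batches"
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder body mirroring the curl example; real requests need real values.
request = build_batch_request({"model": "string"})
print(request.get_method(), request.full_url)
```

To send the request, pass it to `urllib.request.urlopen(request)` and decode the JSON response; that step is left out so the sketch runs without a server.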
Request Body
application/json
Response Body
application/json
application/json
application/json
curl -X POST "http://localhost:8080/openai/v1/batches" \
  -H "Content-Type: application/json" \
  -d '{ "model": "string" }'
{
"id": "string",
"object": "string",
"endpoint": "string",
"input_file_id": "string",
"completion_window": "string",
"status": "validating",
"request_counts": {
"total": 0,
"completed": 0,
"failed": 0,
"succeeded": 0,
"expired": 0,
"canceled": 0,
"pending": 0
},
"metadata": {
"property1": "string",
"property2": "string"
},
"created_at": 0,
"expires_at": 0,
"output_file_id": "string",
"error_file_id": "string",
"processing_status": "string",
"results_url": "string",
"operation_name": "string",
"extra_fields": {
"request_type": "string",
"provider": "openai",
"model_requested": "string",
"model_deployment": "string",
"latency": 0,
"chunk_index": 0,
"raw_request": {},
"raw_response": {},
"cache_debug": {
"cache_hit": true,
"cache_id": "string",
"hit_type": "string",
"requested_provider": "string",
"requested_model": "string",
"provider_used": "string",
"model_used": "string",
"input_tokens": 0,
"threshold": 0,
"similarity": 0
}
}
}
{
"event_id": "string",
"type": "string",
"is_bifrost_error": true,
"status_code": 0,
"error": {
"type": "string",
"code": "string",
"message": "string",
"param": "string",
"event_id": "string"
},
"extra_fields": {
"provider": "openai",
"model_requested": "string",
"request_type": "string"
}
}
{
"event_id": "string",
"type": "string",
"is_bifrost_error": true,
"status_code": 0,
"error": {
"type": "string",
"code": "string",
"message": "string",
"param": "string",
"event_id": "string"
},
"extra_fields": {
"provider": "openai",
"model_requested": "string",
"request_type": "string"
}
}
Count input tokens POST
Counts the number of tokens in a Responses API request.
Create chat completion (OpenAI format) POST
Creates a chat completion using OpenAI-compatible format. Supports streaming via SSE.
Async inference: Send x-bf-async: true to submit the request as a background job and receive a job ID immediately. Poll with x-bf-async-id: <job-id> to retrieve the result. While the job is still processing, the response has an empty choices array. Once it completes, choices contains the full result. See Async Inference for details.
Note: This endpoint also works without the /v1 prefix (e.g., /openai/chat/completions).
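The async flow described above might be sketched as follows. The header names `x-bf-async` and `x-bf-async-id` and the empty-`choices` convention come from the description; the function names are hypothetical, and the job ID is passed in as a plain string because the field that carries it in the submit response is not specified here.

```python
def async_submit_headers() -> dict:
    """Headers for submitting a request as a background job."""
    return {"Content-Type": "application/json", "x-bf-async": "true"}


def async_poll_headers(job_id: str) -> dict:
    """Headers for polling a previously submitted job by its ID."""
    return {"Content-Type": "application/json", "x-bf-async-id": job_id}


def is_finished(response: dict) -> bool:
    """Return True once the polled response carries a non-empty choices array.

    Per the description, an empty choices array means the job is still processing.
    """
    return bool(response.get("choices"))


# Example with hand-written payloads (not real server output).
print(is_finished({"choices": []}))
print(is_finished({"choices": [{"index": 0}]}))
```

A caller would typically poll in a loop with a delay between attempts and stop when `is_finished` returns True; the polling interval and any timeout are deployment choices not covered by this page.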