Workflow: Run Model Inference

Last validated: 2026-05-20

Goal

Use the Ignite model commands in the dodil CLI to list, inspect, and run inference against models, and fall back to direct API calls when chat, rerank, or transcribe requests need full control over the request body.

Inputs

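Token (sent as the Bearer token in the Authorization header)
API endpoint (the examples use https://api.dev.dodil.io)
Model name (for example kimi-k2-5)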
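
The three inputs can be captured once as environment variables so the later commands stay easy to copy and paste. This is a minimal sketch: TOKEN matches the variable used in the curl examples in this workflow, while API_BASE, MODEL, and the check_inputs helper are hypothetical names introduced here for illustration.

```shell
# API_BASE and MODEL are hypothetical variable names; the defaults are the
# endpoint and model used in the examples in this workflow.
export API_BASE="${API_BASE:-https://api.dev.dodil.io}"
export MODEL="${MODEL:-kimi-k2-5}"

# check_inputs (hypothetical helper): fail early if any input is empty.
check_inputs() {
  for name in TOKEN API_BASE MODEL; do
    # Read the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing input: $name" >&2
      return 1
    fi
  done
}
```

Set TOKEN in the shell, then run check_inputs before the CLI or curl commands so that a missing value fails loudly instead of producing an authentication error later.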

CLI Path (Current)


# List and inspect models
dodil ignite models list
dodil ignite models get kimi-k2-5
 
# Embeddings
dodil ignite models embed kimi-k2-5 --input "hello world"
 
# Generic infer
dodil ignite models infer kimi-k2-5 --prompt '{"input":"hello"}'

API Fallback for Full Request Control

Chat completion


curl -sS https://api.dev.dodil.io/v1/chat/completions \
  -H "Authorization: Bearer ${TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "model":"kimi-k2-5",
    "messages":[{"role":"user","content":"Summarize this text"}]
  }'

Rerank


curl -sS https://api.dev.dodil.io/v1/rerank \
  -H "Authorization: Bearer ${TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "model":"bge-reranker-v2",
    "query":"ignite gateway architecture",
    "documents":["doc A","doc B"]
  }'

Transcribe

Call the Transcribe API directly when an automation environment needs control over the actual audio payload.
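
This workflow does not give the Transcribe API path or request shape, so the following is only a sketch of a multipart audio upload with curl. The transcribe_file helper, the form field names file and model, and the URL passed in are all assumptions to be checked against the Transcribe API reference; only the Bearer token header is taken from the chat and rerank examples.

```shell
# transcribe_file (hypothetical helper): upload a local audio file as
# multipart/form-data. The URL is passed in because this workflow does not
# document the Transcribe path; the field names "file" and "model" are
# assumptions, not confirmed API parameters.
transcribe_file() {
  url="$1"
  audio="$2"
  model="$3"
  if [ ! -f "$audio" ]; then
    echo "audio file not found: $audio" >&2
    return 1
  fi
  curl -sS "$url" \
    -H "Authorization: Bearer ${TOKEN}" \
    -F "file=@${audio}" \
    -F "model=${model}"
}
```

Call it as transcribe_file URL AUDIO_FILE MODEL, substituting the documented Transcribe URL and a model that supports transcription.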

Current CLI Caveats

The CLI has no first-class rerank command.
models chat and models transcribe currently have ergonomic gaps that make them a poor fit for production use.
Prefer API calls when the exact request structure matters (tools, response_format, advanced options).
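
When the exact request structure matters, one option is to keep the request body in a file and post it unchanged, so nothing is rewritten between what you author and what the API receives. A minimal sketch, assuming the chat completions endpoint and payload shown earlier in this workflow; the post_json helper and the request.json file name are hypothetical.

```shell
# post_json (hypothetical helper): POST a JSON file to an API path without
# modifying the body. --data-binary sends the file contents as-is.
post_json() {
  api_path="$1"
  body_file="$2"
  curl -sS "https://api.dev.dodil.io${api_path}" \
    -H "Authorization: Bearer ${TOKEN}" \
    -H "Content-Type: application/json" \
    --data-binary "@${body_file}"
}

# Write the request body once; add fields such as tools or response_format
# here as the API reference specifies them.
cat > request.json <<'EOF'
{
  "model": "kimi-k2-5",
  "messages": [{"role": "user", "content": "Summarize this text"}]
}
EOF

# Usage (performs a real request, so it is left commented out):
# post_json /v1/chat/completions request.json
```

Keeping the body in a file also makes it easy to review and version the request alongside the automation that sends it.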
