Infer (generic) — API Reference

Package: dodil.ignite.v1 · Service: ModelService

Infer is the generic inference surface for models that don’t map onto a dedicated OpenAI/Cohere request shape — classification, object detection, OCR, named-entity recognition, and similar tasks. Instead of a fixed schema, you send an opaque payload and the model returns an opaque result. The HTTP path is the top-level /v1/infer (not under /v1/ignite/).

To learn the exact payload and result shape for a given model, call GetModel and read its input / output schema; the ModelInfo.input.format field tells you which RPC a model expects. Browse the Model Catalog for the human-readable list, and see Using OpenAI & Cohere SDKs for the chat/embeddings/rerank/transcription surfaces.

RPC	HTTP	Streaming
`Infer`	`POST /v1/infer`	unary
`StreamInfer`	`POST /v1/infer/stream`	server-stream

Over gRPC, both methods are available at dodil.ignite.v1.ModelService/<Method> on $IGNITE_GRPC. See Conventions → Using gRPC for grpcurl setup.

`Infer`

Request

payload carries the model’s input. It is a bytes field, so over JSON (both the HTTP gateway and grpcurl) it must be base64-encoded: encode your JSON document, image, or audio as required by the model’s input schema, and set content_type to match. content_type defaults to application/json; use image/png, audio/wav, and so on for binary payloads.

HTTP


# Classify text — payload is a JSON document matching the model's input schema
curl -sS -X POST "https://api.dev.dodil.io/v1/infer" \
  -H "Authorization: Bearer $DODIL_TOKEN" \
  -H "Content-Type: application/json" \
  -d "{
    \"model\": \"text-classifier-v1\",
    \"payload\": \"$(echo -n '{"text":"This product exceeded my expectations!"}' | base64)\",
    \"content_type\": \"application/json\"
  }"
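
The same request body can also be assembled in code. The following is a minimal Python sketch that assumes only the three fields shown above (model, payload, content_type); the helper name build_infer_body is hypothetical and is not part of any SDK.

```python
import base64
import json


def build_infer_body(model: str, payload: bytes,
                     content_type: str = "application/json") -> str:
    """Return a JSON request body for POST /v1/infer (hypothetical helper)."""
    return json.dumps({
        "model": model,
        # bytes fields travel as base64 text over JSON
        "payload": base64.b64encode(payload).decode("ascii"),
        "content_type": content_type,
    })


# The classifier input from the curl example, serialized to bytes first.
document = json.dumps(
    {"text": "This product exceeded my expectations!"}
).encode("utf-8")

body = build_infer_body("text-classifier-v1", document)
print(body)
```

For binary inputs, pass the file's raw bytes together with the matching content_type, for example build_infer_body("ocr-stream-v1", image_bytes, "image/png").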

Response

result is the model’s raw output, again as bytes (base64 over JSON); decode it according to content_type and the model’s output schema.

HTTP


{
  "model": "text-classifier-v1",
  "result": "eyJsYWJlbCI6InBvc2l0aXZlIiwic2NvcmUiOjAuOTk0fQ==",
  "content_type": "application/json",
  "usage": { "execution_ms": 38, "resource_tier": "small" }
}

The result above decodes to {"label":"positive","score":0.994}.
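
As an illustration of that decoding step, here is a short Python sketch that works on the sample response above. The function name decode_result is hypothetical, and returning raw bytes for any non-JSON content_type is an assumption about how a client might want to handle binary results.

```python
import base64
import json

# Sample response copied from the example above.
response = {
    "model": "text-classifier-v1",
    "result": "eyJsYWJlbCI6InBvc2l0aXZlIiwic2NvcmUiOjAuOTk0fQ==",
    "content_type": "application/json",
    "usage": {"execution_ms": 38, "resource_tier": "small"},
}


def decode_result(response: dict):
    """Base64-decode result, then parse it if the content type is JSON."""
    raw = base64.b64decode(response["result"])
    if response["content_type"] == "application/json":
        return json.loads(raw)
    # Assumption: anything else is handed back as raw bytes.
    return raw


decoded = decode_result(response)
print(decoded)  # {'label': 'positive', 'score': 0.994}
```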

`StreamInfer`

Server-streaming variant for models that emit incremental output. Emits a sequence of InferChunk messages; append each data fragment to the output (over JSON, base64-decode each fragment before appending) until a chunk reports done: true. The dedicated HTTP path is /v1/infer/stream.

Request

HTTP


curl -sS -N -X POST "https://api.dev.dodil.io/v1/infer/stream" \
  -H "Authorization: Bearer $DODIL_TOKEN" \
  -H "Content-Type: application/json" \
  -d "{
    \"model\": \"ocr-stream-v1\",
    \"payload\": \"$(base64 < scan.png | tr -d '\n')\",
    \"content_type\": \"image/png\"
  }"

Response

A stream of InferChunk messages. Over JSON, each data value is base64-encoded on its own, so decode every chunk before appending it to the accumulated output; the chunk with done: true carries the final usage.


{ "data": "UGFnZSAx", "done": false }
{ "data": "IGNvbnRlbnQ=", "done": false }
{ "data": "", "done": true, "usage": { "execution_ms": 512, "resource_tier": "medium" } }
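
The accumulation logic can be sketched in Python as follows. This assumes each InferChunk arrives as one JSON object per line, as in the sample above; the actual framing used by the HTTP gateway is not specified here, and collect_stream is a hypothetical helper name.

```python
import base64
import json

# Sample chunks copied from the example above, one JSON object per line.
lines = [
    '{ "data": "UGFnZSAx", "done": false }',
    '{ "data": "IGNvbnRlbnQ=", "done": false }',
    '{ "data": "", "done": true, "usage": '
    '{ "execution_ms": 512, "resource_tier": "medium" } }',
]


def collect_stream(lines):
    """Decode each chunk's data and append it until done is true."""
    buffer = bytearray()
    usage = None
    for line in lines:
        if not line.strip():
            continue  # skip blank keep-alive lines, if any
        chunk = json.loads(line)
        # Decode per chunk: joining the base64 strings first would break
        # on the padding that can appear in the middle of the stream.
        buffer += base64.b64decode(chunk.get("data", ""))
        if chunk.get("done"):
            usage = chunk.get("usage")
            break
    return bytes(buffer), usage


output, usage = collect_stream(lines)
print(output.decode("utf-8"))  # Page 1 content
print(usage)
```

With a real HTTP client, the same function could be fed the response's line iterator instead of the in-memory list.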
