Infer (generic) — API Reference

Package: dodil.ignite.v1 · Service: ModelService

Infer is the generic inference surface for models that don’t map onto a dedicated OpenAI/Cohere request shape — classification, object detection, OCR, named-entity recognition, and similar tasks. Instead of a fixed schema, you send an opaque payload and the model returns an opaque result. The HTTP path is the top-level /v1/infer (not under /v1/ignite/).

To learn the exact payload and result shape for a given model, call GetModel and read its input / output schema; the ModelInfo.input.format field tells you which RPC a model expects. Browse the Model Catalog for the human-readable list, and see Using OpenAI & Cohere SDKs for the chat/embeddings/rerank/transcription surfaces.

| RPC | HTTP | Streaming |
| --- | --- | --- |
| Infer | POST /v1/infer | unary |
| StreamInfer | POST /v1/infer/stream | server-stream |

gRPC reaches both methods at dodil.ignite.v1.ModelService/<Method> on $IGNITE_GRPC. See Conventions → Using gRPC for grpcurl setup.

Infer

Request

payload carries the model’s input. It is a bytes field, so over JSON (both the HTTP gateway and grpcurl) it is base64-encoded — base64 your JSON document, image, or audio according to the model’s input schema, and set content_type to match. content_type defaults to application/json; use image/png, audio/wav, etc. for binary payloads.

# Classify text: payload is a JSON document matching the model's input schema
curl -sS -X POST "https://api.dev.dodil.io/v1/infer" \
  -H "Authorization: Bearer $DODIL_TOKEN" \
  -H "Content-Type: application/json" \
  -d "{
    \"model\": \"text-classifier-v1\",
    \"payload\": \"$(echo -n '{"text":"This product exceeded my expectations!"}' | base64)\",
    \"content_type\": \"application/json\"
  }"
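The same request can be assembled programmatically. The following is a minimal Python sketch, assuming only the fields shown in the curl example (model, payload, content_type) and the host used there; the helper name build_infer_request and the placeholder token are illustrative, not part of the API. It builds the request without sending it.

```python
import base64
import json
import urllib.request

BASE_URL = "https://api.dev.dodil.io"  # host taken from the curl example


def build_infer_request(model, payload, content_type="application/json",
                        token="", path="/v1/infer"):
    """Build (but do not send) a POST request for Infer.

    `payload` is raw bytes; it is base64-encoded because the field is
    a bytes field carried over JSON. Hypothetical helper for illustration.
    """
    body = {
        "model": model,
        "payload": base64.b64encode(payload).decode("ascii"),
        "content_type": content_type,
    }
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# JSON input for a text classifier, serialized to bytes before encoding
text_input = json.dumps(
    {"text": "This product exceeded my expectations!"}
).encode("utf-8")
request = build_infer_request(
    "text-classifier-v1", text_input, token="example-token"
)
# To send it: urllib.request.urlopen(request)
```

For a binary payload, pass the file's bytes and a matching content_type such as image/png; passing path="/v1/infer/stream" targets the streaming endpoint instead.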

Response

result is the model’s raw output, again as bytes (base64 over JSON); decode it according to content_type and the model’s output schema.

{
  "model": "text-classifier-v1",
  "result": "eyJsYWJlbCI6InBvc2l0aXZlIiwic2NvcmUiOjAuOTk0fQ==",
  "content_type": "application/json",
  "usage": {
    "execution_ms": 38,
    "resource_tier": "small"
  }
}

The result above decodes to {"label":"positive","score":0.994}.
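Decoding can be sketched in a few lines of Python. This assumes the response has already been parsed into a dict with the fields shown above; the helper name decode_result is illustrative, and treating any non-JSON content_type as raw bytes is a simplification rather than documented behavior.

```python
import base64
import json


def decode_result(response):
    """Decode the base64 `result` field of an Infer response dict.

    Returns parsed JSON when content_type is application/json,
    otherwise the raw bytes (an assumption made for this sketch).
    """
    raw = base64.b64decode(response["result"])
    if response.get("content_type") == "application/json":
        return json.loads(raw)
    return raw


# The sample response from this section
sample = {
    "model": "text-classifier-v1",
    "result": "eyJsYWJlbCI6InBvc2l0aXZlIiwic2NvcmUiOjAuOTk0fQ==",
    "content_type": "application/json",
    "usage": {"execution_ms": 38, "resource_tier": "small"},
}
decoded = decode_result(sample)
print(decoded)  # {'label': 'positive', 'score': 0.994}
```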

StreamInfer

Server-streaming variant for models that emit incremental output. Emits a sequence of InferChunk messages; concatenate each data fragment until a chunk reports done: true. The dedicated HTTP path is /v1/infer/stream.

Request

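# Newlines are stripped from the base64 output because some encoders (e.g. GNU base64) wrap long lines, which would break the JSON string
curl -sS -N -X POST "https://api.dev.dodil.io/v1/infer/stream" \
  -H "Authorization: Bearer $DODIL_TOKEN" \
  -H "Content-Type: application/json" \
  -d "{
    \"model\": \"ocr-stream-v1\",
    \"payload\": \"$(base64 < scan.png | tr -d '\n')\",
    \"content_type\": \"image/png\"
  }"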

Response

A stream of InferChunk messages. Each data field is base64 over JSON: decode each chunk's data and append the resulting bytes in order. The chunk with done: true carries the final usage.

{ "data": "UGFnZSAx", "done": false }
{ "data": "IGNvbnRlbnQ=", "done": false }
{ "data": "", "done": true, "usage": { "execution_ms": 512, "resource_tier": "medium" } }
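Accumulating the stream might look like the Python sketch below. It assumes each InferChunk has already been parsed into a dict; how chunks are framed on the HTTP wire is not specified here, so parsing is left out. The helper name collect_stream is illustrative. Each fragment is decoded on its own before the bytes are joined, which avoids problems when a fragment ends in base64 padding.

```python
import base64


def collect_stream(chunks):
    """Join the decoded `data` of InferChunk dicts until done is true.

    Returns (output_bytes, usage). `usage` is None if the stream ends
    without a done chunk. Hypothetical helper for illustration.
    """
    buffer = bytearray()
    usage = None
    for chunk in chunks:
        # Decode per chunk, then append the raw bytes
        buffer += base64.b64decode(chunk.get("data", ""))
        if chunk.get("done"):
            usage = chunk.get("usage")
            break
    return bytes(buffer), usage


# The three sample chunks from this section
chunks = [
    {"data": "UGFnZSAx", "done": False},
    {"data": "IGNvbnRlbnQ=", "done": False},
    {"data": "", "done": True,
     "usage": {"execution_ms": 512, "resource_tier": "medium"}},
]
output, usage = collect_stream(chunks)
print(output)  # b'Page 1 content'
```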

See also