
Collections — API Reference

Package: dodil.k3.vector.v1 · Service: VectorService

Collections are bucket-scoped Milvus collections managed by K3. There are two creation modes: AddVectorPipeline (template-driven, schema lazy-materialized, embedding_source = PIPELINE) and AddVectorCollection (caller-managed schema, embedding_source = EXTERNAL). See Core Concepts → Collection.

RPC                   HTTP
AddVectorPipeline     POST /:bucket/vector/pipelines
AddVectorCollection   POST /:bucket/vector/collections
ListCollections       GET /:bucket/vector/collections
GetCollection         GET /:bucket/vector/collections/:collection_id
DeleteCollection      DELETE /:bucket/vector/collections/:collection_id
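
The RPC-to-HTTP mapping above can be captured as a small lookup in client code. The following is a minimal sketch, not part of any K3 SDK: `ROUTES` and `build_request` are hypothetical names, and the default base URL is simply the host used in the curl examples on this page.

```python
from urllib.parse import quote

# RPC name -> (HTTP method, path template), copied from the route table.
ROUTES = {
    "AddVectorPipeline": ("POST", "/{bucket}/vector/pipelines"),
    "AddVectorCollection": ("POST", "/{bucket}/vector/collections"),
    "ListCollections": ("GET", "/{bucket}/vector/collections"),
    "GetCollection": ("GET", "/{bucket}/vector/collections/{collection_id}"),
    "DeleteCollection": ("DELETE", "/{bucket}/vector/collections/{collection_id}"),
}


def build_request(rpc, bucket, collection_id=None,
                  base_url="https://k3.dev.dodil.io"):
    """Return (method, url) for an RPC. Hypothetical helper, not a K3 API."""
    method, template = ROUTES[rpc]
    if "{collection_id}" in template and not collection_id:
        raise ValueError(f"{rpc} requires a collection_id")
    # Percent-encode path segments so unusual names cannot alter the path.
    path = template.format(
        bucket=quote(bucket, safe=""),
        collection_id=quote(collection_id or "", safe=""),
    )
    return method, base_url.rstrip("/") + path


method, url = build_request("ListCollections", "kb-prod")
print(method, url)
```

The returned pair can be handed to any HTTP client together with the `Authorization: Bearer` header shown in the curl examples.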

gRPC setup — grpcurl, endpoints, reflection, and field-name casing — is covered once in Conventions → Using gRPC.

AddVectorPipeline — pipeline-mode

Template-driven. K3 spawns the index + search Scriptum scripts; the index template’s vector_store_ensure_schema creates the Milvus collection lazily on first ingest. Schema-shaping facts (dimensions, embed_model, distance, sparse_mode, embedding_type) come from the template’s ScriptContract — callers cannot override them.

Request

Text embedding (no required template inputs):

curl -sS -X POST "https://k3.dev.dodil.io/kb-prod/vector/pipelines" \
  -H "Authorization: Bearer $DODIL_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "bucket": "kb-prod",
    "name": "docs",
    "description": "PDF / docx / HTML embeddings",
    "templateId": "text_embedding_index"
  }'

Object detection embedding (requires labels per the template’s ScriptContract):

curl -sS -X POST "https://k3.dev.dodil.io/kb-prod/vector/pipelines" \
  -H "Authorization: Bearer $DODIL_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "bucket": "kb-prod",
    "name": "products",
    "description": "Product image object detection",
    "templateId": "object_embedding_index",
    "templateInputs": {
      "labels": ["bottle", "bag", "shoe", "watch", "jewelry"]
    }
  }'
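
The two request bodies differ only in templateId and the optional templateInputs. A minimal sketch of building them programmatically follows. `pipeline_payload` is a hypothetical helper, and the field names are taken from the curl examples rather than from the proto definition.

```python
import json


def pipeline_payload(bucket, name, template_id, description="",
                     template_inputs=None):
    """Build an AddVectorPipeline JSON body (hypothetical helper).

    Schema-shaping fields such as dimensions are deliberately absent:
    they come from the template's ScriptContract and cannot be overridden.
    """
    body = {
        "bucket": bucket,
        "name": name,
        "description": description,
        "templateId": template_id,
    }
    if template_inputs:
        # Only sent when the template declares required inputs.
        body["templateInputs"] = template_inputs
    return body


payload = pipeline_payload(
    "kb-prod",
    "products",
    "object_embedding_index",
    description="Product image object detection",
    template_inputs={"labels": ["bottle", "bag", "shoe", "watch", "jewelry"]},
)
print(json.dumps(payload, indent=2))
```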

Response

A Collection row with embedding_source = PIPELINE — see Core Concepts → Collection.

What K3 atomically creates on AddVectorPipeline:

  1. The collection row in K3 (Milvus collection materializes lazily on first ingest)
  2. A Scriptum pipeline bound to the index template
  3. An auto-generated ingest rule whose globs derive from the template’s acceptedExtensions

List / inspect / pause the auto-rule via dodil k3 ingest list -p $PIPELINE_ID.

AddVectorCollection — manual / external mode

Caller-managed schema. K3 creates the Milvus collection up front from the request fields, with no Scriptum binding. The caller drives ingestion through InsertVectors / UpsertVectors / DeleteVectors. The collection is recorded with embedding_source = EXTERNAL.

Request

Dense-only, OpenAI-shaped (1536 dims):

curl -sS -X POST "https://k3.dev.dodil.io/kb-prod/vector/collections" \
  -H "Authorization: Bearer $DODIL_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "bucket": "kb-prod",
    "name": "ada-embeddings",
    "description": "Pre-computed OpenAI ada-002 embeddings",
    "dimensions": 1536,
    "distanceMetric": "DISTANCE_METRIC_COSINE",
    "embeddingType": "EMBEDDING_TYPE_FLOAT",
    "sparseMode": "SPARSE_MODE_NONE",
    "embedModel": "openai/text-embedding-ada-002"
  }'

Hybrid with BM25 (Milvus computes sparse from each row’s text):

curl -sS -X POST "https://k3.dev.dodil.io/kb-prod/vector/collections" \
  -H "Authorization: Bearer $DODIL_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "bucket": "kb-prod",
    "name": "hybrid-text",
    "description": "Dense + BM25 hybrid; caller supplies dense + raw text",
    "dimensions": 1024,
    "distanceMetric": "DISTANCE_METRIC_COSINE",
    "embeddingType": "EMBEDDING_TYPE_FLOAT",
    "sparseMode": "SPARSE_MODE_BM25"
  }'

External sparse — caller supplies both dense + sparse vectors:

curl -sS -X POST "https://k3.dev.dodil.io/kb-prod/vector/collections" \
  -H "Authorization: Bearer $DODIL_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "bucket": "kb-prod",
    "name": "splade-vectors",
    "description": "Dense + caller-supplied SPLADE sparse vectors",
    "dimensions": 768,
    "distanceMetric": "DISTANCE_METRIC_COSINE",
    "embeddingType": "EMBEDDING_TYPE_FLOAT",
    "sparseMode": "SPARSE_MODE_EXTERNAL"
  }'

Compact binary embedding (e.g. hash-style):

curl -sS -X POST "https://k3.dev.dodil.io/kb-prod/vector/collections" \
  -H "Authorization: Bearer $DODIL_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "bucket": "kb-prod",
    "name": "binary-hashes",
    "description": "256-bit perceptual hashes (32 bytes per vector)",
    "dimensions": 256,
    "distanceMetric": "DISTANCE_METRIC_HAMMING",
    "embeddingType": "EMBEDDING_TYPE_BINARY"
  }'

Response

A Collection row with embedding_source = EXTERNAL — see Core Concepts → Collection.

Constraints:

  • EMBEDDING_TYPE_BINARY requires dimensions % 8 == 0 (vector is bit-packed into dimensions / 8 bytes)
  • SPARSE_MODE_BM25 requires embedding_type = FLOAT
  • If embed_model is left empty, text / file queries on this collection return an error; only the pre-embedded vector query path works
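
These constraints can be checked on the client before sending an AddVectorCollection request. The sketch below is a hypothetical pre-check, not K3's own validation; the server remains the authority. It assumes the JSON field names and enum strings shown in the request examples, and it does not guess at server defaults for fields the caller leaves out.

```python
def check_collection_request(body):
    """Return a list of constraint violations found in a request body."""
    problems = []
    embedding_type = body.get("embeddingType", "")
    dimensions = body.get("dimensions", 0)

    # Binary vectors are bit-packed, so the bit count must fill whole bytes.
    if embedding_type == "EMBEDDING_TYPE_BINARY" and dimensions % 8 != 0:
        problems.append("EMBEDDING_TYPE_BINARY requires dimensions % 8 == 0")

    # An unset embeddingType is not flagged: the server default is not
    # documented here, so only an explicit non-FLOAT value counts.
    if (body.get("sparseMode") == "SPARSE_MODE_BM25"
            and embedding_type not in ("", "EMBEDDING_TYPE_FLOAT")):
        problems.append("SPARSE_MODE_BM25 requires EMBEDDING_TYPE_FLOAT")
    return problems


def binary_vector_bytes(dimensions):
    """Bytes per bit-packed binary vector: dimensions / 8."""
    if dimensions % 8 != 0:
        raise ValueError("dimensions must be a multiple of 8")
    return dimensions // 8


def supports_text_queries(body):
    """Text / file queries need an embed model; vector queries do not."""
    return bool(body.get("embedModel"))


print(check_collection_request(
    {"dimensions": 250, "embeddingType": "EMBEDDING_TYPE_BINARY"}))
print(binary_vector_bytes(256))
```

For the 256-dimension binary example, `binary_vector_bytes(256)` gives the 32 bytes per vector mentioned in that request's description.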

ListCollections

Request

curl -sS "https://k3.dev.dodil.io/kb-prod/vector/collections" \
  -H "Authorization: Bearer $DODIL_TOKEN"

Response

{
  "collections": [
    {
      "collectionId": "col_a1b2...",
      "bucket": "kb-prod",
      "engineId": "eng_a1b2...",
      "name": "docs",
      "status": "COLLECTION_STATUS_ACTIVE",
      "dimensions": 1024,
      "distanceMetric": "DISTANCE_METRIC_COSINE",
      "embeddingType": "EMBEDDING_TYPE_FLOAT",
      "sparseMode": "SPARSE_MODE_BM25",
      "embeddingSource": "EMBEDDING_SOURCE_PIPELINE",
      "templateId": "text_embedding_index",
      "embedModel": "jina-embeddings-v4",
      "modality": "text"
    },
    {
      "collectionId": "col_c3d4...",
      "bucket": "kb-prod",
      "engineId": "eng_a1b2...",
      "name": "ada-embeddings",
      "status": "COLLECTION_STATUS_ACTIVE",
      "dimensions": 1536,
      "distanceMetric": "DISTANCE_METRIC_COSINE",
      "embeddingType": "EMBEDDING_TYPE_FLOAT",
      "sparseMode": "SPARSE_MODE_NONE",
      "embeddingSource": "EMBEDDING_SOURCE_EXTERNAL",
      "embedModel": "openai/text-embedding-ada-002",
      "modality": ""
    }
  ]
}
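
A common use of this response is to separate pipeline-mode collections from external ones. The sketch below groups collection names by embeddingSource. `group_by_source` is a hypothetical helper, and the sample input is a trimmed copy of the response above containing only the two fields the helper reads.

```python
import json
from collections import defaultdict


def group_by_source(response_text):
    """Map embeddingSource -> list of collection names."""
    groups = defaultdict(list)
    for col in json.loads(response_text).get("collections", []):
        groups[col.get("embeddingSource", "")].append(col["name"])
    return dict(groups)


# Trimmed sample: only the fields used here are kept.
sample = json.dumps({
    "collections": [
        {"name": "docs", "embeddingSource": "EMBEDDING_SOURCE_PIPELINE"},
        {"name": "ada-embeddings", "embeddingSource": "EMBEDDING_SOURCE_EXTERNAL"},
    ]
})

for source, names in group_by_source(sample).items():
    print(source, names)
```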

GetCollection

Request

curl -sS "https://k3.dev.dodil.io/kb-prod/vector/collections/col_a1b2..." \
  -H "Authorization: Bearer $DODIL_TOKEN"

Response

A Collection row — see Core Concepts → Collection.

DeleteCollection

Removes the K3 collection row and the underlying Milvus collection. For pipeline-mode collections, the delete does not cascade to the bound Scriptum pipeline or the auto-generated ingest rule; clean those up separately via the Pipelines API.
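
Because the delete does not cascade, a client may want to know ahead of time what will be left behind. The following is a minimal sketch, assuming the collection row has the JSON shape returned by ListCollections. `leftovers_after_delete` is a hypothetical helper and makes no Pipelines API calls itself.

```python
def leftovers_after_delete(collection):
    """List resources DeleteCollection will not remove for this row."""
    if collection.get("embeddingSource") == "EMBEDDING_SOURCE_PIPELINE":
        # Pipeline-mode: these need separate cleanup via the Pipelines API.
        return ["bound Scriptum pipeline", "auto-generated ingest rule"]
    # External-mode collections have no Scriptum binding to clean up.
    return []


row = {"name": "docs", "embeddingSource": "EMBEDDING_SOURCE_PIPELINE"}
print(leftovers_after_delete(row))
```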

Request

curl -sS -X DELETE "https://k3.dev.dodil.io/kb-prod/vector/collections/col_a1b2..." \
  -H "Authorization: Bearer $DODIL_TOKEN"

Response

Empty (DeleteCollectionResponse {}).

