Core Concepts — Vector

Every type signature below is verbatim from proto/proto-k3/k3_vector.proto. Package: dodil.k3.vector.v1. Wire encodings: gRPC follows proto types directly; HTTP uses pbjson (camelCase, int64 as JSON strings, enums as wire-name strings).

Five types worth knowing: Engine, Collection, VectorRecord (for direct writes), SearchRequest / SearchResult (the heart of the service), and Template (the Scriptum catalog subset). Six enums govern shapes: SearchMode, SparseMode, EmbeddingType, DistanceMetric, FilterOp, LogicalOp.


   bucket ──┬── Engine (1:1, VBase connection — auto / external / pick)
            │
            └── Collections (kind='vector' store_entities)
                  │
                  ├── pipeline-mode  → embedding_source = PIPELINE  (Scriptum template owns schema)
                  └── manual-mode    → embedding_source = EXTERNAL  (you push VectorRecords)
                                                  │
                                                  ▼
                                            Search RPC
                                  (text | vector | s3_key,
                                   single | multi-collection,
                                   vector | hybrid | auto,
                                   optional Jina rerank)

`Engine`

Per-bucket VBase / Milvus connection. Created via ConfigureEngine in one of three modes.

HTTP


{
  "engineId": "eng_a1b2…",
  "bucket": "kb-prod",
  "mode": "auto",
  "status": "ENGINE_STATUS_ACTIVE",
  "vbaseEndpoint": "vbase-us-east-1.dodil.io",
  "vbasePort": 19530,
  "vbaseDbName": "kb-prod-vec",
  "serviceId": "vb_svc_a1b2…",
  "serviceAccountId": "sa_a1b2…",
  "collections": [],
  "createdAt": "1716840000000",
  "updatedAt": "1716843600000"
}
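
The pbjson rules stated at the top of this page (camelCase names, int64 as JSON strings) mean a client has to convert values such as createdAt before using them. The following is a minimal sketch of that decoding, not part of any K3 SDK: the helper names are hypothetical, and reading the timestamps as Unix epoch milliseconds is an assumption based on the 13-digit values in the example.

```python
import re
from datetime import datetime, timezone


def to_camel(snake: str) -> str:
    """Map a proto field name to its camelCase JSON name."""
    return re.sub(r"_([a-z0-9])", lambda m: m.group(1).upper(), snake)


def parse_int64(value: str) -> int:
    """pbjson carries int64 as a JSON string; turn it back into an int."""
    return int(value)


def millis_to_utc(value: str) -> datetime:
    """ASSUMPTION: the value is Unix epoch milliseconds."""
    return datetime.fromtimestamp(parse_int64(value) / 1000, tz=timezone.utc)


# Subset of the Engine payload shown above.
engine = {"vbasePort": 19530, "createdAt": "1716840000000"}
created = millis_to_utc(engine["createdAt"])
```

Note that vbasePort is an int32 and therefore stays a plain JSON number, while createdAt and updatedAt arrive as strings.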

Three modes:

Mode	What it does	Required inputs
`auto`	K3 provisions a fresh VBase database on a managed cluster + an IAM service account	none
`external`	Point the engine at your own VBase	`vbase_endpoint`, `vbase_port`, `vbase_db_name`
`pick`	Reuse an existing VBase service in your org	`service_id` (from `ListVBaseInstances`)
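
The required inputs in the table can be checked on the client before calling ConfigureEngine. This is a rough sketch under the assumption that the inputs are collected in a plain dict keyed by the proto field names above; missing_engine_inputs is a hypothetical helper, not a K3 API.

```python
# Required inputs per engine mode, copied from the table above.
REQUIRED_INPUTS = {
    "auto": (),
    "external": ("vbase_endpoint", "vbase_port", "vbase_db_name"),
    "pick": ("service_id",),
}


def missing_engine_inputs(mode: str, inputs: dict) -> list[str]:
    """Return the required field names that are absent or empty for this mode."""
    if mode not in REQUIRED_INPUTS:
        raise ValueError(f"unknown engine mode: {mode!r}")
    return [
        name
        for name in REQUIRED_INPUTS[mode]
        if inputs.get(name) in (None, "")
    ]


# Hypothetical endpoint value, for illustration only.
missing = missing_engine_inputs("external", {"vbase_endpoint": "vbase.example.internal"})
```

An empty list means the request carries everything the chosen mode needs; the server remains the authority on validation.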

For raw Milvus features that K3 doesn’t expose, use external mode and talk to VBase directly through VBase’s own API or any Milvus client. See the VBase docs.

`Collection`

The unit of search. Has a schema (dimensions / metric / sparse mode / embedding type), an embedding source (PIPELINE or EXTERNAL), and a status.

HTTP


{
  "collectionId": "col_a1b2…",
  "bucket": "kb-prod",
  "engineId": "eng_a1b2…",
  "name": "docs",
  "description": "PDF / docx / HTML embeddings",
  "status": "COLLECTION_STATUS_ACTIVE",
  "embedPipelineName": "vector-docs-text_embedding_index",
  "embedPipelineId": "pipe_a1b2…",
  "dimensions": 1024,
  "distanceMetric": "DISTANCE_METRIC_COSINE",
  "milvusCollection": "kb_prod_docs_a1b2",
  "chunkSize": 512,
  "chunkOverlap": 50,
  "enableBm25": true,
  "embeddingType": "EMBEDDING_TYPE_FLOAT",
  "modality": "text",
  "createdAt": "1716840000000",
  "updatedAt": "1716843600000",
  "embeddingSource": "EMBEDDING_SOURCE_PIPELINE",
  "sparseMode": "SPARSE_MODE_BM25",
  "templateId": "text_embedding_index",
  "embedModel": "jina-embeddings-v4",
  "templateInputs": {}
}

Two creation modes (equal weight)

	Pipeline-mode	Manual / external
RPC	`AddVectorPipeline`	`AddVectorCollection`
Who owns the schema	Scriptum template’s `ScriptContract`	You (caller)
`embedding_source`	`PIPELINE`	`EXTERNAL`
Caller can override dimensions / metric / sparse mode	No (template is source of truth)	Yes — explicit on the create call
Vectors come from	Scriptum embedding step on each ingest job	`InsertVectors` / `UpsertVectors` from your code
Milvus collection materializes	Lazy — on first ingest	Eager — at create time
Ingest rule auto-created	Yes (globs from `acceptedExtensions`)	No
Use when	Unstructured corpus (PDFs, images, audio) — let K3 handle embedding	You have your own embedding pipeline or third-party model

Sparse modes


enum SparseMode {
  SPARSE_MODE_UNSPECIFIED = 0;                // → derive from enable_bm25 (back-compat)
  SPARSE_MODE_NONE = 1;                       // dense only
  SPARSE_MODE_BM25 = 2;                       // text → sparse via Milvus BM25 function
  SPARSE_MODE_EXTERNAL = 3;                   // caller supplies sparse vectors directly
}

NONE: dense-only collection. A HYBRID search request against it falls back to dense-only retrieval.
BM25: Milvus computes the sparse vector from a text field on each row. On the query side, K3 runs the query text through BM25 and fuses the sparse results with the dense results via reciprocal rank fusion (RRF, k=60).
EXTERNAL: the caller supplies a SparseVector on each VectorRecord (and on the query input). Useful when you have your own learned-sparse model.
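
The RRF (k=60) fusion mentioned for BM25 can be illustrated with the standard reciprocal rank fusion formula, where each document scores the sum of 1 / (k + rank) over the rankings it appears in. This is a generic sketch of the technique, not K3's implementation; 1-based ranks and the alphabetical tie-break are assumptions.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked ID lists with reciprocal rank fusion.

    ASSUMPTION: ranks start at 1 and ties are broken by ID.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


dense_hits = ["doc-1", "doc-2", "doc-3"]   # order from the dense search
sparse_hits = ["doc-3", "doc-1", "doc-4"]  # order from the BM25 search
fused = rrf_fuse([dense_hits, sparse_hits])
```

Documents that appear in both lists (doc-1 and doc-3 here) rise above documents found by only one retriever, which is the point of fusing by rank rather than by raw score.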

Embedding types


enum EmbeddingType {
  EMBEDDING_TYPE_UNSPECIFIED = 0;             // → FLOAT (default)
  EMBEDDING_TYPE_FLOAT       = 1;             // FloatVector
  EMBEDDING_TYPE_BINARY      = 2;             // BinaryVector — dimensions % 8 == 0
  EMBEDDING_TYPE_FLOAT16     = 3;             // Float16Vector
  EMBEDDING_TYPE_BFLOAT16    = 4;             // BFloat16Vector
  EMBEDDING_TYPE_INT8        = 5;             // Int8Vector
}

Immutable for the collection’s lifetime. Pick the type your embedding model produces: many production models output FLOAT (1024–3072 dims), some emit INT8 for compactness, and BINARY suits hash-style codes (e.g. SimHash).
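
Because the embedding type is fixed at creation, it can help to estimate raw vector size up front. The sketch below uses the element widths implied by the type names (32, 16, 16, 8 and 1 bit per dimension) and the dimensions % 8 == 0 rule from the enum comment; bytes_per_vector is a hypothetical helper and the figure excludes index and metadata overhead.

```python
# Bits stored per dimension for each embedding type.
BITS_PER_DIMENSION = {
    "EMBEDDING_TYPE_FLOAT": 32,
    "EMBEDDING_TYPE_FLOAT16": 16,
    "EMBEDDING_TYPE_BFLOAT16": 16,
    "EMBEDDING_TYPE_INT8": 8,
    "EMBEDDING_TYPE_BINARY": 1,
}


def bytes_per_vector(embedding_type: str, dimensions: int) -> int:
    """Raw payload size of one vector, excluding any index overhead."""
    if embedding_type == "EMBEDDING_TYPE_UNSPECIFIED":
        embedding_type = "EMBEDDING_TYPE_FLOAT"  # documented default
    if dimensions <= 0:
        raise ValueError("dimensions must be positive")
    if embedding_type == "EMBEDDING_TYPE_BINARY" and dimensions % 8 != 0:
        raise ValueError("BINARY requires dimensions % 8 == 0")
    return dimensions * BITS_PER_DIMENSION[embedding_type] // 8


float_size = bytes_per_vector("EMBEDDING_TYPE_FLOAT", 1024)
binary_size = bytes_per_vector("EMBEDDING_TYPE_BINARY", 1024)
```

For a 1024-dimension collection this gives 4096 bytes per FLOAT vector against 128 bytes per BINARY vector.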

Distance metrics


enum DistanceMetric {
  DISTANCE_METRIC_UNSPECIFIED = 0;            // → COSINE (default)
  DISTANCE_METRIC_COSINE      = 1;            // Milvus: COSINE
  DISTANCE_METRIC_EUCLIDEAN   = 2;            // Milvus: L2
  DISTANCE_METRIC_DOT_PRODUCT = 3;            // Milvus: IP
  DISTANCE_METRIC_HAMMING     = 4;            // BINARY embedding type only
  DISTANCE_METRIC_JACCARD     = 5;            // BINARY embedding type only
}

K3 accepts canonical lowercase short names (cosine, euclidean, …) on the wire and maps to Milvus’s uppercase names (COSINE, L2, IP) on collection creation.
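
The enum comments above amount to a small lookup plus one compatibility rule. A minimal sketch, assuming the enum wire names are used as dict keys; milvus_metric is a hypothetical helper. The HAMMING and JACCARD targets are the Milvus metric names of the same spelling, and only the "BINARY embedding type only" direction of the rule is checked because that is all the enum states.

```python
MILVUS_METRIC = {
    "DISTANCE_METRIC_UNSPECIFIED": "COSINE",  # documented default
    "DISTANCE_METRIC_COSINE": "COSINE",
    "DISTANCE_METRIC_EUCLIDEAN": "L2",
    "DISTANCE_METRIC_DOT_PRODUCT": "IP",
    "DISTANCE_METRIC_HAMMING": "HAMMING",
    "DISTANCE_METRIC_JACCARD": "JACCARD",
}

# Metrics the enum marks as valid for BINARY embeddings only.
BINARY_ONLY = {"DISTANCE_METRIC_HAMMING", "DISTANCE_METRIC_JACCARD"}


def milvus_metric(metric: str, embedding_type: str) -> str:
    """Resolve the Milvus metric name, rejecting binary-only metrics elsewhere."""
    if metric in BINARY_ONLY and embedding_type != "EMBEDDING_TYPE_BINARY":
        raise ValueError(f"{metric} requires EMBEDDING_TYPE_BINARY")
    return MILVUS_METRIC[metric]


resolved = milvus_metric("DISTANCE_METRIC_EUCLIDEAN", "EMBEDDING_TYPE_FLOAT")
```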

`VectorRecord` — for direct writes

The shape InsertVectors / UpsertVectors accept. EXTERNAL-mode collections only — pipeline-mode collections reject these calls (FAILED_PRECONDITION).

HTTP


{
  "id": "doc-1",
  "denseFloat": { "values": [0.12, -0.04, 0.91, /* ... 1021 more */] },
  "sparse": {
    "indices": [42, 1337, 9001],
    "values":  [0.8, 0.3, 0.1]
  },
  "text": "Original chunk text used for BM25 hybrid search.",
  "metadata": {
    "source": "papers/attention.pdf",
    "page": 4,
    "tags": ["transformer", "attention"]
  }
}

Key facts:

id is caller-supplied — pick something stable. UpsertVectors matches on this PK.
The dense oneof variant must match the collection’s embedding_type. Mismatches return INVALID_ARGUMENT.
text is only consumed when the collection’s sparse_mode is BM25 — feeds the Milvus BM25 function.
sparse is only accepted when sparse_mode = EXTERNAL; rejected otherwise.
metadata is a google.protobuf.Struct — use real JSON types (numbers / bools / arrays), not just strings. The search-side pre-filter operates on these fields.
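
Several of these rules can be checked client-side before a write. The sketch below is a hypothetical pre-flight check on the HTTP (pbjson) shape of a record, not K3's server-side validation. Assumptions: the dense vector length must equal the collection's dimensions, sparse indices and values must pair up one to one, and an empty text on a BM25 collection is worth flagging.

```python
def record_problems(record: dict, sparse_mode: str, dimensions: int) -> list[str]:
    """Return human-readable problems found in one VectorRecord dict."""
    problems = []
    if not record.get("id"):
        problems.append("id is missing")

    dense = record.get("denseFloat")
    if dense is not None and len(dense.get("values", [])) != dimensions:
        # ASSUMPTION: vector length must match the collection's dimensions.
        problems.append("denseFloat length does not match collection dimensions")

    sparse = record.get("sparse")
    if sparse is not None:
        if sparse_mode != "SPARSE_MODE_EXTERNAL":
            problems.append("sparse is only accepted when sparse_mode = EXTERNAL")
        elif len(sparse.get("indices", [])) != len(sparse.get("values", [])):
            problems.append("sparse.indices and sparse.values differ in length")

    if sparse_mode == "SPARSE_MODE_BM25" and not record.get("text"):
        problems.append("text is empty, so BM25 has nothing to index")
    return problems


# Toy 3-dimension record, for illustration only.
record = {
    "id": "doc-1",
    "denseFloat": {"values": [0.12, -0.04, 0.91]},
    "sparse": {"indices": [42, 1337, 9001], "values": [0.8, 0.3, 0.1]},
    "text": "Original chunk text used for BM25 hybrid search.",
}
problems_external = record_problems(record, "SPARSE_MODE_EXTERNAL", dimensions=3)
problems_bm25 = record_problems(record, "SPARSE_MODE_BM25", dimensions=3)
```

The same record passes for an EXTERNAL collection and is flagged for a BM25 one, because only EXTERNAL accepts caller-supplied sparse vectors.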

`SearchRequest` / `SearchResult` — the heart of the service

One RPC with three query shapes and rich tuning. The full anatomy lives at API Reference → Search; here are the type signatures.

HTTP


{
  "bucket": "kb-prod",
  "text": "what is multi-head attention",
  "topK": 10,
  "searchMode": "SEARCH_MODE_AUTO",
  "rerank": true,
  "includeContent": true,
  "preFilter": {
    "op": "LOGICAL_OP_AND",
    "filters": [
      { "field": "source_key", "op": "FILTER_OP_CONTAINS", "value": "papers/" },
      { "field": "page",       "op": "FILTER_OP_GTE",      "value": "2" }
    ]
  }
}

gRPC


message SearchRequest {
  string bucket = 1;
  oneof query {
    string text = 2;                          // server-side embed via collection's embed_model
    VectorInput vector = 3;                   // pre-embedded fast lane (bypasses Scriptum)
    string s3_key = 4;                        // multimodal file query
  }
  optional string content_type = 5;           // hint for file queries
  optional FilterGroup pre_filter = 6;
  repeated string source_ids = 7;
  int32 top_k = 8;                            // default 10
  float min_score = 9;
  optional string collection_name = 10;       // empty = ALL matching collections
  SearchMode search_mode = 11;
  bool rerank = 12;                           // Jina reranker via Ignite
  optional string rerank_text = 13;           // override for binary queries
  bool include_content = 14;
  bool include_highlights = 15;
}
 
message SearchResponse {
  repeated SearchResult results = 1;
  dodil.k3.common.v1.PaginationResponse pagination = 2;
  int64 took_ms = 3;
  string search_mode_used = 4;                // "vector" | "hybrid"
  repeated string warnings = 5;
  repeated CollectionSearchStatus collection_statuses = 6;
}
 
message SearchResult {
  dodil.k3.common.v1.ObjectRef object = 1;
  float score = 2;
  map<string, string> metadata = 3;
  SearchResultSource source = 4;              // VECTOR | FULLTEXT | HYBRID
  optional string chunk_id = 5;
  optional int32 chunk_index = 6;
  optional string content = 7;                // when include_content=true
  optional string highlight = 8;              // when include_highlights=true
  // For batch vector queries: the input-query index this result belongs to
  optional int32 query_index = 9;
}
 
enum SearchMode {
  SEARCH_MODE_UNSPECIFIED = 0;                // → VECTOR
  SEARCH_MODE_VECTOR = 1;                     // dense only
  SEARCH_MODE_HYBRID = 2;                     // dense + BM25, RRF k=60
  SEARCH_MODE_AUTO = 3;                       // HYBRID where collection has sparse, else VECTOR
}
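
The SearchMode comments, combined with the sparse-mode rules earlier on this page, suggest the following per-collection resolution. It is a sketch of one reading of those rules, not the server's code: treating SPARSE_MODE_EXTERNAL as hybrid-capable and deriving BM25 from enable_bm25 when the sparse mode is unspecified are assumptions.

```python
def resolve_search_mode(requested: str, sparse_mode: str, enable_bm25: bool = False) -> str:
    """Return "vector" or "hybrid", the values reported in search_mode_used."""
    if sparse_mode == "SPARSE_MODE_UNSPECIFIED":
        # Back-compat: derive the sparse mode from enable_bm25.
        sparse_mode = "SPARSE_MODE_BM25" if enable_bm25 else "SPARSE_MODE_NONE"
    # ASSUMPTION: EXTERNAL sparse vectors also qualify a collection for hybrid.
    has_sparse = sparse_mode in ("SPARSE_MODE_BM25", "SPARSE_MODE_EXTERNAL")

    if requested in ("SEARCH_MODE_UNSPECIFIED", "SEARCH_MODE_VECTOR"):
        return "vector"
    if requested in ("SEARCH_MODE_HYBRID", "SEARCH_MODE_AUTO"):
        # HYBRID on a dense-only collection falls through to dense.
        return "hybrid" if has_sparse else "vector"
    raise ValueError(f"unknown search mode: {requested!r}")


mode_used = resolve_search_mode("SEARCH_MODE_AUTO", "SPARSE_MODE_BM25")
```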

Three query shapes (oneof):

Shape	What it does
`text`	K3 routes the string through the collection’s `embed_model` (Ignite embedding service), then searches
`vector` (VectorInput)	Pre-embedded fast lane — caller supplies the vector(s); K3 goes straight to Milvus. Supports batch (`vectors[]`) — one Milvus round-trip for N queries; results carry `query_index`.
`s3_key`	Multimodal — K3 fetches the object, embeds it server-side (image / audio / video), then searches
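
For batch vector queries, results for all N inputs come back in one list and are told apart by query_index. A minimal sketch of regrouping them on the client, assuming the HTTP (camelCase) shape and that a missing queryIndex means a single-query request; group_by_query is a hypothetical helper and the sample results are made up.

```python
from collections import defaultdict


def group_by_query(results: list[dict]) -> dict[int, list[dict]]:
    """Bucket search results by the input query they belong to.

    Results keep the order in which the server returned them.
    """
    grouped: dict[int, list[dict]] = defaultdict(list)
    for result in results:
        # ASSUMPTION: an absent queryIndex means there was only one query.
        grouped[result.get("queryIndex", 0)].append(result)
    return dict(grouped)


# Illustrative results for a two-vector batch.
results = [
    {"chunkId": "a", "score": 0.91, "queryIndex": 0},
    {"chunkId": "b", "score": 0.88, "queryIndex": 1},
    {"chunkId": "c", "score": 0.75, "queryIndex": 0},
]
grouped = group_by_query(results)
```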

Multi-collection: leave collection_name empty to search all matching collections in the bucket. K3 groups by (dimensions, embedding_type) — collections sharing dimensions but different embed_model never co-mingle. Per-collection observability comes back as collection_statuses[].

For the full anatomy (every knob, every mode, worked examples, multi-collection compatibility rules, fast-lane Milvus tuning), see API Reference → Search — that’s the centerpiece page.

`FilterGroup` — pre-filter metadata

The metadata pre-filter has the same shape across all vector queries.

HTTP


{
  "op": "LOGICAL_OP_AND",
  "filters": [
    { "field": "source_key", "op": "FILTER_OP_CONTAINS", "value": "papers/" }
  ],
  "groups": [
    {
      "op": "LOGICAL_OP_OR",
      "filters": [
        { "field": "tags", "op": "FILTER_OP_IN", "value": "transformer,attention" },
        { "field": "year", "op": "FILTER_OP_GTE", "value": "2017" }
      ]
    }
  ]
}

Key facts:

Filters apply to the metadata fields on records: what you set in VectorRecord.metadata (manual) or what the Scriptum template emits (pipeline).
FILTER_OP_IN uses comma-separated strings: value: "transformer,attention".
FilterGroup.groups[] lets you nest — ((a AND b) OR (c AND d)) patterns are expressible.
Filters run before vector / sparse retrieval — Milvus pushes them into its scalar-index scan.
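
Building the nested structure by hand is error-prone, so a pair of small builders can help. This is a sketch with hypothetical helper names (flt, group); it follows the examples above in sending every value as a string and joining FILTER_OP_IN lists with commas. It does not escape commas inside individual values.

```python
def flt(field: str, op: str, value) -> dict:
    """Build one filter; lists become the comma-separated form used by IN."""
    if isinstance(value, (list, tuple)):
        value = ",".join(str(v) for v in value)
    return {"field": field, "op": f"FILTER_OP_{op}", "value": str(value)}


def group(op: str, filters=(), groups=()) -> dict:
    """Build a FilterGroup; nested groups are included only when present."""
    body = {"op": f"LOGICAL_OP_{op}", "filters": list(filters)}
    if groups:
        body["groups"] = list(groups)
    return body


# Reproduces the FilterGroup example shown above.
pre_filter = group(
    "AND",
    filters=[flt("source_key", "CONTAINS", "papers/")],
    groups=[
        group(
            "OR",
            filters=[
                flt("tags", "IN", ["transformer", "attention"]),
                flt("year", "GTE", 2017),
            ],
        )
    ],
)
```

The resulting dict can be passed as the preFilter field of a search request body.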

`Template`

The Scriptum embedding-template catalog, filtered to vector-compatible templates (server pins category=embedding).


message Template {
  string id = 1;
  string name = 2;
  string description = 3;
  repeated string tags = 4;
  repeated string tools_required = 5;
  map<string, string> labels = 6;
  string category = 7;                        // always "embedding" for vector
  repeated string modalities = 8;             // ["text", "image", "audio"]
  repeated string accepted_extensions = 9;
  repeated string accepted_content_types = 10;
  dodil.k3.common.v1.ScriptContract contract = 11;   // typed I/O contract — see Pipelines → Concepts → ScriptContract
}

contract is the same ScriptContract every pillar carries; AddVectorPipeline.template_inputs is validated against its inputs.

K3 ships these *_embedding_index templates (each paired with a *_embedding_search template for the query side):

Template	Modality	Use case
`text_embedding_index`	text, pdf, docx, html, audio, video	The canonical RAG ingest template
`code_embedding_index`	code (rust, python, javascript, go, …)	Source-code embedding with AST-aware chunking
`visual_embedding_index`	image, video, audio, pdf	Multimodal — image / video frames / audio spectrograms / PDF page renders
`face_embedding_index`	image	SCRFD detection + face embeddings
`object_embedding_index`	image	Open-vocabulary object detection embeddings

Full descriptions live at Pipelines → Templates → The catalog.

When ingest runs (pipeline-mode collections)

Same trigger model as Pipelines:

Direct upload of a matching object via S3 → Scriptum template runs → vectors land in the collection
Source sync (Preview) — discovered objects → template runs → vectors land
One-shot manual via TriggerIngest → forces a single object through the bound pipeline

The auto-generated ingest rule (created with AddVectorPipeline) is a normal IngestRule — list / inspect / pause / re-bind via the Pipelines APIs / CLI.
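
To get a feel for which uploads the auto-generated rule would pick up, the extension-to-glob idea can be sketched as below. Everything about the format here is an assumption: the real IngestRule glob syntax, whether extensions carry a leading dot, and case handling are not specified on this page, and both helpers are hypothetical.

```python
from fnmatch import fnmatchcase


def globs_from_extensions(extensions: list[str]) -> list[str]:
    """ASSUMPTION: each accepted extension becomes a '*.ext' glob."""
    return [f"*.{ext.lstrip('.').lower()}" for ext in extensions]


def matches_rule(object_key: str, globs: list[str]) -> bool:
    """ASSUMPTION: matching is case-insensitive on the full object key."""
    key = object_key.lower()
    return any(fnmatchcase(key, pattern) for pattern in globs)


globs = globs_from_extensions([".pdf", "docx"])
pdf_matches = matches_rule("papers/attention.pdf", globs)
png_matches = matches_rule("images/cat.png", globs)
```

Inspect the actual rule through the Pipelines APIs or CLI to see the globs K3 generated for a given collection.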

When to use direct writes (EXTERNAL collections)

InsertVectors / UpsertVectors / DeleteVectors accept VectorRecord[]. Use cases:

BYO embedding model — you produce vectors with a model K3 doesn’t host (or a custom-fine-tuned one)
Third-party embedding service — OpenAI, Cohere, Voyage, etc., on the caller side
Pre-computed batches — bulk-loaded from another system
Learned-sparse models — set sparse_mode = EXTERNAL and supply sparse vectors with each record
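
For the bring-your-own-model cases above, the client-side work is mostly reshaping model output into VectorRecord payloads and sending them in batches. A minimal sketch in the HTTP (pbjson) shape, assuming FLOAT embeddings; the helper names and the batch size are hypothetical, and the actual InsertVectors / UpsertVectors call is left out.

```python
from typing import Iterator, Optional


def sparse_from_weights(weights: dict[int, float]) -> dict:
    """Turn {index: weight} from a learned-sparse model into indices / values."""
    indices = sorted(weights)
    return {"indices": indices, "values": [weights[i] for i in indices]}


def build_record(
    record_id: str,
    dense: list[float],
    metadata: Optional[dict] = None,
    weights: Optional[dict[int, float]] = None,
) -> dict:
    """Assemble one VectorRecord dict; sparse is added only when supplied."""
    record = {
        "id": record_id,
        "denseFloat": {"values": dense},
        "metadata": metadata or {},
    }
    if weights:
        record["sparse"] = sparse_from_weights(weights)
    return record


def batches(records: list[dict], size: int) -> Iterator[list[dict]]:
    """Yield consecutive slices of at most `size` records."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


records = [build_record(f"doc-{i}", [0.1 * i, 0.2, 0.3]) for i in range(5)]
chunks = list(batches(records, size=2))  # hypothetical batch size
```

Because UpsertVectors matches on id, keeping IDs stable across runs lets a re-run of the same batch overwrite records instead of duplicating them.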
