K3 Service Overview
Last validated: 2026-05-11
K3 is an augmented S3 platform. It keeps standard object storage semantics, then adds ingestion pipelines and query services so data can be searched and processed as:
- Vector data (semantic and hybrid search)
- Structured table data (SQL and HTAP workflows)
- Source-driven ingestion jobs (discovery, indexing, re-ingestion)
K3 currently runs as two binaries:
- k3-api: HTTP + gRPC control/data plane
- k3-worker: asynchronous execution plane (NATS consumers, schedulers, maintenance)
Core Concepts
Bucket
A bucket is both a storage namespace and a knowledge namespace. Most K3 APIs are bucket scoped.
Source
A source defines where content is synced from (internal S3 or external provider connectors).
Pipeline
A pipeline defines what processing should run (Scriptum template/script + options + optional destination entity).
Rule
A rule binds source matching conditions to a pipeline. Rules trigger discovery and ingestion behavior.
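To make the source, pipeline, and rule relationship concrete, here is a minimal sketch that models a rule as a pairing of a match condition and a pipeline. The field names, the glob-style `key_pattern`, and the `match_rules` helper are hypothetical illustrations; K3's actual rule schema and matching logic may differ.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Rule:
    # All fields are hypothetical; K3's real rule schema may differ.
    source_id: str     # source the rule watches
    key_pattern: str   # assumed glob-style match condition on object keys
    pipeline_id: str   # pipeline to run for matching objects


def match_rules(rules, source_id, object_key):
    """Return the pipeline IDs of every rule that matches the given object."""
    return [
        r.pipeline_id
        for r in rules
        if r.source_id == source_id and fnmatchcase(object_key, r.key_pattern)
    ]


rules = [
    Rule("src-docs", "*.pdf", "pdf-to-vectors"),
    Rule("src-docs", "reports/*.csv", "csv-to-table"),
]
matched = match_rules(rules, "src-docs", "reports/q1.csv")
```

In this sketch, one object can match several rules, in which case each matching pipeline would be scheduled separately.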
Vector Engine and Collections
The vector engine is the per-bucket vector backend configuration. Collections are the searchable vector datasets.
Table Engine and Tables
The table engine is the per-bucket HTAP/tables backend configuration. Tables are structured datasets supporting query and maintenance operations.
Ingest Job
An ingest job is one pipeline execution instance over one object (manual trigger or discovery-triggered).
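The "one execution over one object" granularity can be sketched as a small data model. The `IngestJob` fields, the `Trigger` enum, and the `plan_jobs` helper below are assumptions made for illustration and do not reflect K3's actual job record.

```python
from dataclasses import dataclass
from enum import Enum


class Trigger(Enum):
    # The two trigger kinds named in the text; the enum itself is hypothetical.
    MANUAL = "manual"
    DISCOVERY = "discovery"


@dataclass(frozen=True)
class IngestJob:
    # Hypothetical shape: one pipeline execution over exactly one object.
    bucket: str
    object_key: str
    pipeline_id: str
    trigger: Trigger


def plan_jobs(bucket, object_keys, pipeline_id, trigger):
    """Create one job per object, never one job for a batch of objects."""
    return [IngestJob(bucket, key, pipeline_id, trigger) for key in object_keys]


jobs = plan_jobs("research", ["a.pdf", "b.pdf"], "pdf-to-vectors", Trigger.DISCOVERY)
```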
Service Decomposition
The K3 API is split into six service domains rather than a single monolithic K3 service:
- StorageService
- SourceService
- PipelineService
- IngestService
- VectorService
- TableService
This domain split is reflected in:
- Proto package namespaces (dodil.k3.<domain>.v1)
- HTTP route grouping under bin/api/src/http/api/mod.rs
- gRPC registration in bin/api/src/main.rs
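As a rough illustration of the namespace pattern, the sketch below expands `dodil.k3.<domain>.v1` for each of the six services. The lowercase domain segments are an assumption derived from the service names; the actual package names should be checked against the proto files.

```python
# Assumed domain segments, derived by lowercasing each service name
# and dropping the "Service" suffix. Verify against the real proto files.
DOMAINS = ["storage", "source", "pipeline", "ingest", "vector", "table"]


def proto_package(domain, version="v1"):
    """Fill in the dodil.k3.<domain>.v1 pattern for one domain."""
    return f"dodil.k3.{domain}.{version}"


# Map each service name to its assumed proto package.
packages = {f"{d.capitalize()}Service": proto_package(d) for d in DOMAINS}
```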
Runtime Data Flow
Write and ingest path
- Client uploads data through the S3-compatible proxy or another source path.
- K3 publishes discovery/ingest events to NATS JetStream.
- Worker consumes events and invokes Scriptum-driven processing.
- Outputs are persisted in vector collections, table backends, and/or job records.
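The write and ingest steps above can be sketched as a publish/consume loop. This is a simplified, in-process model: `queue.Queue` stands in for NATS JetStream, `run_pipeline` stands in for Scriptum-driven processing, and the event fields are hypothetical. It shows only the decoupling between the API side and the worker side, not K3's real message format.

```python
import queue

events = queue.Queue()  # stand-in for a NATS JetStream subject
results = []            # stand-in for collections, tables, and job records


def publish_ingest_event(bucket, object_key, pipeline_id):
    """API side: enqueue an event instead of processing inline."""
    events.put({"bucket": bucket, "key": object_key, "pipeline": pipeline_id})


def run_pipeline(event):
    """Placeholder for Scriptum processing; returns a fake job record."""
    return {
        "object": f'{event["bucket"]}/{event["key"]}',
        "pipeline": event["pipeline"],
        "status": "done",
    }


def drain_worker():
    """Worker side: consume queued events and persist their outputs."""
    while not events.empty():
        event = events.get()
        results.append(run_pipeline(event))
        events.task_done()


publish_ingest_event("research", "a.pdf", "pdf-to-vectors")
drain_worker()
```

Because the upload path only publishes an event, a slow or failing pipeline in this model does not block the client's write.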
Query path
- Client calls vector or table APIs.
- API routes the request to the corresponding service handler.
- Service executes backend operations (Milvus, HTAP/Ignite, metadata DB, S3).
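The query path amounts to a dispatch from request domain to service handler. The handler functions, the request shape, and the backend labels in this sketch are hypothetical and exist only to show the routing idea.

```python
def vector_search(request):
    # Hypothetical handler; a real one would call the vector backend.
    return {"service": "VectorService", "backend": "milvus", "query": request["query"]}


def table_query(request):
    # Hypothetical handler; a real one would call the HTAP/tables backend.
    return {"service": "TableService", "backend": "htap", "query": request["query"]}


# Dispatch table: request domain -> handler function.
HANDLERS = {"vector": vector_search, "table": table_query}


def route(request):
    """Pick the handler for the request's domain and run it."""
    handler = HANDLERS.get(request["domain"])
    if handler is None:
        raise ValueError(f'unknown domain: {request["domain"]}')
    return handler(request)


response = route({"domain": "vector", "query": "quarterly revenue"})
```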
What Is Live vs Planned
Live, code-validated surfaces:
- Storage and bucket management
- Source and credential management
- Pipeline and ingest rule/job workflows
- Vector engine/collections/search and external vector writes
- Table engine/tables/query/maintenance/execute flows
Design docs in the K3 repo also describe future or evolving capabilities. Treat those as roadmap unless confirmed in code paths and active routes.
For exact feature coverage and known mismatches between design docs and code, see docs/07-feature-status.md.