RAG Knowledge Base

Goal: stand up a working RAG (retrieval-augmented generation) corpus on K3. Upload PDFs / docs / text → K3 chunks + embeds + indexes them automatically → query with hybrid search + rerank.

Primitives used: Storage (the bucket + S3 upload) → Pipelines (the auto-generated rule wired by Vector at collection-create time) → Vector (the collection + search).

Shape:

┌──────────┐
│   Your   │
│ documents│
└────┬─────┘
     │ aws s3 cp / dodil k3 object create
┌──────────────────────────────────────────────────┐
│ Storage: kb-platform bucket                      │
└──────────────────────────────────────────────────┘
     │ auto-rule fires (globs from template's acceptedExtensions)
┌──────────────────────────────────────────────────┐
│ Pipelines: text_embedding_index Scriptum         │
│ runs per uploaded object → chunks + embeds       │
└──────────────────────────────────────────────────┘
     │
┌──────────────────────────────────────────────────┐
│ Vector: `docs` collection                        │
│ (Milvus, pipeline-mode, BM25 enabled)            │
└──────────────────────────────────────────────────┘
     │
┌──────────────────────────────────────────────────┐
│ App layer: Search RPC → top-K chunks → LLM       │
└──────────────────────────────────────────────────┘

Prerequisites

1. Create the bucket + configure the vector engine

# Storage primitive: create the bucket
dodil k3 bucket create kb-platform -d "Production RAG knowledge base"

# Vector primitive: configure engine (auto mode = K3 provisions VBase)
dodil k3 vector store create -b kb-platform -m auto

# Wait for engine ACTIVE (typically < 60 s)
until dodil k3 vector store get -b kb-platform -o json | jq -e '.status == "ENGINE_STATUS_ACTIVE"' > /dev/null; do
  echo " engine status: $(dodil k3 vector store get -b kb-platform -o json | jq -r .status), waiting..."
  sleep 5
done
echo "✅ engine ACTIVE"

The Tables engine is also auto-enabled on every bucket: dodil k3 bucket create wires both the Storage entitlements and the Tables engine, so you don’t need to enable it explicitly. Only the Vector engine needs an explicit vector store create.
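The shell loop above polls forever if the engine never becomes active. A minimal Python sketch of the same wait with a timeout follows. The helper names (`wait_for_status`, `engine_status`) are hypothetical, and `engine_status` assumes the CLI's JSON output has a top-level `status` field, as the `jq` filter above implies.

```python
import json
import subprocess
import time
from typing import Callable


def engine_status(bucket: str) -> str:
    """Read the engine status via the CLI (assumes a top-level `status` field)."""
    out = subprocess.run(
        ["dodil", "k3", "vector", "store", "get", "-b", bucket, "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)["status"]


def wait_for_status(
    get_status: Callable[[], str],
    target: str = "ENGINE_STATUS_ACTIVE",
    timeout_s: float = 120.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll get_status() until it returns `target`; give up after timeout_s."""
    deadline = time.monotonic() + timeout_s
    while True:
        if get_status() == target:
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

Usage would look like `wait_for_status(lambda: engine_status("kb-platform"))`, which returns `False` instead of hanging when provisioning stalls.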

2. Pick a template + create the collection

# Browse vector-pillar templates
dodil k3 vector templates -o json | jq '.templates[] | {id, modalities, acceptedExtensions}'

For PDF + docx + HTML + plain text, pick text_embedding_index. Inspect its contract to confirm no required runtime inputs:

dodil k3 template get text_embedding_index -o json | jq '.contract.inputs'

Create a pipeline-mode collection — K3 atomically creates the collection + a Scriptum pipeline + an auto-generated ingest rule:

dodil k3 vector collection add docs -b kb-platform \
  --description "Production RAG corpus" \
  --template text_embedding_index

export COLLECTION_ID=$(dodil k3 vector collection get docs -b kb-platform -o json | jq -r '.collectionId')
export PIPELINE_ID=$(dodil k3 vector collection get docs -b kb-platform -o json | jq -r '.embedPipelineId')

Confirm the auto-rule is enabled:

dodil k3 ingest list -b kb-platform -p "$PIPELINE_ID" -o json \
  | jq '.rules[] | {ruleId, name, includePatterns, enabled}'

Expect something like includePatterns: ["**/*.pdf", "**/*.txt", "**/*.docx", "**/*.html"] and enabled: true.
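To reason about which object keys a set of include patterns would pick up, a small matcher can help. The sketch below is not K3's implementation. It assumes the common glob semantics in which `*` stays within one path segment and `**/` spans zero or more directories. The function names are illustrative.

```python
import re
from typing import Sequence


def glob_to_regex(pattern: str) -> "re.Pattern":
    """Translate a glob to a regex (assumed semantics, not K3's own code):
    `**/` matches zero or more directories, `*` matches within one segment."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append(r"(?:[^/]+/)*")
            i += 3
        elif pattern[i] == "*":
            out.append(r"[^/]*")
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out))


def matches(key: str, patterns: Sequence[str]) -> bool:
    """True if the object key fully matches at least one pattern."""
    return any(glob_to_regex(p).fullmatch(key) for p in patterns)
```

Under these assumptions, `matches("papers/attention.pdf", ["**/*.pdf"])` is true while `matches("papers/attention.pdf", ["*.pdf"])` is false, which is the recursive versus non-recursive distinction that matters when an upload does not trigger ingestion.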

3. Upload documents — three ways

Via CLI (one-off):

curl -sSL https://arxiv.org/pdf/1706.03762.pdf -o attention.pdf
dodil k3 object create ./attention.pdf -b kb-platform -k papers/attention.pdf

Via aws-cli (bulk, S3-style — works because K3 speaks native S3):

# One-time setup
aws configure --profile dodil-k3
# AWS Access Key ID: <your Dodil service-account ID>
# AWS Secret Access Key: <your Dodil service-account secret>
# Default region name: us-east-1
# Default output format: json

# Bulk-upload an existing folder
aws s3 sync ./my-docs/ s3://kb-platform/papers/ \
  --endpoint-url https://k3.dev.dodil.io \
  --profile dodil-k3

Via boto3 / @aws-sdk/client-s3 — same S3 SDKs, same auth. See Storage → S3 Compatibility for setup snippets.
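As a rough illustration of the boto3 route, the sketch below walks a local folder and uploads each file under a key prefix. It assumes the `dodil-k3` profile and the endpoint URL from the aws-cli example. The function names are hypothetical, and unlike `aws s3 sync` it uploads every file rather than only changed ones.

```python
from pathlib import Path
from typing import Iterator, Tuple


def plan_uploads(local_dir: str, prefix: str) -> Iterator[Tuple[Path, str]]:
    """Yield (local_path, object_key) pairs for every file under local_dir."""
    root = Path(local_dir)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            yield path, f"{prefix.rstrip('/')}/{rel}"


def upload_all(local_dir: str, bucket: str = "kb-platform", prefix: str = "papers") -> int:
    """Upload every planned file; returns the number of objects sent."""
    import boto3  # third-party; imported lazily so plan_uploads works without it

    # Assumption: profile name and endpoint match the aws-cli setup above.
    s3 = boto3.Session(profile_name="dodil-k3").client(
        "s3", endpoint_url="https://k3.dev.dodil.io"
    )
    count = 0
    for path, key in plan_uploads(local_dir, prefix):
        s3.upload_file(str(path), bucket, key)
        count += 1
    return count
```

Keeping the key planning separate from the upload makes it easy to print the planned keys first and check them against the rule's include patterns.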

Every upload (CLI or S3 SDK) fires through the auto-generated rule → spawns an ingest job → runs text_embedding_index → writes chunks + embeddings to the docs collection.

4. Watch the ingest pipeline

dodil k3 ingest jobs -b kb-platform -p "$PIPELINE_ID" -o json \
  | jq '.jobs[] | {object: .object.key, status, chunksCreated, embeddingsWritten}'

Status path: PENDING → PROCESSING → COMPLETED. Happy path: chunksCreated == embeddingsWritten. If embeddingsWritten is lower, see Pipelines → Replay & Retry.
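The happy-path check can be automated. The sketch below takes the parsed JSON from the jobs listing and reports finished jobs that did not write one embedding per chunk. The field names come from the `jq` filter above; the function name and the returned shape are illustrative, and counters are coerced with `int()` in case the API serializes them as strings.

```python
from typing import Any, Dict, List


def unhealthy_jobs(payload: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return finished jobs that failed or wrote fewer embeddings than chunks."""
    problems = []
    for job in payload.get("jobs", []):
        status = job.get("status")
        if status in ("PENDING", "PROCESSING"):
            continue  # still in flight, nothing to judge yet
        chunks = int(job.get("chunksCreated", 0))
        embeds = int(job.get("embeddingsWritten", 0))
        if status != "COMPLETED" or embeds < chunks:
            problems.append({
                "key": job.get("object", {}).get("key"),
                "status": status,
                "missing": chunks - embeds,
            })
    return problems
```

An empty list means every finished job satisfied chunksCreated == embeddingsWritten; anything else is a candidate for replay.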

5. Search — three variants of increasing sophistication

A. Quick text search (CLI)

dodil k3 search "what is multi-head attention" -b kb-platform -c docs -o json \
  | jq '.results[] | {score, object: .object.key}'

This uses the default freshness (eventual) and the default mode (VECTOR, dense only), which is good enough for a smoke test.

B. Hybrid + rerank (via API — CLI gap)

For production RAG, you want hybrid (dense + BM25) + Jina rerank. Drop to the API:

curl -sS -X POST "https://k3.dev.dodil.io/kb-platform/vector/search" \
  -H "Authorization: Bearer $DODIL_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "bucket": "kb-platform",
    "collectionName": "docs",
    "text": "what is multi-head attention",
    "topK": 5,
    "searchMode": "SEARCH_MODE_AUTO",
    "rerank": true,
    "includeContent": true
  }' | jq '{
    searchModeUsed,
    tookMs,
    results: [.results[] | {
      score,
      object: .object.key,
      content: (.content | .[0:200] + "...")
    }]
  }'

SEARCH_MODE_AUTO picks hybrid because text_embedding_index enables BM25 by default. rerank: true adds Jina cross-encoder scoring over the top-K. includeContent: true returns the chunk text — what your LLM consumes downstream. See Vector → Hybrid + Rerank for the full benchmark + tier breakdown.
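When calling the search endpoint from application code rather than the shell, the same projection the `jq` filter performs can be done in Python. This is a sketch over an already-parsed response; the field names are taken from the `jq` expression above, and `summarize` is a hypothetical helper name.

```python
from typing import Any, Dict


def summarize(response: Dict[str, Any], preview_chars: int = 200) -> Dict[str, Any]:
    """Project a search response down to mode, latency, and short previews."""
    return {
        "searchModeUsed": response.get("searchModeUsed"),
        "tookMs": response.get("tookMs"),
        "results": [
            {
                "score": hit.get("score"),
                "object": hit.get("object", {}).get("key"),
                # Same 200-character preview the jq filter produces
                "content": (hit.get("content") or "")[:preview_chars] + "...",
            }
            for hit in response.get("results", [])
        ],
    }
```

Logging `searchModeUsed` alongside each query is a cheap way to confirm that SEARCH_MODE_AUTO actually resolved to hybrid for the collection.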

C. Filtered search: only objects under papers/

curl -sS -X POST "https://k3.dev.dodil.io/kb-platform/vector/search" \
  -H "Authorization: Bearer $DODIL_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "bucket": "kb-platform",
    "collectionName": "docs",
    "text": "attention mechanism",
    "topK": 5,
    "searchMode": "SEARCH_MODE_AUTO",
    "rerank": true,
    "includeContent": true,
    "preFilter": {
      "op": "LOGICAL_OP_AND",
      "filters": [
        { "field": "source_key", "op": "FILTER_OP_CONTAINS", "value": "papers/" }
      ]
    }
  }'

For the full pre-filter operator table, see Vector → Search → Pre-filter.
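Hand-writing nested filter JSON gets error-prone once more than one condition is involved. The sketch below builds the same request body from small helpers. Only the operators shown in the example above (LOGICAL_OP_AND, FILTER_OP_CONTAINS) are used; the helper names are hypothetical and the defaults mirror the curl payload.

```python
from typing import Any, Dict, Optional


def contains(field: str, value: str) -> Dict[str, Any]:
    """A single CONTAINS condition, shaped like the curl example."""
    return {"field": field, "op": "FILTER_OP_CONTAINS", "value": value}


def all_of(*filters: Dict[str, Any]) -> Dict[str, Any]:
    """Combine conditions with a logical AND."""
    return {"op": "LOGICAL_OP_AND", "filters": list(filters)}


def search_body(
    text: str,
    top_k: int = 5,
    pre_filter: Optional[Dict[str, Any]] = None,
    bucket: str = "kb-platform",
    collection: str = "docs",
) -> Dict[str, Any]:
    """Build the search payload; preFilter is included only when given."""
    body: Dict[str, Any] = {
        "bucket": bucket,
        "collectionName": collection,
        "text": text,
        "topK": top_k,
        "searchMode": "SEARCH_MODE_AUTO",
        "rerank": True,
        "includeContent": True,
    }
    if pre_filter is not None:
        body["preFilter"] = pre_filter
    return body
```

For example, `search_body("attention mechanism", pre_filter=all_of(contains("source_key", "papers/")))` reproduces the payload of the filtered search above.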

6. Wire into your application

A typical RAG loop in Python:

import os

import openai
import requests

K3 = "https://k3.dev.dodil.io"
HEADERS = {
    "Authorization": f"Bearer {os.environ['DODIL_TOKEN']}",
    "Content-Type": "application/json",
}


def rag_query(question: str) -> str:
    # 1. Retrieve top-5 chunks from K3 with hybrid + rerank
    resp = requests.post(
        f"{K3}/kb-platform/vector/search",
        headers=HEADERS,
        json={
            "bucket": "kb-platform",
            "collectionName": "docs",
            "text": question,
            "topK": 5,
            "searchMode": "SEARCH_MODE_AUTO",
            "rerank": True,
            "includeContent": True,
        },
        timeout=30,
    )
    resp.raise_for_status()  # surface HTTP errors before reading the body

    # 2. Build the context
    chunks = [hit["content"] for hit in resp.json()["results"]]
    context = "\n\n---\n\n".join(chunks)

    # 3. Hand to your LLM
    completion = openai.OpenAI().chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Answer the question using the provided context. Cite sources by file name."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return completion.choices[0].message.content


print(rag_query("Explain self-attention"))

Node, Go, and Rust equivalents are drop-in: the K3 HTTP API speaks pbjson, so any HTTP client works.
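The system prompt in the loop above asks the model to cite sources by file name, but the context it receives carries only chunk text. One possible refinement, sketched below, labels each chunk with its object key and caps the total context size. It assumes each result exposes `object.key` and `content` as in the search examples; `build_context` and the character budget are illustrative choices.

```python
from typing import Any, Dict, List


def build_context(results: List[Dict[str, Any]], max_chars: int = 6000) -> str:
    """Join chunks with a source label each, stopping at a character budget."""
    blocks: List[str] = []
    used = 0
    for hit in results:
        key = hit.get("object", {}).get("key", "unknown")
        block = f"[source: {key}]\n{hit.get('content', '')}"
        # Always keep the top hit; drop later ones once the budget is spent.
        if blocks and used + len(block) > max_chars:
            break
        blocks.append(block)
        used += len(block)
    return "\n\n---\n\n".join(blocks)
```

Because results arrive ranked, cutting from the tail drops the least relevant chunks first.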

7. Operational maintenance

Add new documents

Same upload commands — every new object hits the auto-generated rule → ingest job → vectors land in docs. No re-configuration needed.

Backfill after rule changes

Edit the rule to broaden coverage (e.g. add .md to includePatterns), then retroactively re-ingest objects that now match:

RULE_ID=$(dodil k3 ingest list -b kb-platform -p "$PIPELINE_ID" -o json | jq -r '.rules[0].ruleId')

# Add .md to the include patterns
dodil k3 ingest update "$RULE_ID" -b kb-platform \
  --include "**/*.pdf" --include "**/*.docx" --include "**/*.txt" --include "**/*.md"

# Re-discover the source (internal-S3) and dispatch ingestion for matched objects
SOURCE_ID=$(curl -sS "https://k3.dev.dodil.io/kb-platform/sources" \
  -H "Authorization: Bearer $DODIL_TOKEN" \
  | jq -r '.sources[] | select(.name == "internal") | .sourceId')

dodil k3 ingest trigger-discovery -b kb-platform -s "$SOURCE_ID" --full-sync

For replay of failed jobs specifically, see Pipelines → Replay & Retry.

Pause ingestion temporarily

# Disable the rule: uploads still happen, just no ingestion
dodil k3 ingest update "$RULE_ID" -b kb-platform --enabled=false

# Re-enable when ready
dodil k3 ingest update "$RULE_ID" -b kb-platform --enabled=true

Inspect what’s indexed

# Collection metadata (status, dimensions, embed model, sparse mode)
dodil k3 vector collection get docs -b kb-platform -o json \
  | jq '{name, status, dimensions, embedModel, sparseMode}'

# Per-object-key chunk status, via Storage's ObjectInfo
dodil k3 object show papers/attention.pdf -b kb-platform -o json \
  | jq '.pipelineStatuses[]'   # one entry per rule that ran on this object

Common gotchas

| Symptom | Cause | Fix |
| --- | --- | --- |
| vector collection add fails with “engine not active” | Engine still provisioning | Wait for ENGINE_STATUS_ACTIVE (poll vector store get) |
| Upload succeeds but no ingest job spawns | Object path doesn’t match auto-rule globs | List the rule’s includePatterns; remember **/*.pdf is recursive, *.pdf is not |
| Jobs COMPLETED but search returns nothing | First-ingest Milvus index build still in progress | Wait 10–30 s after first ingest, then re-search |
| Search returns stale results after re-upload | Vector index updates asynchronously after re-ingest | Delete and re-upload under a new key. Vector search has no freshness selector like Tables, so rely on re-ingest semantics |
| Different doc types yield very different chunk counts | text_embedding_index chunks by token count; long docs → many chunks | Adjust chunk_size / chunk_overlap per pipeline via dodil k3 pipeline update |
| Latency spikes after corpus grows past ~100K chunks | HNSW ef default is top_k * 2, which is too low for large corpora | Use pre-embedded vector queries with vector.searchParams.ef = "256"; see Vector → Search → Worked example: tuned HNSW |

Cleanup

# Pause ingestion first
dodil k3 ingest update "$RULE_ID" -b kb-platform --enabled=false

# Delete in this order (no cascade)
dodil k3 ingest delete "$RULE_ID" -b kb-platform
dodil k3 pipeline delete "$PIPELINE_ID" -b kb-platform
dodil k3 vector collection delete "$COLLECTION_ID" -b kb-platform
dodil k3 vector store delete -b kb-platform

# (Optional) Drop the bucket + objects
dodil k3 bucket delete kb-platform
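Because deletion does not cascade, the order matters, and a script that stops at the first failure avoids leaving things half-deleted without notice. The sketch below encodes the sequence above as data and defaults to a dry run. The function names are hypothetical; the commands are the ones listed in this section.

```python
import subprocess
from typing import List


def cleanup_commands(
    bucket: str,
    rule_id: str,
    pipeline_id: str,
    collection_id: str,
    drop_bucket: bool = False,
) -> List[List[str]]:
    """Return the teardown commands in dependency order (rule first, store last)."""
    cmds = [
        ["dodil", "k3", "ingest", "update", rule_id, "-b", bucket, "--enabled=false"],
        ["dodil", "k3", "ingest", "delete", rule_id, "-b", bucket],
        ["dodil", "k3", "pipeline", "delete", pipeline_id, "-b", bucket],
        ["dodil", "k3", "vector", "collection", "delete", collection_id, "-b", bucket],
        ["dodil", "k3", "vector", "store", "delete", "-b", bucket],
    ]
    if drop_bucket:
        cmds.append(["dodil", "k3", "bucket", "delete", bucket])
    return cmds


def run_cleanup(cmds: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for argv in cmds:
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)  # raises on the first failure
```

Running with `dry_run=True` first prints the exact sequence so it can be reviewed before anything is deleted.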

See also