Customer-support assistant

The outcome

An assistant that answers questions from your own help docs, knowledge base, and past tickets — not a generic model guessing. Every answer is grounded in your content (retrieval-augmented), so it’s accurate and traceable, and it stays current: when a doc or ticket changes, the index updates and the next answer reflects it.

Point it at customers for self-serve deflection — fewer tickets reach a human — and at agents as a copilot that surfaces the right passage and a drafted reply in seconds, for faster resolution. Same content, same pipeline, two audiences.

What you build on Dodil

The architecture is the standard RAG loop: documents land in object storage, a pipeline chunks and embeds them on upload, questions are embedded and matched against the vector index, and a chat model writes a grounded answer from the retrieved passages. On Dodil each role maps to one product:

Object storage + auto-embedding pipelines + vector search — K3 : drop docs and tickets in a bucket, and K3 chunks, embeds, and indexes them for you. Want to own the schema and index instead? Drive a managed Milvus database directly with VBase .
Embeddings + chat — Ignite Models : OpenAI/Cohere-compatible inference for embedding both your content and incoming questions, and for generating the final answer.
Serving the endpoint (optional) — Ignite Compute : run the answer endpoint serverless, next to your data, with no servers to manage.

Why it’s faster and cheaper here

Without Dodil you’d assemble and operate: a vector DB vendor, an embeddings/LLM API, ingestion glue that re-chunks and re-embeds on every doc change, object storage, compute to serve it, and auth wired across all of them. Each is a separate account, integration, and bill — and the ingestion glue is code you own forever, since support content changes constantly.

On Dodil that collapses to one platform, one IAM token, pipelines that embed on upload automatically, and serverless compute. The pieces are built to fit, so there’s nothing to integrate between them.

Faster — ship in days, not weeks. Less code to write (no ingestion or embedding glue), and one auth instead of credentials wired across four to six services.
Cheaper — fewer vendors and fewer bills, no glue to maintain, no idle compute to pay for, and far less ops surface to monitor and keep in sync.

These wins are structural — they come from removing vendors, glue, and idle capacity, not from a pricing trick.

Build it

Follow the step-by-step build in RAG knowledge base. It covers both paths end to end: a CLI path where K3 handles ingest, chunking, and embedding for you, and a programmatic path (Python) that composes Ignite Models and VBase yourself for full control over schema, retrieval, and prompt.

Customer-support assistant

The outcome

What you build on Dodil

Why it’s faster and cheaper here

Build it

See also