Semantic search

VBase is Dodil’s managed vector database. It’s Milvus 2.6 under the hood and Milvus-SDK-compatible: a control plane allocates a database for your organization, and the data plane is Milvus — so you create collections, build indexes, insert vectors, and search with the standard Milvus tooling.

This example builds the same semantic search two ways:

Humans drive it from the terminal with the dodil vbase CLI.
Apps and agents drive it programmatically with the Milvus SDK (here, pymilvus), authenticating with an IAM access token.

Either way, VBase just allocates the database and authenticates you — everything else is plain Milvus.

What you’ll build

An allocated database (the control plane gives you an endpoint and a db_name).
A collection holding an id, a 768-dimensional dense vector, and some metadata.
An HNSW index on the vector field with the COSINE metric.
A handful of vectors inserted into it.
A semantic search that returns the nearest documents.

The CLI path is the fastest way to try it by hand; the Python path is what you ship into an app or agent.

The CLI way

The db commands are the control plane — they allocate the database and set your active connection context. The collection, index, and data commands then operate on Milvus directly through that context.

1. Allocate a database


dodil vbase db create product-search


Database product-search created.
ID: svc-7f3a9c2e

Allocation is asynchronous — the database starts in CREATING and becomes connectable once it reaches RUNNING. Capture the printed ID; it’s the service_id you pass to db use. Watch the status with:


dodil vbase db list


ID            Name             Status    DRN
svc-7f3a9c2e  product-search   RUNNING   drn:dodil:vbase:svc-7f3a9c2e
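Because allocation is asynchronous, a script has to wait for RUNNING before it connects. The sketch below shows one way to poll; `get_status` is a hypothetical callable you supply (for example, a wrapper that parses `dodil vbase db list`), not a documented VBase function, and the timeout and interval defaults are arbitrary.

```python
import time
from typing import Callable


def wait_until_running(
    get_status: Callable[[], str],
    timeout_s: float = 300.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> None:
    """Poll get_status() until it returns "RUNNING" or timeout_s elapses.

    get_status is a hypothetical hook: supply your own function that returns
    the database's current status string (e.g. "CREATING" or "RUNNING").
    """
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status == "RUNNING":
            return
        if clock() >= deadline:
            raise TimeoutError(f"database still {status!r} after {timeout_s}s")
        sleep(interval_s)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised in tests without real waiting.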

2. Select it as your active database

Once it’s RUNNING, resolve and store its connection details so the Milvus-direct commands know where to connect:


dodil vbase db use svc-7f3a9c2e


Switched to database 'svc-7f3a9c2e' at endpoint product-search.vbase.dev.dodil.io:443

Under the hood this calls GetServiceAccess and writes the endpoint, port, and db_name into your CLI config — that’s what lets the next commands run without endpoint flags.

3. Create a collection

A primary key plus one 768-dimensional vector field:


dodil vbase collection create products \
  --dim 768 \
  --id-type varchar --id-max-length 128

4. Build the HNSW / COSINE index

Index the vector field with HNSW (a strong latency/recall default) and the COSINE metric (angular similarity — the right choice for most normalized text embeddings):


dodil vbase index create products vector --type HNSW --metric COSINE
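To make the metric concrete, here is a small pure-Python sketch of cosine similarity. It illustrates the formula only and says nothing about how Milvus computes it internally. Scaling a vector leaves the score unchanged, which is why COSINE suits embeddings whose magnitude carries no meaning.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between a and b: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


v = [1.0, 2.0, 3.0]
scaled = [10.0 * x for x in v]   # same direction, ten times the length
orthogonal = [3.0, 0.0, -1.0]    # dot product with v is exactly 0

print(cosine_similarity(v, scaled))      # 1.0 up to float rounding
print(cosine_similarity(v, orthogonal))  # 0.0
```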

5. Insert a vector

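The CLI inserts one row at a time: a single id and one float vector. The vector must contain exactly as many values as the collection's --dim (768 here); the four values below stand in for a full 768-value vector: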


dodil vbase data insert products \
  --id doc-1 \
  --vector "0.12,0.04,0.91,0.33"
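Typing hundreds of values by hand is impractical, so you will usually generate the --vector argument. The helper below is a sketch that assumes the flag accepts plain comma-separated floats, as in the command above; `to_cli_vector` is a hypothetical name and not part of the CLI.

```python
def to_cli_vector(vector: list[float], dim: int) -> str:
    """Render a vector as the comma-separated string passed to --vector.

    Raises ValueError if the length does not match the collection's dimension.
    """
    if len(vector) != dim:
        raise ValueError(f"expected {dim} values, got {len(vector)}")
    # repr() gives the shortest string that round-trips each float exactly.
    return ",".join(repr(float(x)) for x in vector)


print(to_cli_vector([0.12, 0.04, 0.91, 0.33], dim=4))  # 0.12,0.04,0.91,0.33
```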

6. Load and search

A collection must be loaded into memory before you can search it:


dodil vbase collection load products
 
dodil vbase data search products \
  --vector "0.12,0.04,0.91,0.33" \
  --topk 5


Result ID   Score
doc-1       0.030000
doc-9       0.142000

CLI search is intentionally minimal. dodil vbase data search always queries with metric_type=L2 and returns only the id and score; it does not honor a COSINE index, output fields, or filter expressions. The COSINE index built in step 4 therefore does not match what the CLI search sends: if you plan to search from the CLI, build the index with the L2 metric instead. For COSINE search, metadata filters, output_fields, bulk inserts, or int64 ids, use the Milvus SDK below. See the CLI guide for the full command surface.
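If you stay on the CLI with an L2 index, you can still get cosine-equivalent rankings by normalizing every vector to unit length before inserting and querying. For unit vectors the squared L2 distance equals 2 - 2*cos, so the nearest neighbour under L2 is also the most similar under cosine. The sketch below checks that identity in plain Python; it illustrates the math and is not a VBase feature.

```python
import math


def normalize(v: list[float]) -> list[float]:
    """Scale v to unit length so that only its direction remains."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def l2_squared(a: list[float], b: list[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


a = normalize([0.12, 0.04, 0.91, 0.33])
b = normalize([0.50, 0.10, 0.20, 0.80])

cos = dot(a, b)  # cosine similarity, because both vectors have length 1
# |a - b|^2 = |a|^2 + |b|^2 - 2*dot(a, b) = 2 - 2*cos for unit vectors
print(math.isclose(l2_squared(a, b), 2 - 2 * cos))  # True
```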

The programmatic way (Python)

Apps and agents skip the CLI and talk to Milvus directly. You need two things: an IAM access token and the three connection values from your allocated database.

Install the SDK


pip install "pymilvus>=2.6,<2.7"

1. Get an IAM access token

Applications authenticate with a Service Account using the OAuth 2.0 client-credentials grant. POST your client_id and client_secret to the IAM token endpoint; the access_token in the response is your bearer token — and it is also your Milvus token.


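import os

import requests

TOKEN_URL = "https://id.dev.dodil.io/realms/dodil/protocol/openid-connect/token"

def get_token() -> str:
    """Exchange the service account's credentials for an access token."""
    resp = requests.post(
        TOKEN_URL,
        data={
            "grant_type": "client_credentials",
            "client_id": os.environ["DODIL_SERVICE_ACCOUNT_ID"],
            "client_secret": os.environ["DODIL_SERVICE_ACCOUNT_SECRET"],
        },
        timeout=10,  # fail fast instead of hanging on a stalled connection
    )
    resp.raise_for_status()  # surface 4xx/5xx responses as exceptions
    return resp.json()["access_token"]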

Tokens are short-lived — cache one and refresh it shortly before it expires. The full flow (curl, Python, Node.js, Go) is on the Get an Access Token page.
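One way to do that caching is sketched below. It assumes the token response carries the standard OAuth 2.0 `expires_in` field (in seconds) alongside `access_token`; `TokenCache`, `fetch`, and the 30-second safety margin are illustrative choices, not part of any Dodil SDK.

```python
import time
from typing import Callable, Optional


class TokenCache:
    """Cache an access token and refresh it shortly before it expires.

    fetch must return the token endpoint's JSON body as a dict containing
    "access_token" and "expires_in" (assumed: standard OAuth 2.0 fields).
    """

    def __init__(
        self,
        fetch: Callable[[], dict],
        skew_s: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._skew_s = skew_s        # refresh this many seconds early
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._skew_s:
            body = self._fetch()
            self._token = body["access_token"]
            self._expires_at = now + float(body["expires_in"])
        return self._token
```

To use it with the flow above, `fetch` would be a variant of get_token() that returns the whole `resp.json()` dict rather than only the access_token.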

2. Connect the Milvus SDK

MilvusClient takes three arguments: a uri built from the endpoint and port, the token, and the db_name. The endpoint, port, and db_name come from GetServiceAccess (or dodil vbase db use); the token is the IAM token from step 1.


from pymilvus import MilvusClient, DataType
 
client = MilvusClient(
    uri="https://<endpoint>:443",   # endpoint + port from GetServiceAccess / `dodil vbase db use`
    token=get_token(),               # your IAM token IS your Milvus token
    db_name="<db_name>",             # the allocated database
)
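In a deployed app you will probably read the connection values from configuration rather than hard-coding them. The sketch below assembles the three MilvusClient arguments from environment variables; the names `VBASE_ENDPOINT`, `VBASE_PORT`, and `VBASE_DB_NAME` are hypothetical, so pick whatever fits your deployment.

```python
import os
from typing import Mapping


def milvus_connection_kwargs(token: str, env: Mapping[str, str] = os.environ) -> dict:
    """Build the keyword arguments for MilvusClient from configuration.

    The environment variable names are assumptions made for this example.
    """
    endpoint = env["VBASE_ENDPOINT"]
    port = env.get("VBASE_PORT", "443")   # 443 matches the endpoint shown earlier
    return {
        "uri": f"https://{endpoint}:{port}",
        "token": token,
        "db_name": env["VBASE_DB_NAME"],
    }


# Usage (requires pymilvus and the get_token() helper from step 1):
#   client = MilvusClient(**milvus_connection_kwargs(get_token()))
```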

3. Define the schema and HNSW / COSINE index

Define a schema with a primary key, a 768-dim dense vector, and metadata you’ll display and filter on. Then prepare an HNSW index with the COSINE metric:


schema = client.create_schema(
    auto_id=False,
    enable_dynamic_field=True,   # extra keys in your rows are stored as dynamic fields
)
schema.add_field("id", DataType.VARCHAR, is_primary=True, max_length=64)
schema.add_field("embedding", DataType.FLOAT_VECTOR, dim=768)
schema.add_field("text", DataType.VARCHAR, max_length=8192)
schema.add_field("category", DataType.VARCHAR, max_length=64)
 
index_params = client.prepare_index_params()
index_params.add_index(
    field_name="embedding",
    index_type="HNSW",
    metric_type="COSINE",
    params={"M": 16, "efConstruction": 200},
)

4. Create the collection

Pass the schema and index in one call, so VBase builds the index and the collection is ready to load:


client.create_collection(
    collection_name="documents",
    schema=schema,
    index_params=index_params,
)
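If your setup script may run more than once, you may prefer to check for the collection before creating it. The helper below is a sketch built on pymilvus's `has_collection` and `create_collection`; `ensure_collection` itself is a hypothetical name.

```python
def ensure_collection(client, name: str, schema, index_params) -> bool:
    """Create the collection only if it is missing.

    client is expected to be a pymilvus MilvusClient. Returns True if the
    collection was created, False if it already existed.
    """
    if client.has_collection(collection_name=name):
        return False
    client.create_collection(
        collection_name=name,
        schema=schema,
        index_params=index_params,
    )
    return True
```

Note that this does not verify that an existing collection has the schema you expect; it only avoids issuing a second create call.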

5. Insert embeddings

Generate dense vectors with your embedding model, then insert rows as dicts. Here embed() stands in for your model:


def embed(text: str) -> list[float]:
    """Replace with your embedding model — must return a 768-dim vector."""
    ...
 
rows = [
    {
        "id": "doc-1",
        "embedding": embed("VBase is Milvus-SDK-compatible and runs Milvus 2.6."),
        "text": "VBase is Milvus-SDK-compatible and runs Milvus 2.6.",
        "category": "product",
    },
    {
        "id": "doc-2",
        "embedding": embed("Your IAM token is your Milvus token."),
        "text": "Your IAM token is your Milvus token.",
        "category": "auth",
    },
    {
        "id": "doc-3",
        "embedding": embed("Build an HNSW index with the COSINE metric."),
        "text": "Build an HNSW index with the COSINE metric.",
        "category": "product",
    },
]
 
client.insert(collection_name="documents", data=rows)
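For more than a handful of rows, it can help to validate vector lengths up front and insert in chunks. The sketch below shows both steps as plain helpers; `check_dims`, `batches`, and the batch size of 500 in the usage comment are illustrative choices, not VBase or Milvus requirements.

```python
from typing import Iterator


def check_dims(rows: list[dict], field: str = "embedding", dim: int = 768) -> None:
    """Raise ValueError if any row's vector length differs from dim."""
    for row in rows:
        if len(row[field]) != dim:
            raise ValueError(
                f"row {row.get('id')!r}: expected {dim} values, got {len(row[field])}"
            )


def batches(rows: list[dict], size: int) -> Iterator[list[dict]]:
    """Yield consecutive slices of rows, each with at most size items."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


# Usage with the client from step 2:
#   check_dims(rows)
#   for chunk in batches(rows, 500):
#       client.insert(collection_name="documents", data=chunk)
```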

6. Load and search

Load the collection into memory, then search. Embed the query the same way you embedded the documents, request the output_fields you want back, and (optionally) restrict the search with a metadata filter:


client.load_collection(collection_name="documents")
 
query_vector = embed("How do I index vectors in VBase?")
 
results = client.search(
    collection_name="documents",
    data=[query_vector],
    anns_field="embedding",
    filter='category == "product"',
    limit=3,
    search_params={"metric_type": "COSINE", "params": {"ef": 64}},
    output_fields=["text", "category"],
)
 
for hit in results[0]:
    # Format to four decimals so trailing zeros are kept (0.7710, not 0.771).
    print(hit["id"], f'{hit["distance"]:.4f}', hit["entity"]["text"])


doc-3 0.8124 Build an HNSW index with the COSINE metric.
doc-1 0.7710 VBase is Milvus-SDK-compatible and runs Milvus 2.6.

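The auth document is absent because the category == "product" filter excluded it before scoring. With the COSINE metric, the reported distance is a similarity score, so higher means closer and hits come back in descending order. Larger ef improves recall at the cost of latency, independent of the efConstruction used to build the index.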
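When the search limit varies, ef needs to keep pace with it. Milvus's HNSW documentation has described the search limit as the lower bound for ef, so the sketch below never lets ef drop under the limit. Treat that bound as an assumption to confirm against the Milvus version you run; `hnsw_search_params` is a hypothetical helper name.

```python
def hnsw_search_params(limit: int, ef: int = 64, metric_type: str = "COSINE") -> dict:
    """Build search_params for an HNSW index, keeping ef at or above limit.

    Assumption: ef should be at least the number of results requested.
    """
    if limit < 1:
        raise ValueError("limit must be at least 1")
    return {"metric_type": metric_type, "params": {"ef": max(ef, limit)}}


print(hnsw_search_params(limit=3))    # {'metric_type': 'COSINE', 'params': {'ef': 64}}
print(hnsw_search_params(limit=100))  # {'metric_type': 'COSINE', 'params': {'ef': 100}}
```

The returned dict can be passed as the search_params argument of client.search in place of the literal used above.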
