Semantic Search

Goal: stand up a RAG-style collection on a VBase database and run your first semantic search. You will define a schema with an id, a dense vector, and a couple of metadata fields; build an HNSW index with the COSINE metric; insert and upsert embeddings; then search with a metadata filter and read the results.

This is the “Hello, VBase” recipe. Every step is plain Milvus 2.6; VBase only allocates the database and authenticates you. For the exhaustive parameter reference, see the Milvus documentation.

Before you start

You need an allocated database in RUNNING state and an IAM token — see the Quickstart and Connecting with the Milvus SDK. Then install the SDK:


pip install "pymilvus>=2.6,<2.7"

Connect

Use the standard connection pattern. The endpoint, port, and db_name come from GetServiceAccess (or dodil vbase db use); the token is your IAM bearer token.


from pymilvus import MilvusClient, DataType
 
client = MilvusClient(
    uri="https://<endpoint>:443",   # endpoint + port from GetServiceAccess / `dodil vbase db use`
    token="<IAM access token>",      # your IAM service-account token IS your Milvus token
    db_name="<db_name>",             # the allocated database
)
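If you would rather not hard-code the connection values, one option is to read them from the environment. The sketch below is only an illustration: the variable names VBASE_ENDPOINT, VBASE_PORT, VBASE_TOKEN, and VBASE_DB and the helper vbase_connection_kwargs are hypothetical, not something VBase defines.

```python
import os


def vbase_connection_kwargs(env=os.environ):
    """Build keyword arguments for MilvusClient from environment variables.

    The variable names used here are hypothetical; substitute whatever
    your own tooling exports after calling GetServiceAccess.
    """
    required = ("VBASE_ENDPOINT", "VBASE_TOKEN", "VBASE_DB")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    port = env.get("VBASE_PORT", "443")  # 443 mirrors the example URI above
    return {
        "uri": f"https://{env['VBASE_ENDPOINT']}:{port}",
        "token": env["VBASE_TOKEN"],
        "db_name": env["VBASE_DB"],
    }
```

With those variables set, the client could then be created as MilvusClient(**vbase_connection_kwargs()).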

Define the schema

The collection holds three things: a primary key, a 768-dimensional dense vector, and metadata you will both display and filter on (text, category).


schema = client.create_schema(
    auto_id=False,
    enable_dynamic_field=True,   # extra keys in your rows are stored as dynamic fields
)
schema.add_field("id", DataType.VARCHAR, is_primary=True, max_length=64)
schema.add_field("embedding", DataType.FLOAT_VECTOR, dim=768)
schema.add_field("text", DataType.VARCHAR, max_length=8192)
schema.add_field("category", DataType.VARCHAR, max_length=64)
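Rows that do not match the schema are rejected at insert time, so it can help to check them client-side first. The following is a minimal sketch of such a check; validate_row is a hypothetical helper, and measuring max_length in UTF-8 bytes is an assumption chosen to be conservative.

```python
EXPECTED_DIM = 768  # matches dim=768 on the "embedding" field
MAX_LENGTHS = {"id": 64, "text": 8192, "category": 64}  # VARCHAR max_length values


def validate_row(row: dict) -> list[str]:
    """Return a list of problems found in one row; an empty list means OK."""
    problems = []
    for field, max_len in MAX_LENGTHS.items():
        value = row.get(field)
        if not isinstance(value, str):
            problems.append(f"{field}: expected str, got {type(value).__name__}")
        elif len(value.encode("utf-8")) > max_len:  # assumption: limit counted in bytes
            problems.append(f"{field}: exceeds max_length {max_len}")
    vector = row.get("embedding")
    if not isinstance(vector, (list, tuple)) or len(vector) != EXPECTED_DIM:
        problems.append(f"embedding: expected {EXPECTED_DIM} floats")
    return problems
```

Because enable_dynamic_field=True, extra keys are allowed, so the check deliberately ignores any field it does not know about.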

Build the HNSW / COSINE index

Index the vector field before you load the collection. HNSW is a strong latency/recall default, and COSINE measures angular similarity — the right metric for most normalized text embeddings.


index_params = client.prepare_index_params()
index_params.add_index(
    field_name="embedding",
    index_type="HNSW",
    metric_type="COSINE",
    params={"M": 16, "efConstruction": 200},
)

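Raising M (the number of graph links kept per node) or efConstruction (the candidate list size used while building) improves recall at the cost of build time and, for M, memory. The values above are a reasonable starting point; tune them per the Milvus index docs.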
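To make the COSINE metric concrete, here is a minimal pure-Python sketch of cosine similarity. The function name is illustrative; Milvus computes the score internally, and this is only meant to show what the number represents.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors, in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Same direction, different magnitude: the score ignores vector length.
print(cosine_similarity([1.0, 0.0], [2.0, 0.0]))  # 1.0
# Orthogonal vectors share no direction.
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
```

Because only direction matters, two embeddings of different magnitude that point the same way score as identical.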

Create the collection

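Create it with the schema and index in one call. Passing index_params means VBase builds the index at creation time. Milvus also loads a collection created with index parameters automatically, so the explicit load later in this recipe is a harmless safeguard.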


client.create_collection(
    collection_name="documents",
    schema=schema,
    index_params=index_params,
)

Insert embeddings

Generate your dense vectors with whatever embedding model you use, then insert rows as dicts. Here we use a placeholder embed() to stand in for your model.


def embed(text: str) -> list[float]:
    """Replace with your embedding model; must return a 768-dim vector."""
    raise NotImplementedError("plug in your embedding model here")
 
rows = [
    {
        "id": "doc-1",
        "embedding": embed("VBase is Milvus-SDK-compatible and runs Milvus 2.6."),
        "text": "VBase is Milvus-SDK-compatible and runs Milvus 2.6.",
        "category": "product",
    },
    {
        "id": "doc-2",
        "embedding": embed("Your IAM token is your Milvus token."),
        "text": "Your IAM token is your Milvus token.",
        "category": "auth",
    },
    {
        "id": "doc-3",
        "embedding": embed("Build an HNSW index with the COSINE metric."),
        "text": "Build an HNSW index with the COSINE metric.",
        "category": "product",
    },
]
 
client.insert(collection_name="documents", data=rows)
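To exercise the pipeline before a real model is wired in, a deterministic stand-in can produce vectors of the right shape. The sketch below is a hypothetical helper (fake_embed is not part of any SDK), and its output has no semantic meaning, so search rankings built on it are arbitrary. It only confirms that insert, load, and search run end to end.

```python
import hashlib
import math

DIM = 768  # must match the dim declared on the "embedding" field


def fake_embed(text: str, dim: int = DIM) -> list[float]:
    """Deterministic pseudo-embedding for smoke tests only."""
    values = []
    counter = 0
    while len(values) < dim:
        digest = hashlib.sha256(f"{counter}:{text}".encode("utf-8")).digest()
        # Map each byte (0..255) onto roughly [-1, 1]; never exactly zero.
        values.extend((byte - 127.5) / 127.5 for byte in digest)
        counter += 1
    values = values[:dim]
    norm = math.sqrt(sum(v * v for v in values))
    return [v / norm for v in values]  # unit length
```

The same text always maps to the same vector, which keeps test runs reproducible.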

Upsert to update in place

Because auto_id=False, you control the primary key, so re-inserting the same id with upsert replaces the row — ideal when a document’s content (and therefore its embedding) changes.


client.upsert(
    collection_name="documents",
    data=[{
        "id": "doc-2",
        "embedding": embed("Your IAM token is also your Milvus token — VBase manages RBAC."),
        "text": "Your IAM token is also your Milvus token — VBase manages RBAC.",
        "category": "auth",
    }],
)
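As a mental model, upsert behaves like assignment into a dictionary keyed by the primary key. The sketch below illustrates that idea in memory only; upsert_rows is a hypothetical name, and this is not how Milvus implements the operation.

```python
def upsert_rows(store: dict, rows: list) -> dict:
    """Replace the row if its id already exists, otherwise add it."""
    for row in rows:
        store[row["id"]] = dict(row)  # the whole row is replaced in this model
    return store


store = {"doc-2": {"id": "doc-2", "text": "old text", "category": "auth"}}
upsert_rows(store, [{"id": "doc-2", "text": "new text", "category": "auth"}])
upsert_rows(store, [{"id": "doc-4", "text": "brand new", "category": "product"}])
print(sorted(store))            # ['doc-2', 'doc-4']
print(store["doc-2"]["text"])   # new text
```

This is why the upsert above passes every field, not just the changed text: in this model the new row stands in for the old one entirely.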

Load the collection

A collection must be loaded into memory before you can search it. Load it once after ingest; reload only after a release.


client.load_collection(collection_name="documents")

Search with a metadata filter

Embed the query the same way you embedded the documents, then search. The filter is a standard Milvus boolean expression evaluated against scalar fields — here we restrict the search to the product category. Request the output_fields you want returned alongside the score.


query_vector = embed("How do I index vectors in VBase?")
 
results = client.search(
    collection_name="documents",
    data=[query_vector],
    anns_field="embedding",
    filter='category == "product"',
    limit=3,
    search_params={"metric_type": "COSINE", "params": {"ef": 64}},
    output_fields=["text", "category"],
)
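Conceptually, a filtered search keeps only the rows that satisfy the filter, scores them against the query, and returns the top limit. The sketch below models that with an exact brute-force scan over toy 2-dimensional vectors. It is an illustration only: Milvus uses the approximate HNSW index instead of a scan, and the function names and toy data are made up for this example.

```python
import math


def cosine(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def brute_force_search(rows, query, category, limit):
    """Filter by category, score by cosine, return the best `limit` hits."""
    candidates = [row for row in rows if row["category"] == category]
    scored = [(cosine(query, row["embedding"]), row["id"]) for row in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # higher is closer
    return [{"id": row_id, "distance": score} for score, row_id in scored[:limit]]


toy_rows = [
    {"id": "doc-1", "embedding": [1.0, 0.0], "category": "product"},
    {"id": "doc-2", "embedding": [0.8, 0.6], "category": "auth"},
    {"id": "doc-3", "embedding": [0.6, 0.8], "category": "product"},
]
hits = brute_force_search(toy_rows, query=[0.8, 0.6], category="product", limit=3)
print([hit["id"] for hit in hits])  # ['doc-3', 'doc-1']
```

Note that doc-2 points in exactly the query's direction yet never appears, because the filter removes it before any scoring happens.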

Read the results

search returns one result list per query vector. Each hit carries the primary key, the similarity distance (a COSINE score here — higher is closer), and the requested fields under entity.


for hit in results[0]:
    print(hit["id"], round(hit["distance"], 4), hit["entity"]["text"])


doc-3 0.8124 Build an HNSW index with the COSINE metric.
doc-1 0.7710 VBase is Milvus-SDK-compatible and runs Milvus 2.6.

The auth document is absent because the category == "product" filter excluded it before scoring.
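For downstream use, such as building a RAG prompt, it is often convenient to flatten the nested result into plain records. The sketch below assumes hits are dict-like with id, distance, and entity keys, as in the loop above. flatten_hits and its min_score cutoff are hypothetical, and the mock data simply mirrors the sample output.

```python
def flatten_hits(results, min_score=None):
    """Turn per-query hit lists into one flat list of plain dicts."""
    flat = []
    for query_index, hits in enumerate(results):
        for hit in hits:
            if min_score is not None and hit["distance"] < min_score:
                continue  # drop weak matches (with COSINE, higher is closer)
            flat.append({
                "query": query_index,
                "id": hit["id"],
                "score": hit["distance"],
                **hit["entity"],  # the requested output_fields
            })
    return flat


# Mock shaped like the sample output above; not real search results.
mock_results = [[
    {"id": "doc-3", "distance": 0.8124,
     "entity": {"text": "Build an HNSW index with the COSINE metric.", "category": "product"}},
    {"id": "doc-1", "distance": 0.7710,
     "entity": {"text": "VBase is Milvus-SDK-compatible and runs Milvus 2.6.", "category": "product"}},
]]
records = flatten_hits(mock_results, min_score=0.8)
print([record["id"] for record in records])  # ['doc-3']
```

A score cutoff like this is a design choice rather than a Milvus feature of this call; a sensible threshold depends on the embedding model.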

Notes

Metering. The insert/upsert calls are metered as VectorWrite, and the stored vectors count toward VectorStorage; each search is metered as VectorRead. All usage is scoped to your organization’s quota.
What VBase manages. The Milvus service behind your database, its users, and RBAC are managed for you. You authenticate with your IAM token and operate purely at the collection/index/search level.
ef at search time. Larger ef improves recall at the cost of latency. It is independent of the efConstruction used to build the index.
Reference. For the full set of field types, index parameters, and search options, see the Milvus documentation.