Semantic Search
Goal: stand up a RAG-style collection on a VBase database and run your first semantic search. You will define a schema with an id, a dense vector, and a couple of metadata fields; build an HNSW index with the COSINE metric; insert and upsert embeddings; then search with a metadata filter and read the results.
This is the “Hello, VBase” recipe. Every step is plain Milvus 2.6; VBase just allocates the database and authenticates you. For the exhaustive parameter reference, see the Milvus documentation.
Before you start
You need an allocated database in RUNNING state and an IAM token — see the Quickstart and Connecting with the Milvus SDK. Then install the SDK:
pip install "pymilvus>=2.6,<2.7"
Connect
Use the standard connection pattern. The endpoint, port, and db_name come from GetServiceAccess (or dodil vbase db use); the token is your IAM bearer token.
from pymilvus import MilvusClient, DataType
client = MilvusClient(
    uri="https://<endpoint>:443",  # endpoint + port from GetServiceAccess / `dodil vbase db use`
    token="<IAM access token>",    # your IAM service-account token IS your Milvus token
    db_name="<db_name>",           # the allocated database
)
Define the schema
The collection holds three things: a primary key, a 768-dimensional dense vector, and metadata you will both display and filter on (text, category).
schema = client.create_schema(
    auto_id=False,
    enable_dynamic_field=True,  # extra keys in your rows are stored as dynamic fields
)
schema.add_field("id", DataType.VARCHAR, is_primary=True, max_length=64)
schema.add_field("embedding", DataType.FLOAT_VECTOR, dim=768)
schema.add_field("text", DataType.VARCHAR, max_length=8192)
schema.add_field("category", DataType.VARCHAR, max_length=64)
Build the HNSW / COSINE index
Index the vector field before you load the collection. HNSW is a strong latency/recall default, and COSINE measures angular similarity — the right metric for most normalized text embeddings.
index_params = client.prepare_index_params()
index_params.add_index(
    field_name="embedding",
    index_type="HNSW",
    metric_type="COSINE",
    params={"M": 16, "efConstruction": 200},
)
M and efConstruction trade build time and memory for recall. The defaults above are a good starting point; tune them per the Milvus index docs.
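To make the COSINE metric concrete, here is a minimal pure-Python sketch that runs without Milvus or pymilvus. The helper names cosine_similarity and normalize are illustrative and not part of any SDK. It shows two properties: cosine similarity ignores vector length, and for unit-length vectors it reduces to a plain dot product, which is why the metric suits normalized embeddings.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between a and b (both must be non-zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def normalize(v):
    """Scale a non-zero vector to unit length."""
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v]

a = [3.0, 4.0]
b = [6.0, 8.0]    # same direction as a, twice the length
c = [-4.0, 3.0]   # perpendicular to a

same_direction = cosine_similarity(a, b)   # length difference has no effect
perpendicular = cosine_similarity(a, c)    # unrelated directions score near zero
# For unit vectors the dot product alone gives the same score.
unit_dot = sum(x * y for x, y in zip(normalize(a), normalize(b)))

print(same_direction, perpendicular, unit_dot)
```

With COSINE, a higher score means a closer match, so vectors pointing the same way score highest regardless of magnitude.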
Create the collection
Create it with the schema and index in one call. Passing index_params means VBase builds the index and the collection is ready to load.
client.create_collection(
    collection_name="documents",
    schema=schema,
    index_params=index_params,
)
Insert embeddings
Generate your dense vectors with whatever embedding model you use, then insert rows as dicts. Here we use a placeholder embed() to stand in for your model.
def embed(text: str) -> list[float]:
    """Replace with your embedding model — must return a 768-dim vector."""
    ...
rows = [
    {
        "id": "doc-1",
        "embedding": embed("VBase is Milvus-SDK-compatible and runs Milvus 2.6."),
        "text": "VBase is Milvus-SDK-compatible and runs Milvus 2.6.",
        "category": "product",
    },
    {
        "id": "doc-2",
        "embedding": embed("Your IAM token is your Milvus token."),
        "text": "Your IAM token is your Milvus token.",
        "category": "auth",
    },
    {
        "id": "doc-3",
        "embedding": embed("Build an HNSW index with the COSINE metric."),
        "text": "Build an HNSW index with the COSINE metric.",
        "category": "product",
    },
]
client.insert(collection_name="documents", data=rows)
Upsert to update in place
Because auto_id=False, you control the primary key, so re-inserting the same id with upsert replaces the row — ideal when a document’s content (and therefore its embedding) changes.
client.upsert(
    collection_name="documents",
    data=[{
        "id": "doc-2",
        "embedding": embed("Your IAM token is also your Milvus token — VBase manages RBAC."),
        "text": "Your IAM token is also your Milvus token — VBase manages RBAC.",
        "category": "auth",
    }],
)
Load the collection
A collection must be loaded into memory before you can search it. Load it once after ingest; reload only after a release.
client.load_collection(collection_name="documents")
Search with a metadata filter
Embed the query the same way you embedded the documents, then search. The filter is a standard Milvus boolean expression evaluated against scalar fields — here we restrict the search to the product category. Request the output_fields you want returned alongside the score.
query_vector = embed("How do I index vectors in VBase?")
results = client.search(
    collection_name="documents",
    data=[query_vector],
    anns_field="embedding",
    filter='category == "product"',
    limit=3,
    search_params={"metric_type": "COSINE", "params": {"ef": 64}},
    output_fields=["text", "category"],
)
Read the results
search returns one result list per query vector. Each hit carries the primary key, the similarity distance (a COSINE score here — higher is closer), and the requested fields under entity.
for hit in results[0]:
    # Fixed four-decimal formatting keeps trailing zeros, e.g. 0.7710.
    print(hit["id"], f'{hit["distance"]:.4f}', hit["entity"]["text"])
Example output:
doc-3 0.8124 Build an HNSW index with the COSINE metric.
doc-1 0.7710 VBase is Milvus-SDK-compatible and runs Milvus 2.6.
The auth document is absent because the category == "product" filter excluded it before scoring.
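The upsert and filter behaviour in this recipe can be modelled with a small in-memory sketch that runs without a database. ToyCollection, its methods, and the three-dimensional vectors are hypothetical stand-ins invented for illustration; they are not Milvus or pymilvus APIs, and the sketch mirrors only the observable behaviour, not how Milvus executes an HNSW search.

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

class ToyCollection:
    """Illustrative in-memory stand-in for a collection (NOT a Milvus API)."""

    def __init__(self):
        self.rows = {}  # primary key -> row dict

    def upsert(self, rows):
        # Writing an existing primary key replaces the whole row.
        for row in rows:
            self.rows[row["id"]] = row

    def search(self, query, category, limit):
        # 1. Apply the scalar filter first, so excluded rows are never scored.
        candidates = [r for r in self.rows.values() if r["category"] == category]
        # 2. Score the survivors and shape each hit as id / distance / entity.
        hits = [
            {"id": r["id"],
             "distance": cosine(query, r["embedding"]),
             "entity": {"text": r["text"]}}
            for r in candidates
        ]
        # 3. With COSINE, higher is closer, so sort in descending order.
        hits.sort(key=lambda h: h["distance"], reverse=True)
        return hits[:limit]

toy = ToyCollection()
toy.upsert([
    {"id": "doc-1", "embedding": [0.6, 0.8, 0.0], "text": "product overview", "category": "product"},
    {"id": "doc-2", "embedding": [1.0, 0.0, 0.0], "text": "auth v1", "category": "auth"},
    {"id": "doc-3", "embedding": [0.9, 0.1, 0.0], "text": "index guide", "category": "product"},
])
# Same primary key again: the row is replaced, not duplicated.
toy.upsert([{"id": "doc-2", "embedding": [1.0, 0.0, 0.0], "text": "auth v2", "category": "auth"}])

hits = toy.search(query=[1.0, 0.0, 0.0], category="product", limit=3)
for hit in hits:
    print(hit["id"], f'{hit["distance"]:.4f}', hit["entity"]["text"])
```

In this toy data, doc-2 has exactly the same vector as the query, yet it never appears in the hits because its category fails the filter. The second upsert leaves three rows in the store, with doc-2 holding the newer text.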
Notes
- Metering. The insert/upsert calls are metered as VectorWrite, and the stored vectors count toward VectorStorage; each search is metered as VectorRead. All usage is scoped to your organization’s quota.
- What VBase manages. The Milvus service behind your database, its users, and RBAC are managed for you. You authenticate with your IAM token and operate purely at the collection/index/search level.
- ef at search time. Larger ef improves recall at the cost of latency. It is independent of the efConstruction used to build the index.
- Reference. For the full set of field types, index parameters, and search options, see the Milvus documentation.
See also
- Recipes — the full set of end-to-end workflows.
- Hybrid Search (Dense + BM25) — add keyword matching to this collection.
- Connecting with the Milvus SDK — obtain your endpoint, port, and db_name.
- Databases API — allocate a database and resolve its access.
- Quickstart — the shortest path from zero to a search.
- Milvus documentation — full SDK and API reference.