Hybrid Search (Dense + BM25)

Goal: combine semantic similarity with keyword precision in a single ranked result. Dense vectors are great at meaning but miss rare tokens and exact identifiers; BM25 keyword scoring catches those. Milvus 2.6 runs both searches and fuses them with a reranker — this is one of its flagship capabilities.

You will build a collection with a dense vector field and a BM25 sparse field that Milvus populates automatically from raw text, then issue one AnnSearchRequest per field and combine them with an RRF or weighted reranker. Everything here is plain Milvus 2.6; for the full reference see the Milvus hybrid search docs and full-text / BM25 docs.

Before you start

You need an allocated database in RUNNING state and an IAM token — see the Quickstart and Connecting with the Milvus SDK.


pip install "pymilvus>=2.6,<2.7"

Connect


from pymilvus import (
    MilvusClient,
    DataType,
    Function,
    FunctionType,
    AnnSearchRequest,
    RRFRanker,
    WeightedRanker,
)
 
client = MilvusClient(
    uri="https://<endpoint>:443",   # endpoint + port from GetServiceAccess / `dodil vbase db use`
    token="<IAM access token>",      # your IAM service-account token IS your Milvus token
    db_name="<db_name>",             # the allocated database
)
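
To keep the token out of source files, one option is to read the connection settings from the environment. The helper below is a hypothetical sketch: the variable names VBASE_URI, VBASE_TOKEN, and VBASE_DB are assumptions made for illustration and are not defined by VBase or Milvus.

```python
import os
from typing import Mapping


def connection_kwargs(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Build MilvusClient keyword arguments from environment variables.

    The variable names are hypothetical; rename them to match your setup.
    """
    required = ("VBASE_URI", "VBASE_TOKEN", "VBASE_DB")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"missing environment variables: {', '.join(missing)}")
    return {
        "uri": env["VBASE_URI"],        # e.g. "https://<endpoint>:443"
        "token": env["VBASE_TOKEN"],    # the IAM access token
        "db_name": env["VBASE_DB"],     # the allocated database
    }
```

With the variables set, the client from the snippet above can be created as `MilvusClient(**connection_kwargs())`.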

Define the schema with a BM25 function

The schema has a text field you write to directly, a dense vector field for semantics, and a sparse field of type SPARSE_FLOAT_VECTOR. You do not compute the sparse vector yourself: a Milvus BM25 Function reads the text field and produces the sparse term weights at insert and query time.


schema = client.create_schema(auto_id=False, enable_dynamic_field=True)
schema.add_field("id", DataType.VARCHAR, is_primary=True, max_length=64)
# enable_analyzer=True lets Milvus tokenize this text for BM25.
schema.add_field("text", DataType.VARCHAR, max_length=8192, enable_analyzer=True)
schema.add_field("dense", DataType.FLOAT_VECTOR, dim=768)
schema.add_field("sparse", DataType.SPARSE_FLOAT_VECTOR)
 
# BM25: turn `text` into the `sparse` vector automatically.
bm25 = Function(
    name="text_to_sparse",
    function_type=FunctionType.BM25,
    input_field_names=["text"],
    output_field_names=["sparse"],
)
schema.add_function(bm25)

With the BM25 function attached, you only ever insert and query raw text for the keyword side — Milvus tokenizes and weights it for you. The sparse field is computed, never written by hand.
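
For intuition about what the keyword side rewards, here is a minimal pure-Python sketch of the classic BM25 formula. It is not Milvus's implementation: the regex tokenizer and the parameters k1=1.2 and b=0.75 are assumptions chosen for illustration, and the real analyzer and scoring details may differ.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Toy analyzer: lowercase and split on non-alphanumerics (not Milvus's analyzer).
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query: str, docs: list[str], k1: float = 1.2, b: float = 0.75) -> list[float]:
    """Score every document against the query with the classic BM25 formula."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    query_terms = set(tokenize(query))

    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Rare terms (low document frequency) get a high IDF.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates and is normalized by document length.
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "VBase runs Milvus 2.6 with hybrid BM25 search.",
    "Reset password for service account SA-1839.",
    "Build an HNSW index with the COSINE metric.",
]
scores = bm25_scores("service account SA-1839", docs)
```

Only the second document shares any token with the query, so it is the only one with a non-zero score. This is the behavior that lets exact identifiers win on the keyword side.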

Index both vector fields

Each vector field needs its own index. The dense field uses HNSW / COSINE; the BM25 sparse field uses the SPARSE_INVERTED_INDEX with the BM25 metric.


index_params = client.prepare_index_params()
index_params.add_index(
    field_name="dense",
    index_type="HNSW",
    metric_type="COSINE",
    params={"M": 16, "efConstruction": 200},
)
index_params.add_index(
    field_name="sparse",
    index_type="SPARSE_INVERTED_INDEX",
    metric_type="BM25",
)
 
client.create_collection(
    collection_name="articles",
    schema=schema,
    index_params=index_params,
)

Insert text and dense vectors

Insert the raw text and the dense embedding. The sparse field is omitted — the BM25 function fills it in.


def embed(text: str) -> list[float]:
    """Replace with your embedding model; it must return a 768-dim vector (matching dim=768 in the schema)."""
    raise NotImplementedError("plug in your embedding model here")
 
rows = [
    {"id": "a-1", "text": "VBase runs Milvus 2.6 with hybrid BM25 search.", "dense": embed("VBase runs Milvus 2.6 with hybrid BM25 search.")},
    {"id": "a-2", "text": "Reset password for service account SA-1839.",     "dense": embed("Reset password for service account SA-1839.")},
    {"id": "a-3", "text": "Build an HNSW index with the COSINE metric.",      "dense": embed("Build an HNSW index with the COSINE metric.")},
]
 
client.insert(collection_name="articles", data=rows)
client.load_collection(collection_name="articles")
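
If you want to smoke-test the pipeline before wiring in a real model, a deterministic placeholder can stand in for embed. The function below is a hypothetical sketch (toy_embed is not part of pymilvus): it hashes tokens into 768 buckets and L2-normalizes the result. The vectors carry no semantic meaning, so use it only to check that insert and search run end to end.

```python
import hashlib
import math
import re


def toy_embed(text: str, dim: int = 768) -> list[float]:
    """Deterministic placeholder embedding: hashed bag-of-words, L2-normalized.

    Not semantically meaningful; only for smoke-testing the pipeline.
    """
    vec = [0.0] * dim
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        # sha256 keeps the bucket stable across runs (the built-in hash() is salted).
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    # Text with no tokens yields an all-zero vector; avoid inserting those.
    return [v / norm for v in vec] if norm else vec
```

Texts that share tokens get overlapping buckets, so results will look roughly keyword-like rather than semantic. Swap in a real embedding model before judging relevance.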

Run the hybrid search

Build one AnnSearchRequest per field. The dense request takes your query embedding; the BM25 request takes the raw query string (Milvus tokenizes it). Then pass both to hybrid_search with a reranker.


query_text = "service account SA-1839"
 
dense_req = AnnSearchRequest(
    data=[embed(query_text)],
    anns_field="dense",
    param={"metric_type": "COSINE", "params": {"ef": 64}},
    limit=10,
)
 
sparse_req = AnnSearchRequest(
    data=[query_text],                 # raw text — BM25 tokenizes it for you
    anns_field="sparse",
    param={"metric_type": "BM25"},
    limit=10,
)
 
results = client.hybrid_search(
    collection_name="articles",
    reqs=[dense_req, sparse_req],
    ranker=RRFRanker(k=60),            # rank-based fusion; see weighted alternative below
    limit=5,
    output_fields=["text"],
)
 
for hit in results[0]:
    print(hit["id"], round(hit["distance"], 4), hit["entity"]["text"])


a-2 0.0328 Reset password for service account SA-1839.
a-1 0.0161 VBase runs Milvus 2.6 with hybrid BM25 search.

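The exact-identifier query (SA-1839) surfaces a-2 first because BM25 matched the rare token. The scores shown are illustrative and depend on your embedding model: with k=60, 0.0328 is about 2/61, which corresponds to a-2 ranking first in both the dense and the BM25 list. When the dense embedding alone ranks such a document lower, the BM25 match still lifts it in the fused ranking.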
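
To see where scores like 0.0328 come from, here is a minimal sketch of reciprocal rank fusion, assuming the common formulation in which each list contributes 1 / (k + rank) with 1-based ranks. The function rrf_fuse and the two input rankings are hypothetical and are not taken from a real search; Milvus performs the fusion server-side.

```python
def rrf_fuse(ranked_lists: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse several ranked ID lists with reciprocal rank fusion."""
    scores: dict[str, float] = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank); missing documents contribute 0.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


dense_ranking = ["a-2", "a-1", "a-3"]   # hypothetical order from the dense request
bm25_ranking = ["a-2"]                  # only a-2 shares tokens with the query

fused = rrf_fuse([dense_ranking, bm25_ranking], k=60)
for doc_id, score in fused:
    print(doc_id, round(score, 4))
# a-2 0.0328   (1/61 + 1/61)
# a-1 0.0161   (1/62)
# a-3 0.0159   (1/63)
```

Because only ranks enter the formula, the raw COSINE and BM25 scores never need to be put on a common scale.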

Choosing a reranker

Both rerankers take the per-field result lists and produce one fused ranking.

Reranker	What it fuses	Use when
`RRFRanker(k=60)`	The ranks (ordering) from each request	Dense and BM25 scores live on different scales — rank fusion is the stable default.
`WeightedRanker(w_dense, w_sparse)`	The scores, each scaled by a weight	You want explicit control over how much each signal counts.

A weighted variant that leans on semantics:


results = client.hybrid_search(
    collection_name="articles",
    reqs=[dense_req, sparse_req],
    ranker=WeightedRanker(0.7, 0.3),   # weights align to reqs order: dense, then sparse
    limit=5,
    output_fields=["text"],
)

Weights in WeightedRanker are positional — they line up with the order of reqs. Start at 0.7 / 0.3 or 0.6 / 0.4 and tune on your own relevance judgments.
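
The positional pairing of weights and requests can be sketched in plain Python. This is an illustration only: weighted_fuse is a hypothetical helper, the scores are made up, and min-max normalization is an assumption used here to bring both lists onto a 0..1 scale. Milvus applies its own normalization internally, so fused values will differ.

```python
def weighted_fuse(
    result_lists: list[list[tuple[str, float]]],
    weights: list[float],
) -> list[tuple[str, float]]:
    """Weighted-sum fusion of (id, score) lists; weights pair with lists by position."""
    if len(result_lists) != len(weights):
        raise ValueError("need exactly one weight per result list")
    fused: dict[str, float] = {}
    for hits, weight in zip(result_lists, weights):
        if not hits:
            continue
        raw = [score for _, score in hits]
        lo, hi = min(raw), max(raw)
        for doc_id, score in hits:
            # Min-max normalization (an assumption, not Milvus's internal scheme).
            norm = 1.0 if hi == lo else (score - lo) / (hi - lo)
            fused[doc_id] = fused.get(doc_id, 0.0) + weight * norm
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)


dense_hits = [("a-2", 0.75), ("a-1", 0.5), ("a-3", 0.25)]   # made-up cosine scores
sparse_hits = [("a-2", 7.2)]                                  # made-up BM25 score

ranking = weighted_fuse([dense_hits, sparse_hits], weights=[0.7, 0.3])
```

Swapping the order of the lists without swapping the weights would apply 0.7 to the BM25 side instead, which is the mistake the positional rule guards against.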

Notes

Metering. The insert is metered as VectorWrite and the stored vectors (dense + sparse) count toward VectorStorage; each hybrid_search is metered as VectorRead. Usage is scoped to your organization’s quota.
What VBase manages. The Milvus service, its users, and RBAC. You operate purely at the schema/function/index/search level with your IAM token.
You never compute BM25 vectors. The Function does it at insert and query time. If you already produce sparse vectors externally (e.g. SPLADE), you can instead define a plain SPARSE_FLOAT_VECTOR field with no function and insert the term weights directly; see the Milvus sparse vector docs.
Reference. Full hybrid search, BM25, and reranker options are in the Milvus documentation.