Hybrid Search (Dense + BM25)
Goal: combine semantic similarity with keyword precision in a single ranked result. Dense vectors are great at meaning but miss rare tokens and exact identifiers; BM25 keyword scoring catches those. Milvus 2.6 runs both searches and fuses them with a reranker; this is one of its flagship capabilities.
You will build a collection with a dense vector field and a BM25 sparse field that Milvus populates automatically from raw text, then issue one AnnSearchRequest per field and combine them with an RRF or weighted reranker. Everything here is plain Milvus 2.6; for the full reference see the Milvus hybrid search docs and full-text / BM25 docs.
Before you start
You need an allocated database in RUNNING state and an IAM token; see the Quickstart and Connecting with the Milvus SDK.
pip install "pymilvus>=2.6,<2.7"
Connect
from pymilvus import (
    MilvusClient,
    DataType,
    Function,
    FunctionType,
    AnnSearchRequest,
    RRFRanker,
    WeightedRanker,
)

client = MilvusClient(
    uri="https://<endpoint>:443",  # endpoint + port from GetServiceAccess / `dodil vbase db use`
    token="<IAM access token>",    # your IAM service-account token IS your Milvus token
    db_name="<db_name>",           # the allocated database
)
Define the schema with a BM25 function
The schema has a text field you write to directly, a dense vector field for semantics, and a sparse field of type SPARSE_FLOAT_VECTOR. You do not compute the sparse vector yourself: a Milvus BM25 Function reads the text field and produces the sparse term weights at insert and query time.
schema = client.create_schema(auto_id=False, enable_dynamic_field=True)
schema.add_field("id", DataType.VARCHAR, is_primary=True, max_length=64)
# enable_analyzer=True lets Milvus tokenize this text for BM25.
schema.add_field("text", DataType.VARCHAR, max_length=8192, enable_analyzer=True)
schema.add_field("dense", DataType.FLOAT_VECTOR, dim=768)
schema.add_field("sparse", DataType.SPARSE_FLOAT_VECTOR)
# BM25: turn `text` into the `sparse` vector automatically.
bm25 = Function(
    name="text_to_sparse",
    function_type=FunctionType.BM25,
    input_field_names=["text"],
    output_field_names=["sparse"],
)
schema.add_function(bm25)
With the BM25 function attached, you only ever insert and query raw text for the keyword side: Milvus tokenizes and weights it for you. The sparse field is computed, never written by hand.
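To make "tokenizes and weights it" concrete, here is a minimal pure-Python sketch of the classic BM25 scoring formula. It is an illustration only, not Milvus's internal implementation: the regex tokenizer, the helper names, and the k1=1.2 / b=0.75 values are assumptions chosen for the example.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Naive analyzer (assumption): lowercase, keep letters, digits, and hyphens.
    return re.findall(r"[a-z0-9-]+", text.lower())


def bm25_scores(query: str, docs: list[str], k1: float = 1.2, b: float = 0.75) -> list[float]:
    """Score every document against the query with the textbook BM25 formula."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))

    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # Rare terms (low df) get a high inverse document frequency.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency, saturated by k1 and normalized by document length.
            norm = tf[term] * (k1 + 1) / (tf[term] + k1 * (1 - b + b * len(toks) / avg_len))
            score += idf * norm
        scores.append(score)
    return scores


docs = [
    "VBase runs Milvus 2.6 with hybrid BM25 search.",
    "Reset password for service account SA-1839.",
    "Build an HNSW index with the COSINE metric.",
]
scores = bm25_scores("service account SA-1839", docs)
best = max(range(len(docs)), key=lambda i: scores[i])
print(best, scores)
```

Only the second document contains any query token, so it is the only one with a non-zero score. This is the keyword-precision behavior the BM25 field contributes to the hybrid result.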
Index both vector fields
Each vector field needs its own index. The dense field uses HNSW / COSINE; the BM25 sparse field uses the SPARSE_INVERTED_INDEX with the BM25 metric.
index_params = client.prepare_index_params()
index_params.add_index(
    field_name="dense",
    index_type="HNSW",
    metric_type="COSINE",
    params={"M": 16, "efConstruction": 200},
)
index_params.add_index(
    field_name="sparse",
    index_type="SPARSE_INVERTED_INDEX",
    metric_type="BM25",
)
client.create_collection(
    collection_name="articles",
    schema=schema,
    index_params=index_params,
)
Insert text and dense vectors
Insert the raw text and the dense embedding. The sparse field is omitted; the BM25 function fills it in.
def embed(text: str) -> list[float]:
    """Replace with your embedding model; it must return a 768-dim vector."""
    ...
rows = [
    {"id": "a-1", "text": "VBase runs Milvus 2.6 with hybrid BM25 search.", "dense": embed("VBase runs Milvus 2.6 with hybrid BM25 search.")},
    {"id": "a-2", "text": "Reset password for service account SA-1839.", "dense": embed("Reset password for service account SA-1839.")},
    {"id": "a-3", "text": "Build an HNSW index with the COSINE metric.", "dense": embed("Build an HNSW index with the COSINE metric.")},
]
client.insert(collection_name="articles", data=rows)
client.load_collection(collection_name="articles")
Run the hybrid search
Build one AnnSearchRequest per field. The dense request takes your query embedding; the BM25 request takes the raw query string (Milvus tokenizes it). Then pass both to hybrid_search with a reranker.
query_text = "service account SA-1839"
dense_req = AnnSearchRequest(
    data=[embed(query_text)],
    anns_field="dense",
    param={"metric_type": "COSINE", "params": {"ef": 64}},
    limit=10,
)
sparse_req = AnnSearchRequest(
    data=[query_text],  # raw text; BM25 tokenizes it for you
    anns_field="sparse",
    param={"metric_type": "BM25"},
    limit=10,
)
results = client.hybrid_search(
    collection_name="articles",
    reqs=[dense_req, sparse_req],
    ranker=RRFRanker(k=60),  # rank-based fusion; see weighted alternative below
    limit=5,
    output_fields=["text"],
)
for hit in results[0]:
    print(hit["id"], round(hit["distance"], 4), hit["entity"]["text"])

a-2 0.0328 Reset password for service account SA-1839.
a-1 0.0161 VBase runs Milvus 2.6 with hybrid BM25 search.

The exact-identifier query (SA-1839) surfaces a-2 first because BM25 matched the rare token, even if the dense embedding alone would rank it lower.
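The fused scores come from rank positions, not raw similarity values. The sketch below assumes the standard reciprocal rank fusion formula, score = sum of 1 / (k + rank) with ranks starting at 1. The function name and the two per-request rankings are hypothetical, picked so the arithmetic is consistent with the scores printed above.

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse several ranked ID lists with reciprocal rank fusion."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        # Ranks are 1-based: the top hit of each list contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical per-request rankings (assumption, not actual Milvus output).
dense_ranking = ["a-2", "a-1"]
sparse_ranking = ["a-2"]  # BM25 only returns documents that share a query token

fused = rrf_fuse([dense_ranking, sparse_ranking], k=60)
for doc_id, score in fused:
    print(doc_id, round(score, 4))
```

Under these assumed rankings, a-2 scores 2/61 and a-1 scores 1/62. A document that appears in both lists accumulates a term from each, which is why agreement between the dense and BM25 requests pushes a hit to the top.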
Choosing a reranker
Both rerankers take the per-field result lists and produce one fused ranking.
| Reranker | What it fuses | Use when |
|---|---|---|
| RRFRanker(k=60) | The ranks (ordering) from each request | Dense and BM25 scores live on different scales; rank fusion is the stable default. |
| WeightedRanker(w_dense, w_sparse) | The scores, each scaled by a weight | You want explicit control over how much each signal counts. |
A weighted variant that leans on semantics:
results = client.hybrid_search(
    collection_name="articles",
    reqs=[dense_req, sparse_req],
    ranker=WeightedRanker(0.7, 0.3),  # weights align to reqs order: dense, then sparse
    limit=5,
    output_fields=["text"],
)
Weights in WeightedRanker are positional: they line up with the order of reqs. Start at 0.7 / 0.3 or 0.6 / 0.4 and tune on your own relevance judgments.
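To see how positional weights shift the outcome, here is a minimal sketch of weighted score fusion. It assumes each request's scores have already been normalized to a comparable 0-1 range; the normalization Milvus applies internally is not reproduced here, and the function name and score values are hypothetical.

```python
def weighted_fuse(
    score_lists: list[dict[str, float]],
    weights: list[float],
) -> list[tuple[str, float]]:
    """Combine per-request scores as a weighted sum; weights match score_lists by position."""
    if len(score_lists) != len(weights):
        raise ValueError("need exactly one weight per request")
    fused: dict[str, float] = {}
    for scores, weight in zip(score_lists, weights):
        for doc_id, score in scores.items():
            # A document missing from a request simply gets no contribution from it.
            fused[doc_id] = fused.get(doc_id, 0.0) + weight * score
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


# Hypothetical normalized scores (assumption): dense prefers a-1, BM25 only matches a-2.
dense_scores = {"a-1": 0.9, "a-2": 0.4}
sparse_scores = {"a-2": 1.0}

semantic_first = weighted_fuse([dense_scores, sparse_scores], [0.7, 0.3])
keyword_first = weighted_fuse([dense_scores, sparse_scores], [0.3, 0.7])
print(semantic_first[0][0], keyword_first[0][0])
```

With these made-up scores, the 0.7 / 0.3 split ranks a-1 first and the 0.3 / 0.7 split ranks a-2 first. A flip like this is what to look for when tuning the weights against your own relevance judgments.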
Notes
- Metering. The insert is metered as VectorWrite and the stored vectors (dense + sparse) count toward VectorStorage; each hybrid_search is metered as VectorRead. Usage is scoped to your organization's quota.
- What VBase manages. The Milvus service, its users, and RBAC. You operate purely at the schema/function/index/search level with your IAM token.
- You never compute BM25 vectors. The Function does it at insert and query time. If you already produce sparse vectors externally (e.g. SPLADE), you can instead define a plain SPARSE_FLOAT_VECTOR field with no function and insert the term weights directly; see the Milvus sparse vector docs.
- Reference. Full hybrid search, BM25, and reranker options are in the Milvus documentation.
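For the externally produced case in the notes above, the sketch below shows one way to shape term weights into a row. It assumes pymilvus accepts a sparse vector as a dict mapping dimension index to weight; the vocabulary, the weights, and the to_sparse_row helper are hypothetical stand-ins for whatever your sparse model (for example SPLADE) emits.

```python
def to_sparse_row(term_weights: dict[str, float], vocab: dict[str, int]) -> dict[int, float]:
    """Map {token: weight} to {dimension index: weight}, dropping unknown tokens and zeros."""
    row: dict[int, float] = {}
    for token, weight in term_weights.items():
        index = vocab.get(token)
        if index is not None and weight != 0.0:
            row[index] = weight
    return row


# Hypothetical vocabulary and model output (assumptions for illustration).
vocab = {"reset": 0, "password": 1, "service": 2, "account": 3, "sa-1839": 4}
term_weights = {"service": 0.8, "account": 0.6, "sa-1839": 1.9, "the": 0.0}

sparse_row = to_sparse_row(term_weights, vocab)

# A row for a collection whose `sparse` field has no BM25 function attached.
row = {
    "id": "a-2",
    "text": "Reset password for service account SA-1839.",
    "sparse": sparse_row,
}
print(row["sparse"])
```

In that setup the sparse field is written by hand, so the schema would omit the BM25 Function. Check the Milvus sparse vector docs for the index and metric settings that apply to such a field.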
See also
- Recipes: the full set of end-to-end workflows.
- Semantic Search: the dense-only baseline this builds on.
- Connecting with the Milvus SDK: obtain your endpoint, port, and db_name.
- Databases API: allocate a database and resolve its access.
- Quickstart: allocate, connect, and search end to end.
- Milvus documentation: full SDK and API reference.