Multi-vector/Hybrid search

Multi-vector search lets you search a single collection using more than one vector input. For example, you can search two different vector fields with a text embedding and an image embedding, or send multiple queries at once (batch search). This is useful when a single embedding isn’t enough to represent what the user means.

Common examples:

E-commerce: match a product using both an image embedding and a text description embedding.
Observability: match an incident using a log embedding + a trace embedding.
Security: match an alert using a behavior embedding + a graph/relationship embedding.

Under the hood, Dodil runs multiple approximate nearest neighbor (ANN) searches and then combines their results into one final ranking.

When to use multi-vector search

Use multi-vector search when:

Your collection has multiple vector fields (e.g., text_vector, image_vector).
You want to blend multiple similarity signals into a single result set.
You want to batch multiple user queries in a single request.

If you only have one vector field and one query, a standard vector search is simpler.

Requirements

The collection must contain the vector fields you plan to search.
Each vector field should have an index (recommended). Some deployments return an error or perform poorly if you search an unindexed vector field.
Your query vectors must match the dimension and type of the target vector fields.
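A client-side check can catch dimension and type mismatches before a request is sent. The helper below is a minimal sketch and not part of the Dodil SDK: `check_query_vectors` and `expected_dim` are hypothetical names, and the expected dimension is something you would take from your own collection schema.

```python
from typing import Sequence


def check_query_vectors(
    query_vectors: Sequence[Sequence[float]], expected_dim: int
) -> None:
    """Raise ValueError if a query vector has the wrong length or non-numeric values."""
    for i, vec in enumerate(query_vectors):
        if len(vec) != expected_dim:
            raise ValueError(
                f"query vector {i} has dimension {len(vec)}, expected {expected_dim}"
            )
        for x in vec:
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(x, bool) or not isinstance(x, (int, float)):
                raise ValueError(
                    f"query vector {i} contains a non-numeric value: {x!r}"
                )


# Example with a hypothetical 4-dimensional vector field.
check_query_vectors([[0.1, 0.2, 0.3, 0.4]], expected_dim=4)  # passes silently
```

Calling a check like this once per request, with the dimension of that request's target field, turns a server-side error into an immediate and more descriptive local one.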

How ranking works

There are two common ways to combine multiple search results:

Weighted ranking

You assign a weight per vector field (or per request). The final score is a weighted combination of each request’s similarity score.

Example:

text similarity weight = 0.7
image similarity weight = 0.3

This is the easiest option when you know which signal should matter more.
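To make the idea concrete, here is a minimal client-side sketch of weighted score fusion. It assumes a plain weighted sum, that a higher score means a closer match, and that an item missing from one result list contributes 0 for that signal. The exact formula the service applies (including any score normalization) may differ; `weighted_fusion` is a hypothetical helper, not an SDK function.

```python
from collections import defaultdict


def weighted_fusion(result_lists, weights):
    """Merge lists of (id, score) pairs into one ranking by weighted sum."""
    fused = defaultdict(float)
    for hits, weight in zip(result_lists, weights):
        for doc_id, score in hits:
            fused[doc_id] += weight * score
    # Highest fused score first.
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


# Made-up scores for illustration only.
text_hits = [("a", 0.9), ("b", 0.8)]
image_hits = [("b", 0.95), ("c", 0.7)]

ranked = weighted_fusion([text_hits, image_hits], weights=[0.7, 0.3])
for doc_id, score in ranked:
    print(doc_id, round(score, 3))
```

In this example item b ranks first because it appears in both lists, even though it is not the top text hit.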

Rank-based fusion

Instead of merging raw similarity scores, rank-based methods merge the ordering (ranks) from each request. This can be more stable when different vector fields have different score distributions.

If your SDK exposes a ranker/reranker option (for example, weighted or reciprocal rank fusion, RRF), you can select it here.
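Reciprocal rank fusion is one widely used rank-based method: each item receives 1 / (k + rank) from every list it appears in, and the contributions are summed. The sketch below shows the technique on plain Python lists. It is an illustration of RRF in general, not a description of how Dodil implements its ranker; `rrf` is a hypothetical helper, and k = 60 is a commonly used default rather than a value taken from this document.

```python
from collections import defaultdict


def rrf(rankings, k=60):
    """Fuse ranked id lists with reciprocal rank fusion."""
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the best hit in a list has rank 1.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up rankings for illustration only.
text_ranking = ["a", "b", "c"]
image_ranking = ["b", "c", "d"]

fused = rrf([text_ranking, image_ranking])
print([doc_id for doc_id, _ in fused])
```

Because only positions are used, a field whose raw scores cluster in a narrow range has the same influence as a field whose scores are widely spread.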

Connect (if you don’t already have a session)

If you already have a vbase connection in your code, skip this section.


from dodil import Client
from dodil.vbase import VBaseConfig
 
# Python 3.10+
 
c = Client(
    service_account_id="...",
    service_account_secret="...",
)
 
vbase = c.vbase.connect(
    VBaseConfig(
        host="vbase-db-<id>.infra.dodil.cloud",
        port=443,
        scheme="https",
        db_name="db_<id>",
    )
)

Example 1: Search two vector fields and blend results

In this example, we search the same collection using two different vector fields and merge the results.


# Assume you already created vectors elsewhere:
# text_vec: List[float]
# image_vec: List[float]
 
results = vbase.search(
    collection="products",
    requests=[
        {
            "vector_field": "text_vector",
            "query_vectors": [text_vec],
            "metric_type": "COSINE",
            "params": {"nprobe": 16},
            "top_k": 10,
            "weight": 0.7,
        },
        {
            "vector_field": "image_vector",
            "query_vectors": [image_vec],
            "metric_type": "COSINE",
            "params": {"nprobe": 16},
            "top_k": 10,
            "weight": 0.3,
        },
    ],
    output_fields=["id", "title", "price"],
)
 
for hit in results[0]:
    print(hit["id"], hit.get("title"), hit.get("score"))

Notes

top_k applies per request, so the merge step sees up to sum(top_k) candidates before final ranking (fewer when the same item is returned by more than one request).
Start with weight values that reflect the importance of each signal, then tune based on relevance.

Example 2: Batch multi-vector search (multiple queries in one call)

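If you want to search for multiple users/queries at the same time, pass multiple query vectors in each request. Keep the vectors in the same order in every request, so that the same position in each list belongs to the same query.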


# Two users / two queries
text_vecs = [text_vec_user1, text_vec_user2]
image_vecs = [img_vec_user1, img_vec_user2]
 
batch_results = vbase.search(
    collection="products",
    requests=[
        {
            "vector_field": "text_vector",
            "query_vectors": text_vecs,
            "metric_type": "COSINE",
            "params": {"nprobe": 16},
            "top_k": 10,
            "weight": 0.7,
        },
        {
            "vector_field": "image_vector",
            "query_vectors": image_vecs,
            "metric_type": "COSINE",
            "params": {"nprobe": 16},
            "top_k": 10,
            "weight": 0.3,
        },
    ],
    output_fields=["id", "title"],
)
 
# batch_results is typically a list: one result set per input query
print(len(batch_results))
print(batch_results[0][0])
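Since results come back by position, it can help to attach your own query labels right away. The sketch below assumes the shape described in the comment above (a list with one result set per input query, in input order). `label_batch_results` is a hypothetical helper, and the sample data is a stand-in for a real response.

```python
def label_batch_results(labels, batch_results):
    """Pair each result set with the label of the query that produced it."""
    if len(labels) != len(batch_results):
        raise ValueError(
            f"got {len(batch_results)} result sets for {len(labels)} queries"
        )
    return dict(zip(labels, batch_results))


# Stand-in for a real response: one list of hits per query.
sample_batch_results = [
    [{"id": 1, "title": "Red shoe"}],
    [{"id": 7, "title": "Blue bag"}],
]

by_user = label_batch_results(["user1", "user2"], sample_batch_results)
print(by_user["user2"][0]["title"])  # Blue bag
```

The length check fails fast if the response does not contain one result set per query, which would otherwise silently mislabel results.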

Example 3: Combine multi-vector search with filtering

You can still filter on non-vector fields (scalar filters) while doing multi-vector search.


results = vbase.search(
    collection="products",
    requests=[
        {
            "vector_field": "text_vector",
            "query_vectors": [text_vec],
            "metric_type": "COSINE",
            "params": {"nprobe": 16},
            "top_k": 10,
            "weight": 0.7,
        },
        {
            "vector_field": "image_vector",
            "query_vectors": [image_vec],
            "metric_type": "COSINE",
            "params": {"nprobe": 16},
            "top_k": 10,
            "weight": 0.3,
        },
    ],
    filter='in_stock == true and price < 100',
    output_fields=["id", "title", "price", "in_stock"],
)

Troubleshooting

I only get results from one vector field

Ensure both requests use valid vector_field names.
Confirm both fields exist in the collection schema.
If you use weights, avoid setting one weight at or near zero (e.g., 0.0), because that field then contributes little or nothing to the final score.
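One way to confirm which field is being ignored is to run each request on its own, collect the returned ids, and compare them with the ids in the merged result. The sketch below works on plain id lists, so it makes no assumption about the SDK beyond your ability to extract ids from hits. `field_contributions` is a hypothetical helper and the ids are made up.

```python
def field_contributions(merged_ids, per_field_ids):
    """Count how many merged hits also appear in each field's own result list."""
    merged = set(merged_ids)
    return {
        field: len(merged & set(ids))
        for field, ids in per_field_ids.items()
    }


# Made-up ids: the merged result overlaps only with the text search.
counts = field_contributions(
    merged_ids=["a", "b", "c"],
    per_field_ids={
        "text_vector": ["a", "b", "c"],
        "image_vector": ["x", "y"],
    },
)
print(counts)  # {'text_vector': 3, 'image_vector': 0}
```

A count of zero for a field suggests its candidates never reach the final ranking, which points to a weight, field name, or query vector problem for that request.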

It errors when searching a vector field

Ensure the query vector dimension matches the vector field dimension.
Ensure an index is created and the collection (or the relevant partitions) is loaded if your deployment requires it.

The results feel “off”

Tune weights (e.g., 0.6/0.4, 0.8/0.2).
Consider a rank-based fusion approach if your SDK exposes it.
Reduce top_k per request if you’re bringing too many weak candidates into the merge.