Overview
When you run a vector search, it’s common to get many results that are almost the same.
Example: you indexed your knowledge base by chunk, so a single document might produce dozens of chunks that all match the query. A normal search could return the “top 10” results… and 8 of them might come from the same document.
Grouping search solves this by letting you group results by a scalar field (like doc_id, user_id, product_id, or page_id) and then return:
- limit groups (how many distinct groups you want), and
- up to group_size entities per group (how many results you want from each group).
This is ideal for:
- RAG / knowledge bases: return results from many documents, not just one.
- E-commerce: return results across many products/brands instead of one popular SKU.
- Multi-tenant apps: group by workspace_id / org_id to keep results balanced.
How it works
Grouping search adds three key knobs:
- group_by_field: the field used to form groups (must be a scalar field).
- limit: the number of groups you want returned.
- group_size: the number of entities to return from each group.
The important detail about limit
In grouping search, limit means “number of groups” — not “number of rows”.
So if you set:
- limit = 5
- group_size = 2
You’ll get up to 10 results total (5 groups × 2 entities per group), depending on data distribution.
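To make the limit and group_size arithmetic concrete, here is a small pure-Python sketch that mimics grouping on an already-ranked list of hits. It is only a mental model: the group_hits helper and the sample data are hypothetical, and the real engine may select groups differently.

```python
def group_hits(ranked_hits, group_by_field, limit, group_size):
    """Keep up to `limit` groups and up to `group_size` hits per group.

    Hypothetical helper for illustration only; `ranked_hits` must be
    sorted best match first.
    """
    groups = {}  # dicts preserve insertion order (Python 3.7+)
    for hit in ranked_hits:
        key = hit[group_by_field]
        if key not in groups:
            if len(groups) == limit:
                continue  # group quota is full, skip hits from new groups
            groups[key] = []
        if len(groups[key]) < group_size:
            groups[key].append(hit)
    return groups


# Made-up sample data: doc "a" has three chunks, "b" one, "c" two.
ranked_hits = [
    {"doc_id": "a", "chunk_id": 1, "distance": 0.10},
    {"doc_id": "a", "chunk_id": 2, "distance": 0.12},
    {"doc_id": "a", "chunk_id": 3, "distance": 0.15},
    {"doc_id": "b", "chunk_id": 1, "distance": 0.20},
    {"doc_id": "c", "chunk_id": 1, "distance": 0.25},
    {"doc_id": "c", "chunk_id": 2, "distance": 0.30},
]

grouped = group_hits(ranked_hits, "doc_id", limit=2, group_size=2)
total = sum(len(hits) for hits in grouped.values())
print(list(grouped), total)  # ['a', 'b'] 3
```

Here limit=2 and group_size=2 allow up to 4 results, but only 3 come back because document b has a single matching chunk.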
Perform a grouping search
The examples below assume you already have a connected vbase client.
Example: group by doc_id to avoid duplicates
Use this pattern when your collection stores chunked text and you want diverse sources.
# Example only — parameter names may differ slightly depending on the Dodil SDK version.
# The key idea is:
# - group by a scalar field (doc_id)
# - limit controls groups
# - group_size controls results per group
results = vbase.search(
    collection_name="kb_chunks",
    data=[query_vector],
    anns_field="embedding",
    limit=5,                  # return up to 5 groups (distinct doc_id values)
    group_by_field="doc_id",  # group results by doc_id
    group_size=2,             # up to 2 chunks per document
    output_fields=["doc_id", "chunk_id", "text"],
)
for hit in results[0]:
    print(hit.entity.get("doc_id"), hit.entity.get("chunk_id"), hit.distance)

Example: return one result per group
If you only need one “best” match per group, set group_size=1.
results = vbase.search(
    collection_name="kb_chunks",
    data=[query_vector],
    anns_field="embedding",
    limit=10,                 # 10 distinct documents
    group_by_field="doc_id",
    group_size=1,
)

Configure group size
group_size
group_size controls how many entities are returned within each group.
- Larger values can make responses richer (more chunks per document).
- But if your data is uneven (some groups have few rows), some groups may return fewer than group_size.
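If you want to see which groups came back short, one option is to count hits per group after the search. The sketch below assumes you have already flattened the SDK's hit objects into plain dicts; the underfilled_groups helper and the sample hits are hypothetical.

```python
from collections import Counter


def underfilled_groups(hits, group_by_field, group_size):
    """Return {group_value: hit_count} for groups with fewer than group_size hits."""
    counts = Counter(hit[group_by_field] for hit in hits)
    return {key: n for key, n in counts.items() if n < group_size}


# Made-up hits: doc "a" filled its quota of 2, doc "b" did not.
hits = [
    {"doc_id": "a", "chunk_id": 1},
    {"doc_id": "a", "chunk_id": 2},
    {"doc_id": "b", "chunk_id": 1},
]

short = underfilled_groups(hits, "doc_id", group_size=2)
print(short)  # {'b': 1}
```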
strict_group_size
strict_group_size controls whether the system tries to enforce group_size.
- When strict_group_size=True, the system will try to return exactly group_size entities per group (unless the group doesn’t have enough rows).
- When strict_group_size=False (default), the system prioritizes returning the requested number of groups (limit), even if that means some groups contain fewer than group_size entities.
In practice:
- Use strict_group_size=False for better performance and more predictable group coverage.
- Use strict_group_size=True only if you truly need a consistent number of results per group.
results = vbase.search(
    collection_name="kb_chunks",
    data=[query_vector],
    anns_field="embedding",
    limit=5,
    group_by_field="doc_id",
    group_size=2,
    strict_group_size=False,
)

Considerations
Indexing requirement
Grouping search works only when the collection is indexed with one of these index types:
- FLAT
- IVF_FLAT
- IVF_SQ8
- HNSW
- HNSW_PQ
- HNSW_PRQ
- HNSW_SQ
- DISKANN
- SPARSE_INVERTED_INDEX
If your collection has no index, grouping search will fail.
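A pre-flight check can catch an unsupported or missing index before the search fails. This is a minimal sketch built only from the list above: supports_grouping is a hypothetical helper, the case-insensitive comparison is a convenience assumption, and how you read the index type from your collection depends on the SDK and is not shown.

```python
# Index types listed above as compatible with grouping search.
GROUPING_INDEX_TYPES = frozenset({
    "FLAT", "IVF_FLAT", "IVF_SQ8",
    "HNSW", "HNSW_PQ", "HNSW_PRQ", "HNSW_SQ",
    "DISKANN", "SPARSE_INVERTED_INDEX",
})


def supports_grouping(index_type):
    """Return True if index_type is in the supported list.

    Pass None when the collection has no index at all.
    """
    if index_type is None:
        return False
    return index_type.upper() in GROUPING_INDEX_TYPES


print(supports_grouping("HNSW"))  # True
print(supports_grouping(None))    # False
```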
Performance tips
- Keep limit reasonable: it controls how many groups are explored/returned.
- Increase group_size only when needed: it multiplies the number of returned entities.
- Prefer strict_group_size=False in uneven datasets: it’s typically faster.
If your query vectors already exist
If the vector you want to search with is already stored in the same collection, using an ID-based lookup (search by ids) can be more efficient than fetching vectors first and then searching.
Common patterns
“One document per result” (classic RAG)
- group_by_field="doc_id"
- limit = number_of_documents_you_want
- group_size = 1
“Few chunks per document” (richer context)
- group_by_field="doc_id"
- limit = number_of_documents_you_want
- group_size = 2..5
“One product per brand” (catalog diversity)
- group_by_field="brand_id"
- limit = number_of_brands
- group_size = 1
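These patterns differ only in the grouping field and group_size, so one way to keep them consistent across a codebase is a small preset helper. The sketch below is hypothetical: grouping_params and the preset names are not part of any SDK, and the default of 3 chunks is just one value from the 2..5 range.

```python
def grouping_params(pattern, num_groups, chunks_per_group=3):
    """Build grouping keyword arguments for one of the common patterns.

    Hypothetical helper; the returned keys mirror the parameter names
    used in the search examples in this guide.
    """
    presets = {
        "one_doc_per_result": {"group_by_field": "doc_id", "group_size": 1},
        "few_chunks_per_doc": {"group_by_field": "doc_id", "group_size": chunks_per_group},
        "one_product_per_brand": {"group_by_field": "brand_id", "group_size": 1},
    }
    if pattern not in presets:
        raise ValueError(f"unknown pattern: {pattern!r}")
    return {"limit": num_groups, **presets[pattern]}


params = grouping_params("few_chunks_per_doc", num_groups=5)
print(params)  # {'limit': 5, 'group_by_field': 'doc_id', 'group_size': 3}
```

The resulting dict could then be unpacked into a search call, for example vbase.search(..., **params), assuming your SDK version accepts these parameter names.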