Overview

When you run a vector search, it’s common to get many results that are almost the same.

Example: you indexed your knowledge base by chunk, so a single document might produce dozens of chunks that all match the query. A normal search could return the “top 10” results… and 8 of them might come from the same document.

Grouping search solves this by letting you group results by a scalar field (like doc_id, user_id, product_id, or page_id) and then return:

  • limit groups (how many distinct groups you want), and
  • up to group_size entities per group (how many results you want from each group).

This is ideal for:

  • RAG / knowledge bases: return results from many documents, not just one.
  • E-commerce: return results across many products/brands instead of one popular SKU.
  • Multi-tenant apps: group by workspace_id / org_id to keep results balanced.

How it works

Grouping search adds three key knobs:

  • group_by_field: the field used to form groups (must be a scalar field).
  • limit: the number of groups you want returned.
  • group_size: the number of entities to return from each group.

The important detail about limit

In grouping search, limit means “number of groups” — not “number of rows”.

So if you set:

  • limit = 5
  • group_size = 2

You’ll get up to 10 results total (5 groups × 2 entities per group). The actual count can be lower when a group has fewer than 2 matching entities or fewer than 5 distinct groups match.
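
To make the arithmetic concrete, here is a minimal client-side sketch of the grouping semantics in plain Python. It is an illustration only, not how vbase performs grouping internally. The helper name group_hits and the sample hits are hypothetical, and the sketch assumes a smaller distance means a closer match.

```python
def group_hits(hits, group_by_field, limit, group_size):
    """Illustrative only: mimic grouping on an in-memory list of hits.

    `hits` is a list of dicts carrying a "distance" key and the grouping
    field. Assumes a smaller distance means a better match.
    """
    groups = {}  # dicts keep insertion order, so the best groups come first
    for hit in sorted(hits, key=lambda h: h["distance"]):
        key = hit[group_by_field]
        if key not in groups:
            if len(groups) == limit:
                continue  # already holding `limit` groups; ignore new ones
            groups[key] = []
        if len(groups[key]) < group_size:
            groups[key].append(hit)
    return groups


# Hypothetical sample data: document A dominates the raw ranking.
hits = [
    {"doc_id": "A", "chunk_id": 1, "distance": 0.10},
    {"doc_id": "A", "chunk_id": 2, "distance": 0.12},
    {"doc_id": "A", "chunk_id": 3, "distance": 0.13},
    {"doc_id": "B", "chunk_id": 1, "distance": 0.20},
    {"doc_id": "C", "chunk_id": 1, "distance": 0.25},
    {"doc_id": "C", "chunk_id": 2, "distance": 0.30},
]

grouped = group_hits(hits, "doc_id", limit=2, group_size=2)
total = sum(len(group) for group in grouped.values())
```

With limit=2 and group_size=2 the ceiling is 4 results, but this sample yields 3: document A contributes two chunks and document B has only one to offer.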

The examples below assume you already have a connected vbase client.

Example: group by doc_id to avoid duplicates

Use this pattern when your collection stores chunked text and you want diverse sources.

# Example only: parameter names may differ slightly depending on the Dodil SDK version.
# The key idea is:
# - group by a scalar field (doc_id)
# - limit controls groups
# - group_size controls results per group
results = vbase.search(
    collection_name="kb_chunks",
    data=[query_vector],
    anns_field="embedding",
    limit=5,                    # return up to 5 groups (distinct doc_id values)
    group_by_field="doc_id",    # group results by doc_id
    group_size=2,               # up to 2 chunks per document
    output_fields=["doc_id", "chunk_id", "text"],
)

for hit in results[0]:
    print(hit.entity.get("doc_id"), hit.entity.get("chunk_id"), hit.distance)
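
Once the grouped hits are back, a RAG pipeline usually wants the chunks bucketed by document. The sketch below shows one way to do that, assuming each hit has already been copied into a plain dict with the doc_id, chunk_id, and text output fields. The helper name chunks_by_document and the sample rows are hypothetical, and how you convert a hit into a dict depends on your SDK version.

```python
from collections import defaultdict


def chunks_by_document(rows):
    """Bucket chunk texts by doc_id, preserving the order rows arrived in."""
    by_doc = defaultdict(list)
    for row in rows:
        by_doc[row["doc_id"]].append(row["text"])
    return dict(by_doc)


# Hypothetical rows standing in for converted search hits.
rows = [
    {"doc_id": "guide", "chunk_id": 3, "text": "Install the CLI."},
    {"doc_id": "faq", "chunk_id": 1, "text": "Reset your password."},
    {"doc_id": "guide", "chunk_id": 4, "text": "Run the setup."},
]

context = chunks_by_document(rows)
```

Each key in context is one source document, so the prompt builder can cite or truncate per document rather than per chunk.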

Example: return one result per group

If you only need one “best” match per group, set group_size=1.

results = vbase.search(
    collection_name="kb_chunks",
    data=[query_vector],
    anns_field="embedding",
    limit=10,                   # 10 distinct documents
    group_by_field="doc_id",
    group_size=1,
)

Configure group size

group_size

group_size controls how many entities are returned within each group.

  • Larger values can make responses richer (more chunks per document).
  • But if your data is uneven (some groups have few rows), some groups may return fewer than group_size.
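
If downstream code assumes every group is full, it can help to check which groups came back short. This is a minimal sketch over plain dicts. The helper name underfilled_groups and the sample rows are hypothetical and not part of the SDK.

```python
from collections import Counter


def underfilled_groups(rows, group_by_field, group_size):
    """Return {group_key: returned_count} for groups below group_size."""
    counts = Counter(row[group_by_field] for row in rows)
    return {key: n for key, n in counts.items() if n < group_size}


# Hypothetical rows standing in for hits from a search with group_size=2.
rows = [
    {"doc_id": "A", "chunk_id": 1},
    {"doc_id": "A", "chunk_id": 2},
    {"doc_id": "B", "chunk_id": 1},
]

short = underfilled_groups(rows, "doc_id", group_size=2)
```

Here document B is reported because it returned one chunk instead of two.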

strict_group_size

strict_group_size controls whether the system tries to fill every group up to group_size.

  • When strict_group_size=True, the system will try to return exactly group_size entities per group (unless the group doesn’t have enough rows).
  • When strict_group_size=False (default), the system prioritizes returning the requested number of groups (limit), even if that means some groups contain fewer than group_size entities.

In practice:

  • Use strict_group_size=False for better performance and more predictable group coverage.
  • Use strict_group_size=True only if you truly need a consistent number of results per group.

results = vbase.search(
    collection_name="kb_chunks",
    data=[query_vector],
    anns_field="embedding",
    limit=5,
    group_by_field="doc_id",
    group_size=2,
    strict_group_size=False,
)

Considerations

Indexing requirement

Grouping search works only when the collection is indexed with one of these index types:

  • FLAT
  • IVF_FLAT
  • IVF_SQ8
  • HNSW
  • HNSW_PQ
  • HNSW_PRQ
  • HNSW_SQ
  • DISKANN
  • SPARSE_INVERTED_INDEX

If your collection has no index, grouping search will fail.
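
A pre-flight check can surface this before the search call fails. The sketch below encodes the list above as a set. The names GROUPING_INDEX_TYPES and supports_grouping are hypothetical, the case-insensitive comparison is an assumption, and how you look up a collection's index type depends on your SDK version.

```python
# Index types listed above as compatible with grouping search.
GROUPING_INDEX_TYPES = frozenset({
    "FLAT",
    "IVF_FLAT",
    "IVF_SQ8",
    "HNSW",
    "HNSW_PQ",
    "HNSW_PRQ",
    "HNSW_SQ",
    "DISKANN",
    "SPARSE_INVERTED_INDEX",
})


def supports_grouping(index_type):
    """Return True if index_type is on the supported list.

    Pass None when the collection has no index at all.
    """
    if index_type is None:
        return False
    return index_type.upper() in GROUPING_INDEX_TYPES


ok_hnsw = supports_grouping("HNSW")
ok_missing = supports_grouping(None)
```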

Performance tips

  • Keep limit reasonable: it controls how many groups are explored/returned.
  • Increase group_size only when needed: it multiplies the number of returned entities.
  • Prefer strict_group_size=False in uneven datasets: it’s typically faster.

If your query vectors already exist

If the vector you want to search with is already stored in the same collection, using an ID-based lookup (search by ids) can be more efficient than fetching vectors first and then searching.

Common patterns

“One document per result” (classic RAG)

  • group_by_field="doc_id"
  • limit = number_of_documents_you_want
  • group_size = 1

“Few chunks per document” (richer context)

  • group_by_field="doc_id"
  • limit = number_of_documents_you_want
  • group_size = 2..5

“One product per brand” (catalog diversity)

  • group_by_field="brand_id"
  • limit = number_of_brands
  • group_size = 1
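
The three patterns differ only in their grouping parameters, so they can be captured in one small builder. This is a sketch: the helper name grouping_kwargs is hypothetical, and the keyword names it returns are taken from the examples on this page, which may differ slightly between SDK versions.

```python
def grouping_kwargs(group_by_field, groups, per_group=1):
    """Build the grouping-related keyword arguments for a search call."""
    if groups < 1 or per_group < 1:
        raise ValueError("groups and per_group must both be at least 1")
    return {
        "group_by_field": group_by_field,
        "limit": groups,          # number of groups, not number of rows
        "group_size": per_group,  # entities returned per group
    }


classic_rag = grouping_kwargs("doc_id", groups=10)
richer_context = grouping_kwargs("doc_id", groups=5, per_group=3)
one_per_brand = grouping_kwargs("brand_id", groups=8)

# Upper bound on rows returned by the richer-context pattern.
max_rows = richer_context["limit"] * richer_context["group_size"]
```

The resulting dict can be unpacked into the search call alongside the collection and vector arguments, for example by passing **classic_rag.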