HNSW

HNSW (Hierarchical Navigable Small World) is a graph-based index designed for high-dimensional dense vectors (embeddings). It typically delivers excellent recall and low latency, but it uses more memory than many IVF-style indexes because it maintains a multi-layer graph.

If you want high accuracy with fast search, HNSW is often the first index to try.

How HNSW works (intuition)

Think of HNSW as a map with multiple zoom levels:

Top layers contain fewer points and let the search “jump” long distances quickly.
Bottom layer contains all points and does the final, fine-grained nearest-neighbor search.

A query starts from an entry point in the top layer, greedily moves to closer neighbors, then descends layer by layer until it reaches the bottom layer.

When to use HNSW

Use HNSW when:

You need high recall (you care about result quality).
You expect low-latency queries.
You can afford higher memory overhead.

Consider a different index (for example, IVF variants) when:

You have a very large dataset and memory is the primary constraint.
You can trade a bit of recall for smaller memory footprint and/or faster inserts.

Key parameters

HNSW performance is controlled by a few parameters:

Build-time parameters

M: maximum number of graph connections per node.
- Higher M → better recall (more paths to explore), but more memory and slower inserts/build.
efConstruction: candidate pool size during index construction.
- Higher efConstruction → better graph quality (often higher recall), but slower index build.

Search-time parameter

ef: how wide the search explores the graph (bottom layer).
- Higher ef → higher recall, but slower queries.

Recommended tuning (practical)

Start with sane defaults, measure recall/latency, then adjust:

Start with M = 16–64
Start with efConstruction = 100–360
Start with ef = max(64, top_k)

Rules of thumb:

If recall is too low → increase ef first (cheapest change), then efConstruction, then M.
If memory is too high or inserts are slow → decrease M (largest impact), then reduce efConstruction.
If queries are too slow → reduce ef.

Create an HNSW index in Dodil

Below is an example using the Dodil SDK (Python 3.10+).

Note: Dodil VBase is Milvus-backed, so index types and parameters follow Milvus semantics.


from dodil import Client
from dodil.vbase import VBaseConfig
 
# Authorize
c = Client(
    service_account_id="...",
    service_account_secret="...",
)
 
# Connect to your VBase database
vbase = c.vbase.connect(
    VBaseConfig(
        host="vbase-db-<id>.infra.dodil.cloud",
        port=443,
        scheme="https",
        db_name="db_<id>",
    )
)
 
# Build an HNSW index on a vector field (example: "embedding")
vbase.create_index(
    collection_name="my_collection",
    field_name="embedding",
    index_name="embedding_hnsw",
    index_type="HNSW",
    metric_type="COSINE",  # or "L2" / "IP"
    params={
        "M": 64,
        "efConstruction": 200,
    },
)

What these settings mean

metric_type defines how similarity is measured:
- COSINE for cosine similarity (common for normalized embeddings)
- L2 for Euclidean distance
- IP for inner product
params are HNSW-specific build settings:
- M: graph degree
- efConstruction: build candidate breadth

Search using HNSW

At query time, you mainly tune ef.


query_vec = [0.1, 0.2, 0.3, 0.4]  # example
 
results = vbase.search(
    collection_name="my_collection",
    anns_field="embedding",
    data=[query_vec],
    limit=10,
    search_params={
        "params": {
            "ef": 128,
        }
    },
)
 
for hit in results[0]:
    print(hit)

Choosing `ef`

If top_k = 10, try ef = 64 or ef = 128.
Increase ef until recall stops improving enough to justify the extra latency.

Operational notes

Memory usage is the main cost of HNSW. Plan capacity accordingly.
If you change index parameters (M, efConstruction), you typically need to rebuild the index.
If you see slow inserts, consider lowering M or efConstruction.

Quick checklist

Want best accuracy? Increase ef.
Want better index quality? Increase efConstruction.
Want higher recall but can spend memory? Increase M.
Want faster queries (accept lower recall)? Decrease ef.