Metric and Index Selection Cheat Sheet

Last validated: 2026-05-11

Use this page when you need a fast decision for metric and index settings without reading full command docs.

30-Second Decision Path

  1. Need the most accurate baseline for validation? Use FLAT + L2.
  2. Need low-latency production defaults? Use HNSW + L2.
  3. Need to scale to larger datasets? Move to IVF_FLAT + L2.
  4. Need lower memory usage at very large scale? Try IVF_SQ8 + L2.
  5. Need explicit COSINE or IP search-time behavior? Use RunCommand fallback.
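
The decision path above can be written down as a small lookup, which is handy in setup scripts. This is only a sketch: the `recommend` helper, the `Recommendation` type, and the goal labels (`baseline`, `low_latency`, `scale`, `low_memory`) are hypothetical names invented for illustration and are not part of the dodil CLI. Pairing a non-L2 metric with the goal's index type is also an assumption; the path itself only says that COSINE and IP go through RunCommand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recommendation:
    index_type: str
    metric: str
    via_run_command: bool  # True when the metric needs the RunCommand fallback


# Hypothetical goal labels mapped to the index types from steps 1-4.
INDEX_BY_GOAL = {
    "baseline": "FLAT",
    "low_latency": "HNSW",
    "scale": "IVF_FLAT",
    "low_memory": "IVF_SQ8",
}

KNOWN_METRICS = {"L2", "IP", "COSINE"}


def recommend(goal: str, metric: str = "L2") -> Recommendation:
    """Return a starting configuration for a goal and metric intent."""
    if goal not in INDEX_BY_GOAL:
        raise ValueError(f"unknown goal: {goal!r}")
    metric = metric.upper()
    if metric not in KNOWN_METRICS:
        raise ValueError(f"unknown metric: {metric!r}")
    # Step 5: anything other than L2 is routed to RunCommand.
    return Recommendation(INDEX_BY_GOAL[goal], metric, via_run_command=(metric != "L2"))


production = recommend("low_latency")
cosine_at_scale = recommend("scale", "cosine")
```

Here `production` describes HNSW + L2 through the first-class CLI, while `cosine_at_scale` is flagged for RunCommand.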

Current CLI reality:

  • dodil vbase data search currently sends metric_type=L2.
  • First-class CLI index and search commands are easiest to use when the metric stays aligned to L2.

Metric Quick Reference

Let query vector be $q$ and candidate vector be $x$.

| Metric | Formula | Better score | Best when |
| --- | --- | --- | --- |
| L2 | $d_{L2}(q,x)=\lVert q-x\rVert_2$ | Smaller | You care about absolute geometric distance. |
| IP | $s_{IP}(q,x)=q\cdot x$ | Larger | Magnitude and direction both matter. |
| COSINE | $s_{cos}(q,x)=\frac{q\cdot x}{\lVert q\rVert_2\lVert x\rVert_2}$ | Larger | Angular similarity matters more than magnitude. |
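
To make the three formulas concrete, here is a minimal pure-Python sketch of each metric. The function names and sample vectors are illustrative only; they do not reflect how the service computes scores internally.

```python
import math


def _check_same_length(q, x):
    if len(q) != len(x):
        raise ValueError("vectors must have the same dimension")


def l2_distance(q, x):
    """Euclidean distance: smaller means more similar."""
    _check_same_length(q, x)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(q, x)))


def inner_product(q, x):
    """Dot product: larger means more similar."""
    _check_same_length(q, x)
    return sum(a * b for a, b in zip(q, x))


def cosine_similarity(q, x):
    """Dot product divided by both norms: larger means more similar."""
    norm = math.sqrt(inner_product(q, q)) * math.sqrt(inner_product(x, x))
    if norm == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return inner_product(q, x) / norm


q = [1.0, 0.0]
a = [2.0, 0.0]  # same direction as q, larger magnitude
b = [0.6, 0.8]  # unit length, different direction

# L2 ranks b ahead of a (b is geometrically closer to q),
# while IP and COSINE rank a ahead of b.
l2_prefers_b = l2_distance(q, b) < l2_distance(q, a)
ip_prefers_a = inner_product(q, a) > inner_product(q, b)
cos_prefers_a = cosine_similarity(q, a) > cosine_similarity(q, b)
```

The sample shows that the metrics can disagree on non-normalized data, which is why the metric intent should be decided before choosing an index.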

Normalization insight:

  • If vectors are unit-normalized, then $\lVert q-x\rVert_2^2 = 2 - 2s_{cos}(q,x)$.
  • In that case, L2 and COSINE produce identical rankings: the smallest L2 distance corresponds to the largest cosine similarity.
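
The identity can be checked numerically. The sketch below uses random test vectors with a fixed seed (an arbitrary choice for illustration), normalizes them, and compares the two rankings. For unit vectors the dot product equals the cosine similarity, so `dot` stands in for $s_{cos}$ here.

```python
import math
import random


def normalize(v):
    norm = math.sqrt(sum(a * a for a in v))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [a / norm for a in v]


def squared_l2(q, x):
    return sum((a - b) ** 2 for a, b in zip(q, x))


def dot(q, x):
    return sum(a * b for a, b in zip(q, x))


rng = random.Random(0)  # fixed seed so the check is repeatable
dim = 8
q = normalize([rng.gauss(0.0, 1.0) for _ in range(dim)])
candidates = [
    normalize([rng.gauss(0.0, 1.0) for _ in range(dim)]) for _ in range(20)
]

# Largest deviation from ||q - x||^2 = 2 - 2 * cos(q, x) over all candidates.
max_error = max(abs(squared_l2(q, x) - (2 - 2 * dot(q, x))) for x in candidates)

# Ranking by squared L2 (ascending) matches ranking by L2, since sqrt is monotonic.
rank_by_l2 = sorted(range(len(candidates)), key=lambda i: squared_l2(q, candidates[i]))
rank_by_cos = sorted(range(len(candidates)), key=lambda i: dot(q, candidates[i]), reverse=True)
```

With normalized data, `max_error` stays at floating-point noise level and both rankings list the candidates in the same order.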

Index Type Quick Reference

| Index type | Search style | Strength | Trade-off |
| --- | --- | --- | --- |
| FLAT | Exact | Maximum correctness baseline | Latency rises fastest as data grows |
| HNSW | Approximate graph ANN | Strong latency/recall default | Higher memory overhead |
| IVF_FLAT | Approximate inverted file | Good scale with controllable quality | Needs tuning for best recall |
| IVF_SQ8 | IVF + quantization | Better memory efficiency | More approximation loss |

| Situation | Starting config | Why |
| --- | --- | --- |
| New project, verify data quality | FLAT + L2 | Establish trusted baseline before optimization |
| General production vector search | HNSW + L2 | Balanced speed and quality |
| Growing dataset with latency pressure | IVF_FLAT + L2 | Better scale characteristics |
| Memory-constrained very large deployment | IVF_SQ8 + L2 | Reduces memory footprint |
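
One common way to use a FLAT baseline is to measure recall@k: the fraction of the exact top-k results that an approximate index also returns. The sketch below assumes you have already collected result IDs from a FLAT search and from an approximate search for the same query. The `recall_at_k` helper and the sample IDs are hypothetical, not output from the CLI.

```python
def recall_at_k(exact_ids, approx_ids, k):
    """Fraction of the exact top-k IDs that also appear in the approximate top-k."""
    if k <= 0:
        raise ValueError("k must be positive")
    exact_top = set(exact_ids[:k])
    approx_top = set(approx_ids[:k])
    if not exact_top:
        return 0.0
    return len(exact_top & approx_top) / len(exact_top)


# Hypothetical result IDs for one query, ordered best-first.
flat_results = [7, 3, 9, 1, 4]   # exact baseline (FLAT)
hnsw_results = [7, 9, 3, 8, 4]   # approximate index (for example HNSW)

recall_top3 = recall_at_k(flat_results, hnsw_results, 3)
recall_top5 = recall_at_k(flat_results, hnsw_results, 5)
```

In this sample the approximate index recovers all of the exact top 3 but misses one of the exact top 5. Averaging this value over many queries gives a more reliable picture than a single query.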

CLI and RunCommand Navigation

Use direct CLI path for standard operations:

  • dodil vbase collection ...
  • dodil vbase index ...
  • dodil vbase data search ...

Use RunCommand when you need search or index controls that the first-class CLI does not expose, especially explicit metric behavior beyond the CLI's L2 default.

RunCommand workflow:

Quick Sanity Checklist

  1. Metric intent is clear (L2, IP, or COSINE).
  2. Index type matches scale and latency goals.
  3. Vector dimension in data matches collection schema.
  4. Field names in insert/search match collection definition.
  5. If using first-class CLI search, metric expectation is consistent with L2 behavior.
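
Most of this checklist can be automated before data is loaded. The sketch below is one possible pre-flight check. The dictionary shapes for `schema`, `index`, `rows`, and `query` are assumptions made for illustration and do not mirror the actual dodil request format. Item 2 is reduced to checking that the index type is one of the four covered on this page, since fitness for scale and latency goals cannot be verified mechanically.

```python
KNOWN_METRICS = {"L2", "IP", "COSINE"}
KNOWN_INDEX_TYPES = {"FLAT", "HNSW", "IVF_FLAT", "IVF_SQ8"}


def sanity_check(schema, index, rows, query, uses_cli_search=True):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []

    metric = index.get("metric")
    if metric not in KNOWN_METRICS:
        problems.append(f"unknown metric: {metric!r}")
    if index.get("type") not in KNOWN_INDEX_TYPES:
        problems.append(f"unknown index type: {index.get('type')!r}")

    vector_field = schema["vector_field"]
    dim = schema["dim"]
    allowed = set(schema["fields"])

    for i, row in enumerate(rows):
        extra = set(row) - allowed
        if extra:
            problems.append(f"row {i}: fields not in schema: {sorted(extra)}")
        vec = row.get(vector_field)
        if vec is None or len(vec) != dim:
            problems.append(f"row {i}: vector must have dimension {dim}")

    if query["field"] != vector_field:
        problems.append(f"search field {query['field']!r} is not the vector field")
    if len(query["vector"]) != dim:
        problems.append(f"query vector must have dimension {dim}")

    # The first-class CLI search currently sends metric_type=L2.
    if uses_cli_search and metric != "L2":
        problems.append("index metric is not L2 but first-class CLI search sends L2")
    return problems


schema = {"vector_field": "embedding", "dim": 3, "fields": ["id", "embedding"]}
good = sanity_check(
    schema,
    {"type": "HNSW", "metric": "L2"},
    [{"id": 1, "embedding": [0.1, 0.2, 0.3]}],
    {"field": "embedding", "vector": [0.3, 0.2, 0.1]},
)
bad = sanity_check(
    schema,
    {"type": "HNSW", "metric": "COSINE"},
    [{"id": 2, "embedding": [0.1, 0.2]}],
    {"field": "embedding", "vector": [0.3, 0.2, 0.1]},
)
```

Here `good` is empty, while `bad` reports the wrong vector dimension and the metric mismatch with first-class CLI search.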