A single-vector search is the most common way to use a vector database: you provide one embedding (your query), and Dodil VBase returns the top-K most similar vectors from a collection.
This is an Approximate Nearest Neighbor (ANN) search by default, meaning VBase uses your collection’s index to return results quickly at scale instead of comparing the query against every stored vector. The exact speed/recall tradeoff depends on the index type and its parameters.
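To make the speed/recall tradeoff concrete, here is a minimal local sketch of what an ANN search approximates: an exact, brute-force top-K by inner product, plus a recall measure. It runs in plain Python and does not call VBase. The helper names (exact_top_k, recall_at_k) and the sample data are hypothetical, chosen only for illustration.

```python
import heapq

def inner_product(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def exact_top_k(query, vectors, k):
    """Score every stored vector and return the k best (id, score) pairs.

    `vectors` maps an id to its embedding. Higher inner product ranks first.
    """
    scored = ((inner_product(query, vec), vec_id) for vec_id, vec in vectors.items())
    best = heapq.nlargest(k, scored)
    return [(vec_id, score) for score, vec_id in best]

def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact top-K ids that an approximate search also returned."""
    if not exact_ids:
        return 1.0
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)

# Hypothetical toy collection: id -> 2-dimensional embedding.
vectors = {
    1: [1.0, 0.0],
    2: [0.8, 0.6],
    3: [0.0, 1.0],
    4: [-1.0, 0.0],
}
query = [1.0, 0.2]

exact = exact_top_k(query, vectors, k=2)
exact_ids = [vec_id for vec_id, _ in exact]

# Suppose an approximate index returned ids 1 and 3 for the same query:
# it found one of the two true nearest neighbors.
print(exact_ids)
print(recall_at_k([1, 3], exact_ids))
```

Brute force touches every vector, so its cost grows with the collection size. An index skips most of that work and may occasionally miss a true neighbor, which is what recall measures.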
What you need before searching
- Python 3.10+
- The dodil package: pip install dodil
- A VBase database and an existing collection with a vector field (for example: vector)
If you haven’t connected to VBase yet, follow the connection guide first.
Run a single-vector search
In this example, we search the quick_setup collection using one query embedding. We ask for the top 3 results and use Inner Product (IP) as the similarity metric.
from dodil import Client
from dodil.vbase import VBaseConfig
# Authorize with your Dodil service account
c = Client(
    service_account_id="...",
    service_account_secret="...",
)
# Connect to your VBase database
vbase = c.vbase.connect(
    VBaseConfig(
        host="vbase-db-<id>.infra.dodil.cloud",
        port=443,
        scheme="https",
        db_name="db_<id>",
    )
)
# One query embedding (same dimension as your collection's vector field)
query_vector = [
    0.35803764,
    -0.6023496,
    0.18414013,
    -0.26286206,
    0.90294385,
]
# Single-vector search (top-K)
results = vbase.search(
    collection_name="quick_setup",
    anns_field="vector",
    data=[query_vector],
    limit=3,
    search_params={"metric_type": "IP"},
)
# results is typically a list (one entry per input vector)
for hits in results:
    for hit in hits:
        # Common fields you will see:
        # - id: primary key of the matched entity
        # - distance/score: similarity value (meaning depends on metric)
        # - entity: selected output fields (if requested)
        print(hit)
Understanding the response
You’ll usually get one list of hits per input vector. Since this is a single-vector search (we passed a list containing one vector), you’ll receive a list with one “hits” list.
Each hit contains:
- id: the primary key of the matched entity
- distance/score: the similarity value
  - With IP and COSINE, higher typically means “more similar”.
  - With L2, lower means “closer / more similar”.
- entity: the extra fields you requested via output_fields (optional)
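If the direction of each metric is hard to keep straight, the following local sketch computes all three for a toy query and ranks the candidates accordingly. It uses the textbook definitions in plain Python and is not VBase's implementation. The HIGHER_IS_BETTER table and the rank helper are hypothetical names. Whether VBase reports the L2 distance or its square is not stated here, but the ordering is the same either way.

```python
import math

def inner_product(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Inner product divided by the product of the two vector lengths."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return inner_product(a, b) / (norm_a * norm_b)

def l2_distance(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# True when a larger value means "more similar" for that metric.
HIGHER_IS_BETTER = {"IP": True, "COSINE": True, "L2": False}

def rank(scores, metric_type):
    """Order the keys of {name: score} from most to least similar."""
    return sorted(scores, key=scores.get, reverse=HIGHER_IS_BETTER[metric_type])

query = [1.0, 0.0]
candidates = {"near": [0.9, 0.1], "far": [-1.0, 0.0]}

ip_scores = {name: inner_product(query, vec) for name, vec in candidates.items()}
cos_scores = {name: cosine_similarity(query, vec) for name, vec in candidates.items()}
l2_scores = {name: l2_distance(query, vec) for name, vec in candidates.items()}

# The raw numbers move in opposite directions, but the ranking agrees.
print(rank(ip_scores, "IP"))
print(rank(cos_scores, "COSINE"))
print(rank(l2_scores, "L2"))
```

For IP and COSINE the "near" vector has the larger value, while for L2 it has the smaller one. All three rankings still put "near" first.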
Practical tips
- Metric must match your index and data. If your collection was indexed with a specific metric, search using the same metric.
- Top-K controls cost. Larger limit means more work. Start small (e.g., 5–20) and increase only if needed.
- Use filters when you can. If your schema has scalar fields (like tenant_id, category, created_at), adding a filter can improve both relevance and performance.
- Index + load matters. For large collections, make sure you have an index built and the collection is loaded (see Load and Release). Without a suitable index, searches can be slow.
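The first tip (metric must match your index and data) is easier to see with numbers. The sketch below shows that inner product on raw vectors depends on their length, while inner product on unit-length vectors equals cosine similarity. It is a local, plain-Python illustration with a hypothetical l2_normalize helper, and it makes no assumption about how your embeddings were produced or how VBase stores them.

```python
import math

def inner_product(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def l2_normalize(vec):
    """Scale a vector to unit length; return an all-zero vector unchanged."""
    norm = math.sqrt(inner_product(vec, vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]

def cosine_similarity(a, b):
    """Inner product divided by the product of the two vector lengths."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return inner_product(a, b) / (norm_a * norm_b)

a = [3.0, 4.0]
b = [4.0, 3.0]

raw_ip = inner_product(a, b)                                # grows with vector length
cosine = cosine_similarity(a, b)                            # depends on direction only
unit_ip = inner_product(l2_normalize(a), l2_normalize(b))   # IP on unit vectors

print(raw_ip, cosine, unit_ip)
```

If your embeddings are already unit length, IP and COSINE rank results the same way. If they are not, the two metrics can disagree, which is one reason to search with the metric the collection was indexed for.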