Overview
In VBase, there are two common ways to retrieve entities without doing a vector similarity search:
- Get: fetch entities by their primary key values (fast lookup when you already know the IDs).
- Scalar Query: fetch entities by filtering on non-vector fields (numbers, strings, booleans, timestamps, etc.).
Both methods return the matching entities, and you can choose which fields to return via output_fields.
Get entities by primary key
Use Get when you already have the primary keys and just want to fetch the full entities.
Typical use cases:
- hydrate an API response after you stored only IDs
- fetch a batch of entities for evaluation/debugging
- re-check the stored metadata for a known set of IDs
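For the hydration use case above, a caller usually wants rows back in the same order as the requested IDs, with a placeholder for any ID that is missing. The sketch below is one way to arrange that. It assumes (this is not stated above) that each returned row is dict-like and carries its primary key under "id", and that Get makes no promise about result order. The names chunked, hydrate, and fake_fetch are hypothetical, and fake_fetch is only a stand-in for a real client call.

```python
def chunked(ids, size):
    """Yield consecutive slices of `ids`, each at most `size` long."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def hydrate(ids, fetch, batch_size=100, key="id"):
    """Fetch rows in batches and return them in the order of `ids`.

    `fetch` is any callable that takes a list of IDs and returns an
    iterable of dict-like rows (assumed shape). Missing IDs map to None.
    """
    by_id = {}
    for batch in chunked(ids, batch_size):
        for row in fetch(batch):
            by_id[row[key]] = row
    return [by_id.get(i) for i in ids]


# Stand-in for a real lookup: ID 11 does not exist, and rows come back
# in a different order than requested.
fake_store = {
    10: {"id": 10, "color": "red"},
    12: {"id": 12, "color": "blue"},
}


def fake_fetch(batch):
    return [fake_store[i] for i in reversed(batch) if i in fake_store]


rows = hydrate([10, 11, 12], fake_fetch, batch_size=2)
```

With a live client, fetch could be something like lambda batch: vbase.get(collection_name="my_collection", ids=batch, output_fields=["id", "color"]). The basic call on its own looks like this: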
res = vbase.get(
    collection_name="my_collection",
    ids=[10, 11, 12],
    output_fields=["color", "vector"],
)
for row in res:
    print(row)

Get from a specific partition
If your collection is partitioned, you can restrict the lookup to specific partitions:
res = vbase.get(
    collection_name="my_collection",
    partition_names=["partitionA"],
    ids=[10, 11, 12],
    output_fields=["color"],
)

Scalar Query (filter by non-vector fields)
Use Query when you want to select entities by conditions on scalar fields.
For example, assume your collection has fields like:
- id (primary key)
- vector (embedding)
- color (string)
- price (number)
- in_stock (boolean)
Basic query
The example below returns up to 3 entities whose color value starts with "red".
res = vbase.query(
    collection_name="my_collection",
    filter='color like "red%"',
    output_fields=["id", "color", "vector"],
    limit=3,
)
for row in res:
    print(row)

Common filter patterns
These expressions are written as strings.
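Because a filter is an ordinary string, it can be assembled from Python values instead of being typed by hand, which avoids quoting mistakes when values come from user input. The helper below is only a sketch: the function names literal, cond, and all_of are hypothetical, and using JSON-style double quoting for strings is an assumption about what the filter syntax accepts, based solely on the examples in this section.

```python
import json


def literal(value):
    """Render a Python value as a filter literal (quoting rules are assumed)."""
    if isinstance(value, bool):
        # Check bool before int, because True and False are also ints.
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, str):
        # Double-quoted, with embedded quotes and backslashes escaped.
        return json.dumps(value, ensure_ascii=False)
    if isinstance(value, (list, tuple)):
        return "[" + ", ".join(literal(v) for v in value) + "]"
    raise TypeError(f"unsupported filter value: {value!r}")


def cond(field, op, value):
    """Build a single condition such as: price <= 100"""
    return f"{field} {op} {literal(value)}"


def all_of(*conditions):
    """Join conditions with 'and'."""
    return " and ".join(conditions)


flt = all_of(cond("in_stock", "==", True), cond("price", "<=", 100))
```

The resulting string can be passed as the filter argument. The patterns this helper reproduces, written by hand: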
# Equality
filter='color == "blue"'
# Boolean
filter='in_stock == true'
# Ranges
filter='price >= 10 and price < 50'
# Set membership
filter='color in ["red", "green", "blue"]'
# Prefix match
filter='color like "red%"'
# Combine conditions
filter='in_stock == true and price <= 100'

Query within partitions
res = vbase.query(
    collection_name="my_collection",
    partition_names=["partitionA"],
    filter='color like "red%"',
    output_fields=["id", "color"],
    limit=100,
)

Paging large result sets (Query Iterator)
If your query may return many rows, iterate over them in batches instead of requesting everything at once, so the full result set never has to fit in memory or in a single response.
A query iterator pattern:
it = vbase.query_iterator(
    collection_name="my_collection",
    filter='color like "red%"',
    output_fields=["id", "color"],
    batch_size=1000,
)
try:
    for batch in it:
        for row in batch:
            # process row
            pass
finally:
    it.close()

Tip: Use an iterator for backfills, exports, audits, or any workflow where you don’t know the result size upfront.
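As an illustration of the export case, the sketch below streams batches into a CSV file and uses contextlib.closing from the standard library in place of the explicit try/finally, so close() still runs if writing fails partway. FakeQueryIterator is a hypothetical stand-in that only mimics the behaviour shown above (iteration yields batches, and the object has a close() method); the assumption that rows are dict-like is not confirmed by this page, and export_csv is an illustrative name.

```python
import csv
import io
from contextlib import closing


class FakeQueryIterator:
    """Hypothetical stand-in: yields lists of dict rows and supports close()."""

    def __init__(self, rows, batch_size):
        self._rows = rows
        self._batch_size = batch_size
        self.closed = False

    def __iter__(self):
        for start in range(0, len(self._rows), self._batch_size):
            yield self._rows[start:start + self._batch_size]

    def close(self):
        self.closed = True


def export_csv(iterator, out, fields):
    """Write every row from a batch iterator to `out`; return the row count."""
    writer = csv.DictWriter(out, fieldnames=fields, extrasaction="ignore")
    writer.writeheader()
    total = 0
    # closing() calls iterator.close() on exit, even if an exception is raised.
    with closing(iterator) as it:
        for batch in it:
            writer.writerows(batch)
            total += len(batch)
    return total


rows = [{"id": i, "color": "red"} for i in range(5)]
fake = FakeQueryIterator(rows, batch_size=2)
buf = io.StringIO()
written = export_csv(fake, buf, ["id", "color"])
```

Only one batch is held in memory at a time, so the same function works whether the query matches five rows or five million.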
Random sampling with Query
To fetch a representative subset of a collection (for exploration, evaluation, or quick tests), you can sample during a query:
# Sample ~1% of the collection
res = vbase.query(
    collection_name="my_collection",
    filter="RANDOM_SAMPLE(0.01)",
    output_fields=["id", "color"],
)
print("sample size:", len(res))

You can also combine sampling with other filters:
res = vbase.query(
    collection_name="my_collection",
    filter='color like "red%" and RANDOM_SAMPLE(0.005)',
    output_fields=["id", "color"],
    limit=10,
)

Get vs Query
- Use Get when you already know the IDs.
- Use Query when you need filtering rules.
- Use Query Iterator when the result size can be large.
If you need nearest-neighbor similarity search, use a vector search (covered in the Search docs).