
Overview

In VBase, there are two common ways to retrieve entities without doing a vector similarity search:

  • Get: fetch entities by their primary key values (fast lookup when you already know the IDs).
  • Scalar Query: fetch entities by filtering on non-vector fields (numbers, strings, booleans, timestamps, etc.).

Both methods return the matching entities, and you can choose which fields to return via output_fields.

Get entities by primary key

Use Get when you already have the primary keys and want to fetch the matching entities directly.

Typical use cases:

  • hydrate an API response after you stored only IDs
  • fetch a batch of entities for evaluation/debugging
  • re-check the stored metadata for a known set of IDs

res = vbase.get(
    collection_name="my_collection",
    ids=[10, 11, 12],
    output_fields=["color", "vector"],
)

for row in res:
    print(row)
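When hydrating an API response, you usually want the entities in the same order as the requested IDs, and you want to know which IDs were not found. This page does not say whether Get preserves the order of ids or how it reports missing ones, so the sketch below assumes neither. order_by_ids is a hypothetical helper, not part of VBase, and it assumes each returned row is a dict that carries its primary key under "id".

```python
def order_by_ids(rows, ids, key="id"):
    """Return (ordered_rows, missing_ids) for the requested ids.

    Hypothetical helper: assumes each row is a dict containing `key`.
    """
    by_id = {row[key]: row for row in rows}
    ordered = [by_id[i] for i in ids if i in by_id]
    missing = [i for i in ids if i not in by_id]
    return ordered, missing


# Stand-in for a Get result (not real VBase output).
rows = [{"id": 12, "color": "red"}, {"id": 10, "color": "blue"}]

ordered, missing = order_by_ids(rows, ids=[10, 11, 12])
print([row["id"] for row in ordered])  # [10, 12]
print(missing)                         # [11]
```

If your rows do not include the primary key by default, add it to output_fields before using a helper like this.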

Get from a specific partition

If you’re using partitions, you can restrict the lookup:

res = vbase.get(
    collection_name="my_collection",
    partition_names=["partitionA"],
    ids=[10, 11, 12],
    output_fields=["color"],
)

Scalar Query (filter by non-vector fields)

Use Query when you want to select entities by conditions on scalar fields.

For example, assume your collection has fields like:

  • id (primary key)
  • vector (embedding)
  • color (string)
  • price (number)
  • in_stock (boolean)

Basic query

The example below returns up to 3 entities where the color field starts with red.

res = vbase.query(
    collection_name="my_collection",
    filter='color like "red%"',
    output_fields=["id", "color", "vector"],
    limit=3,
)

for row in res:
    print(row)
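To make the semantics of the prefix filter and limit concrete, here is a local emulation over plain Python dicts. It does not call VBase; it assumes that "red%" behaves like a SQL-style LIKE pattern where % matches any trailing characters, and prefix_query is a hypothetical name used only for illustration.

```python
from itertools import islice


def prefix_query(rows, field, prefix, limit):
    """Local emulation of `field like "<prefix>%"` with a row limit."""
    matches = (
        row for row in rows
        if str(row.get(field, "")).startswith(prefix)
    )
    return list(islice(matches, limit))


# In-memory stand-in for a collection.
rows = [
    {"id": 1, "color": "red_1"},
    {"id": 2, "color": "blue_2"},
    {"id": 3, "color": "red_3"},
    {"id": 4, "color": "dark_red"},  # contains "red" but does not start with it
    {"id": 5, "color": "red"},
    {"id": 6, "color": "red_6"},
]

res = prefix_query(rows, field="color", prefix="red", limit=3)
print([row["id"] for row in res])  # [1, 3, 5]
```

The emulation returns the first three matches in list order. Which three entities VBase returns when more than limit rows match is not stated here, so do not rely on a particular order.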

Common filter patterns

Filter expressions are passed to Query as strings. Common forms:

# Equality
filter='color == "blue"'

# Boolean
filter='in_stock == true'

# Ranges
filter='price >= 10 and price < 50'

# Set membership
filter='color in ["red", "green", "blue"]'

# Prefix match
filter='color like "red%"'

# Combine conditions
filter='in_stock == true and price <= 100'
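Because filters are plain strings, building them by hand from user input makes quoting mistakes easy. The sketch below shows one way to generate the patterns above from Python values. The helper names (literal, eq, in_list, all_of) are hypothetical, and the code assumes that VBase accepts JSON-style literals (double-quoted strings, lowercase true/false), which matches the examples on this page but is not confirmed for every escape sequence.

```python
import json


def literal(value):
    """Render a Python value as a filter literal (assumes JSON-style syntax)."""
    return json.dumps(value)


def eq(field, value):
    return f"{field} == {literal(value)}"


def in_list(field, values):
    return f"{field} in {json.dumps(list(values))}"


def all_of(*conditions):
    """Join simple conditions with `and`."""
    return " and ".join(conditions)


print(eq("color", "blue"))                          # color == "blue"
print(eq("in_stock", True))                         # in_stock == true
print(in_list("color", ["red", "green", "blue"]))   # color in ["red", "green", "blue"]
print(all_of(eq("in_stock", True), "price <= 100")) # in_stock == true and price <= 100
```

Field names are inserted as-is, so only pass field names you control.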

Query within partitions

res = vbase.query(
    collection_name="my_collection",
    partition_names=["partitionA"],
    filter='color like "red%"',
    output_fields=["id", "color"],
    limit=100,
)

Paging large result sets (Query Iterator)

If your query may return many rows, iterate over the results in batches instead of requesting everything at once, so that your code only handles one batch at a time.

A query iterator pattern:

it = vbase.query_iterator(
    collection_name="my_collection",
    filter='color like "red%"',
    output_fields=["id", "color"],
    batch_size=1000,
)

try:
    for batch in it:
        for row in batch:
            # process row
            pass
finally:
    it.close()
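The try/finally pattern above can also be written with contextlib.closing from the Python standard library, which calls close() for you even if processing raises. The sketch below runs without VBase: FakeQueryIterator is a hypothetical stand-in that only mimics the two behaviors the pattern relies on (iterating yields batches, and close() exists).

```python
from contextlib import closing


class FakeQueryIterator:
    """Hypothetical stand-in for the object returned by query_iterator."""

    def __init__(self, rows, batch_size):
        self._rows = rows
        self._batch_size = batch_size
        self.closed = False

    def __iter__(self):
        # Yield the rows in consecutive slices of at most batch_size.
        for start in range(0, len(self._rows), self._batch_size):
            yield self._rows[start:start + self._batch_size]

    def close(self):
        self.closed = True


def count_rows(iterator):
    """Consume an iterator batch by batch and always close it."""
    total = 0
    with closing(iterator) as it:
        for batch in it:
            total += len(batch)
    return total


fake = FakeQueryIterator([{"id": i} for i in range(2500)], batch_size=1000)
total = count_rows(fake)
print(total, fake.closed)  # 2500 True
```

Replace the body of the inner loop with your own per-batch work, such as writing rows to a file during an export.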

Tip: Use an iterator for backfills, exports, audits, or any workflow where you don’t know the result size upfront.

Random sampling with Query

To fetch a representative subset of a collection (for exploration, evaluation, or quick tests), you can sample during a query:

# Sample ~1% of the collection
res = vbase.query(
    collection_name="my_collection",
    filter="RANDOM_SAMPLE(0.01)",
    output_fields=["id", "color"],
)

print("sample size:", len(res))
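The "~" in the comment above matters: with a sampling fraction, the number of returned rows is approximate, not exact. This page does not describe how VBase draws the sample, so the simulation below is only an illustration. It assumes independent per-row sampling, where each row is kept with probability equal to the fraction; bernoulli_sample is a hypothetical local function.

```python
import random


def bernoulli_sample(rows, fraction, seed=None):
    """Keep each row independently with probability `fraction`."""
    rng = random.Random(seed)
    return [row for row in rows if rng.random() < fraction]


rows = list(range(100_000))
sample = bernoulli_sample(rows, fraction=0.01, seed=0)

expected = int(len(rows) * 0.01)
print("expected about", expected, "rows, got", len(sample))
```

Under this assumption, the sample size fluctuates around 1,000 for a collection of 100,000 entities, so avoid code that depends on an exact count.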

You can also combine sampling with other filters:

res = vbase.query(
    collection_name="my_collection",
    filter='color like "red%" and RANDOM_SAMPLE(0.005)',
    output_fields=["id", "color"],
    limit=10,
)

Get vs Query

  • Use Get when you already know the IDs.
  • Use Query when you need filtering rules.
  • Use Query Iterator when the result size can be large.

If you need nearest-neighbor similarity search, use a vector search (covered in the Search docs).
