Every collection in Dodil VBase needs a primary field — a single column that uniquely identifies every entity (row). Think of it like the primary key in a traditional database.
The primary field is used whenever you insert, upsert, query, or delete entities. Without a unique primary key, Dodil can’t reliably target a specific record.
What a primary field means in VBase
A primary field is:
- Exactly one per collection
- Non-nullable (every entity must have a value)
- Immutable in type (you choose the type when creating the collection and you can’t change it later)
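The uniqueness and non-null rules above can be checked on the client before a batch is sent. The following is a minimal sketch, not part of the Dodil SDK: `check_primary_keys` is a hypothetical helper, and it only inspects the batch you pass in, not the entities already stored in the collection.

```python
from typing import Any, Iterable


def check_primary_keys(rows: Iterable[dict[str, Any]], pk_field: str = "id") -> None:
    """Raise ValueError if any row lacks a primary key or repeats one (hypothetical helper)."""
    seen: set[Any] = set()
    for index, row in enumerate(rows):
        value = row.get(pk_field)
        if value is None:
            # Non-nullable: every entity must carry a value for the primary field.
            raise ValueError(f"row {index}: missing value for primary field {pk_field!r}")
        if value in seen:
            # Unique: the same key must not appear twice in one batch.
            raise ValueError(f"row {index}: duplicate primary key {value!r}")
        seen.add(value)


rows = [
    {"id": 1, "source": "page_1"},
    {"id": 2, "source": "page_2"},
]
check_primary_keys(rows)  # passes silently: both keys are present and distinct
```

Running a check like this before a manual-ID insert surfaces bad rows with a row index, which is usually easier to debug than a rejected batch.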
Supported primary field types
VBase supports two primary field types, the same ones most systems use for primary keys:
- INT64 (recommended): fast, compact, and perfect for AutoID.
- VARCHAR: best when you already have IDs from another system (e.g., user_123, SKU codes, document IDs).
If you use a string primary key, you’ll also set a maximum length for the field (covered in the schema docs).
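Because a VARCHAR primary field has a maximum length, it can help to scan existing IDs before choosing that limit. The sketch below is illustrative only: `oversized_ids` is a hypothetical helper, the limit of 32 is an arbitrary example value, and it assumes the limit is measured in UTF-8 bytes. This page does not say whether VBase counts bytes or characters, so bytes is used as the stricter reading.

```python
def oversized_ids(ids: list[str], max_length: int) -> list[str]:
    """Return the IDs whose UTF-8 encoding is longer than max_length bytes."""
    return [i for i in ids if len(i.encode("utf-8")) > max_length]


ids = ["user_123", "sku-A7", "document-" + "x" * 64]
too_long = oversized_ids(ids, max_length=32)  # 32 is an example limit, not a VBase default
print(too_long)  # only the third ID exceeds the limit
```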
AutoID vs manual IDs
You have two options when creating a collection:
Option A: AutoID (recommended)
With AutoID, VBase generates the primary key values for you. This is the simplest choice and avoids accidental ID collisions.
Use AutoID when:
- You don’t need IDs to match another system
- You want the easiest ingestion experience
- You expect very high write throughput
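With AutoID, the rows you send do not carry a primary key. If your source data already contains an `id` column, one approach is to drop it before inserting. This is a sketch under assumptions: `strip_primary_key` is a hypothetical helper, and how VBase reacts when an `id` is sent to an AutoID collection is not covered here.

```python
from typing import Any


def strip_primary_key(rows: list[dict[str, Any]], pk_field: str = "id") -> list[dict[str, Any]]:
    """Return copies of rows without the primary field, leaving the input untouched."""
    return [{k: v for k, v in row.items() if k != pk_field} for row in rows]


exported = [
    {"id": 17, "vector": [0.1, 0.2], "source": "page_1"},
    {"id": 42, "vector": [0.3, 0.4], "source": "page_2"},
]
payload = strip_primary_key(exported)
# `payload` can then be passed as `data=` when inserting into an AutoID collection.
```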
Option B: Manual IDs
With manual IDs, you provide the primary key in every insert/upsert.
Use manual IDs when:
- You already have stable IDs in your product (users, documents, SKUs, etc.)
- You want to re-ingest data and keep the exact same identifiers
- You need deterministic mapping between your database and your vector entities
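If your existing IDs are strings but you prefer an INT64 primary key, one way to keep a deterministic mapping is to hash the external ID. The sketch below is an assumption-laden illustration, not a VBase feature: `stable_int64_id` is a hypothetical helper built on Python's standard `hashlib`.

```python
import hashlib


def stable_int64_id(external_id: str) -> int:
    """Map an external string ID to a non-negative integer that fits in a signed INT64."""
    digest = hashlib.sha256(external_id.encode("utf-8")).digest()
    # Take the first 8 bytes, then clear the top bit so the value stays in [0, 2**63 - 1].
    return int.from_bytes(digest[:8], "big") & 0x7FFF_FFFF_FFFF_FFFF


row_id = stable_int64_id("user_123")  # same input always yields the same ID
```

Hash-derived IDs can collide, and the mapping is one-way. When either matters, a VARCHAR primary field that stores the original ID directly is the safer choice.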
Example: Create a collection with AutoID
Below is a minimal example using the Dodil SDK (Python 3.10+):
from dodil import Client
from dodil.vbase import VBaseConfig
# Authorize with a service account
c = Client(
    service_account_id="...",
    service_account_secret="...",
)
vbase = c.vbase.connect(
    VBaseConfig(
        host="vbase-db-<id>.infra.dodil.cloud",
        port=443,
        scheme="https",
        db_name="db_<id>",
    )
)
# AutoID means you don't send `id` in inserts.
vbase.create_collection(
    collection_name="docs_demo",
    dimension=1536,
    primary_field_name="id",
    id_type="int",  # INT64 primary key
    vector_field_name="vector",
    metric_type="COSINE",
    auto_id=True,
)
print(vbase.list_collections())
Example: Insert with manual IDs
If auto_id=False, you provide the id for each entity:
vbase.create_collection(
    collection_name="docs_manual_ids",
    dimension=1536,
    primary_field_name="id",
    id_type="int",
    vector_field_name="vector",
    metric_type="COSINE",
    auto_id=False,
)
vbase.insert(
    collection_name="docs_manual_ids",
    data=[
        {"id": 1, "vector": [0.1] * 1536, "source": "page_1"},
        {"id": 2, "vector": [0.2] * 1536, "source": "page_2"},
    ],
)
Practical tips
- If you’re unsure, start with INT64 + AutoID.
- If you need stable IDs across systems, use manual IDs.
- If you realize later you chose the wrong primary field type or AutoID mode, you’ll typically create a new collection and re-ingest (because primary field settings can’t be changed after creation).
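The re-ingest step in the last tip can be sketched as a batched insert into the new collection. This is a rough outline under stated assumptions: `batched` and `reingest` are hypothetical helpers, the batch size of 500 is arbitrary, and `rows` is assumed to have been exported from the old collection by some means not covered here. The only SDK call used is `vbase.insert(collection_name=..., data=...)` as shown earlier on this page.

```python
from typing import Any, Iterator


def batched(rows: list[dict[str, Any]], size: int) -> Iterator[list[dict[str, Any]]]:
    """Yield consecutive slices of rows with at most `size` items each."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def reingest(vbase: Any, rows: list[dict[str, Any]], collection_name: str,
             batch_size: int = 500) -> int:
    """Insert rows into collection_name in batches and return how many rows were sent."""
    sent = 0
    for batch in batched(rows, batch_size):
        # Assumes the new collection already exists with the corrected primary field settings.
        vbase.insert(collection_name=collection_name, data=batch)
        sent += len(batch)
    return sent
```

A call might look like `reingest(vbase, rows, "docs_demo_v2")`, where `docs_demo_v2` is a placeholder name for the replacement collection.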