Dimension
Dimension (sometimes called embedding dimension) is the length of the vector produced for each chunk.
If your embedding has dimension 512, every chunk is represented as a vector of 512 numbers.
Dimension matters because it affects:
- Quality: higher dimensions can capture more nuance, which often improves retrieval, but the gains level off beyond a point.
- Cost & speed: higher dimensions increase storage size, network payloads, and compute, since every vector carries more numbers.
- Compatibility: your vector database collection/index must be created with the same dimension.
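To make the cost point concrete, the sketch below estimates raw vector storage at a few dimensions. It assumes each component is stored as a 4-byte float32 and ignores index overhead and metadata. The function name and the chunk count are illustrative and are not part of VNG.

```python
BYTES_PER_FLOAT32 = 4  # assumption: each vector component is stored as float32


def raw_vector_bytes(num_chunks: int, dimension: int) -> int:
    """Raw storage for num_chunks vectors, excluding index overhead and metadata."""
    return num_chunks * dimension * BYTES_PER_FLOAT32


# Compare a few dimensions for a hypothetical corpus of one million chunks.
for dim in (256, 768, 2048):
    mib = raw_vector_bytes(1_000_000, dim) / (1024 ** 2)
    print(f"{dim:>4} dims: {mib:,.0f} MiB")
```

Under this assumption raw storage grows linearly with dimension, so 2048-dimensional vectors take eight times the space of 256-dimensional ones.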
Where Dimension is set
You can set dimension on the job via:
embed_spec.dimension
If omitted, VNG uses the default dimension of the selected model.
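The fallback rule can be pictured as follows. This is only a sketch: the dictionary layout of embed_spec and the helper effective_dimension are hypothetical, and they illustrate the described behavior rather than VNG's actual job schema or implementation.

```python
from typing import Optional


def effective_dimension(embed_spec: dict, model_default: int) -> int:
    """Return the requested dimension, or the model default when none is set."""
    requested: Optional[int] = embed_spec.get("dimension")
    return model_default if requested is None else requested


# Hypothetical specs: one with an explicit dimension, one without.
print(effective_dimension({"dimension": 768}, model_default=2048))
print(effective_dimension({}, model_default=2048))
```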
Examples
These examples assume you have already connected to VNG (see Connect to VNG) and have a vng client instance.
1) Use the model default dimension
If you don’t set a dimension, VNG uses the model’s default dimension of 2048.
vecs = vng.embed(
    inputs=["Quickstart guide"],
)
print(len(vecs[0]))  # e.g., 2048
2) Set a common dimension for a knowledge base (e.g., 768)
vecs = vng.embed(
    inputs=["Company onboarding policy"],
    dim=768,
)
print(len(vecs[0]))  # 768
3) Smaller vectors to reduce storage (e.g., 256)
vecs = vng.embed(
    inputs=["FAQ: billing and invoices"],
    dim=256,
)
print(len(vecs[0]))  # 256
4) Larger vectors for maximum quality (e.g., 1536)
vecs = vng.embed(
    inputs=["Technical architecture overview"],
    dim=1536,
)
print(len(vecs[0]))  # 1536
5) Important: your VBase collection dimension must match
When you store embeddings in VBase, the collection/index dimension must match the vectors you generate.
A simple rule:
- If you embed with dim=768, create the VBase collection with dimension 768.
- If you change dimension later, create a new collection and re-embed.
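A mismatch is easiest to catch before anything is written. The helper below is a minimal sketch in plain Python; check_dimension is a hypothetical name rather than a VBase or VNG function, and the zero-filled vectors stand in for real output from vng.embed.

```python
from typing import Sequence


def check_dimension(vectors: Sequence[Sequence[float]], collection_dim: int) -> None:
    """Raise ValueError if any vector's length differs from the collection dimension."""
    for i, vec in enumerate(vectors):
        if len(vec) != collection_dim:
            raise ValueError(
                f"vector {i} has dimension {len(vec)}, "
                f"but the collection expects {collection_dim}"
            )


# Stand-in vectors; in practice these would come from vng.embed(...).
vecs = [[0.0] * 768, [0.0] * 768]
check_dimension(vecs, collection_dim=768)  # matching dimensions, no exception
```

Running a check like this just before the insert call turns a silent configuration drift into an immediate, readable error.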
Supported dimensions
Our multimodal embedding model supports dimensions in this range:
- 128 → 2048 (inclusive)
If you request a value outside this range, the job may fail validation.
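To fail fast on the client side, you could check the requested value before submitting a job. The sketch below hard-codes the 128 to 2048 range stated above; validate_dimension is a hypothetical helper, and whether every integer inside the range is accepted by the service is not something this check can confirm.

```python
MIN_DIMENSION = 128   # lower bound of the documented range
MAX_DIMENSION = 2048  # upper bound of the documented range (inclusive)


def validate_dimension(dim: int) -> int:
    """Return dim unchanged if it lies in the documented range, else raise."""
    if isinstance(dim, bool) or not isinstance(dim, int):
        raise TypeError(f"dimension must be an integer, got {type(dim).__name__}")
    if not MIN_DIMENSION <= dim <= MAX_DIMENSION:
        raise ValueError(
            f"dimension {dim} is outside the supported range "
            f"{MIN_DIMENSION}-{MAX_DIMENSION}"
        )
    return dim


print(validate_dimension(768))
```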
Choosing the right dimension
A practical starting point:
- 512 or 768 for a general-purpose knowledge base
- 128–256 if cost/storage is the top priority (smaller vectors)
- 1024–2048 if you need maximum quality and can afford larger vectors
Important note for your vector database
Your vector index must be created with a fixed dimension.
If you change the dimension later, you’ll need to:
- create a new collection/index with the new dimension, and
- re-embed (re-ingest) your content.
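The two steps can be sketched as a small migration routine. Everything here is an assumption made for illustration: the embed, create_collection, and insert callables are passed in as stand-ins for your VNG and vector database calls, and the "kb_512" style naming scheme is hypothetical.

```python
from typing import Callable, List, Sequence


def reembed_into_new_collection(
    texts: Sequence[str],
    new_dim: int,
    embed: Callable[[List[str], int], List[List[float]]],
    create_collection: Callable[[str, int], None],
    insert: Callable[[str, List[List[float]]], None],
    base_name: str = "kb",
    batch_size: int = 64,
) -> str:
    """Create a fresh collection for new_dim and re-embed every text into it."""
    name = f"{base_name}_{new_dim}"  # hypothetical naming scheme, e.g. "kb_512"
    create_collection(name, new_dim)
    for start in range(0, len(texts), batch_size):
        batch = list(texts[start:start + batch_size])
        vectors = embed(batch, new_dim)
        # Guard against an embedder that ignores the requested dimension.
        if any(len(v) != new_dim for v in vectors):
            raise ValueError("embedding dimension does not match the new collection")
        insert(name, vectors)
    return name


# In-memory fakes so the sketch runs without any external service.
store = {}


def fake_embed(batch, dim):
    return [[0.0] * dim for _ in batch]


def fake_create(name, dim):
    store[name] = []


def fake_insert(name, vectors):
    store[name].extend(vectors)


name = reembed_into_new_collection(
    ["a", "b", "c"], 512, fake_embed, fake_create, fake_insert, batch_size=2
)
print(name, len(store[name]))
```

Writing into a separately named collection keeps the old index available until the new one is fully populated, so you can switch readers over once re-embedding has finished.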