dodil k3 table — maintenance
Operational table maintenance. Three commands — optimize packs small Delta files, vacuum reclaims space past retention, compact drains the write log into Delta on demand. K3 runs the log → Delta drain automatically in the background; you’ll reach for these for explicit packing, space reclamation, and post-batch sequencing.
Persistent flag on the group: --bucket / -b (required).
CLI doesn’t yet cover
Restore (time travel) or History (commit log); both live on the Maintenance API. See Recipes → Time travel + Restore for the workflow.
dodil k3 table optimize
dodil k3 table optimize [name] -b BUCKET [--target-file-size-mb N] [--z-order-column COL ...]
Two modes:
- Bin-pack (default): coalesce small Delta files up to `--target-file-size-mb` per partition.
- Z-order: set `--z-order-column` to rewrite all files clustered by those columns. Improves read locality for queries that filter on them.
| Flag | Type | Default | Description |
|---|---|---|---|
| `--target-file-size-mb` | int64 | 128 | Target file size after bin-pack |
| `--z-order-column` | string list (repeat) | [] | Columns to cluster by. Setting any value switches mode to z_order. |
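Because `--z-order-column` is a repeated flag, scripts that hold the columns as a single comma-separated string need to expand it first. A minimal sketch of that expansion, assuming a hypothetical helper named `zorder_flags` (not part of the CLI):

```shell
#!/bin/sh
# zorder_flags is a hypothetical helper, not a dodil command.
# It turns "a,b" into "--z-order-column a --z-order-column b".
zorder_flags() {
  old_ifs=$IFS
  IFS=,
  set -- $1            # split the first argument on commas
  IFS=$old_ifs
  flags=""
  for col in "$@"; do
    flags="$flags --z-order-column $col"
  done
  echo "${flags# }"    # drop the leading space
}

zorder_flags event_type,user_id
```

The printed flags can be appended to a `dodil k3 table optimize` invocation unquoted so the shell splits them into separate arguments.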
# Bin-pack with default 128 MB target
dodil k3 table optimize events --bucket kb-prod
# Larger target (256 MB) for analytical workloads
dodil k3 table optimize events --bucket kb-prod --target-file-size-mb 256
# Z-order for queries that filter on (event_type, user_id)
dodil k3 table optimize events --bucket kb-prod \
  --z-order-column event_type --z-order-column user_id
Reading the response:
| Field | Meaning |
|---|---|
| `bytesRemoved - bytesAdded` | Compaction savings; typically positive on small-files-heavy tables |
| `totalFilesSkipped / totalConsideredFiles` | Skip ratio; high = already well-packed |
| `partitionsOptimized` | 0 = no-op |
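The two derived values in the table above are simple arithmetic on the response counters. A minimal sketch, assuming you have already extracted the four numbers from the response (for example with `jq`); `optimize_summary` is a hypothetical helper and the sample values are made up for illustration:

```shell
#!/bin/sh
# optimize_summary is a hypothetical helper, not a dodil command.
# Arguments: bytesRemoved bytesAdded totalFilesSkipped totalConsideredFiles
optimize_summary() {
  removed=$1 added=$2 skipped=$3 considered=$4
  saved=$((removed - added))                    # compaction savings in bytes
  if [ "$considered" -gt 0 ]; then
    skip_pct=$((100 * skipped / considered))    # integer percentage
  else
    skip_pct=0                                  # nothing considered, avoid dividing by zero
  fi
  echo "saved_bytes=$saved skip_pct=$skip_pct"
}

# Illustrative values only, not real CLI output.
optimize_summary 1048576 262144 90 100
```

A skip percentage near 100 together with a small or negative savings figure suggests the table was already well-packed and the run was close to a no-op.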
When to call: after large bulk writes / nightly batches / migrations. Routinely-compacted HTAP tables stay well-packed automatically.
dodil k3 table vacuum
dodil k3 table vacuum [name] -b BUCKET [--retention-hours N] [--dry-run] [--disable-retention-check]
Permanently delete old Delta file versions past retention. Destructive: vacuumed files are gone, and the corresponding Delta versions can no longer be restored.
| Flag | Type | Default | Description |
|---|---|---|---|
| `--retention-hours` | int64 | 168 (7 days) | Retention window |
| `--dry-run` | bool | false | Preview files that would be deleted |
| `--disable-retention-check` | bool | false | Bypass Delta’s safety net (in-flight readers and time-travel queries can break; operator scripts only) |
# ALWAYS dry-run first
dodil k3 table vacuum events --bucket kb-prod --dry-run -o json \
| jq '{filesDeleted, filesDeletedPaths: .filesDeletedPaths[0:5]}'
# Real run after reviewing
dodil k3 table vacuum events --bucket kb-prod
# Custom retention
dodil k3 table vacuum events --bucket kb-prod --retention-hours 720 # 30 days
`--disable-retention-check` bypasses Delta’s safety net. Delta refuses to vacuum files younger than 168 h because in-flight readers and time-travel queries break once the files are gone. Set it only for forced cleanups on quiesced tables where you accept that risk.
Vacuum interacts with Restore — once you vacuum past version N, restore to version < N is no longer possible. The safe pattern is documented at Recipes → Time travel + Restore.
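Retention policies are often stated in days while `--retention-hours` takes hours, and a window under 168 h runs into the retention check described above. A minimal sketch of a wrapper that converts days to hours and refuses short windows before the CLI is ever called; `vacuum_cmd` is a hypothetical helper that only prints the command rather than running it:

```shell
#!/bin/sh
# vacuum_cmd is a hypothetical helper, not a dodil command.
# It prints a vacuum command for a retention window given in days and
# rejects windows under 7 days (168 h), the threshold below which the
# retention check blocks the run unless it is explicitly disabled.
vacuum_cmd() {
  table=$1 bucket=$2 days=$3
  hours=$((days * 24))
  if [ "$hours" -lt 168 ]; then
    echo "retention ${hours}h is below the 168h safety window" >&2
    return 1
  fi
  echo "dodil k3 table vacuum $table --bucket $bucket --retention-hours $hours"
}

vacuum_cmd events kb-prod 30
```

Printing the command instead of executing it keeps the helper safe to use in reviews; append `--dry-run` to the printed line for the preview step.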
dodil k3 table compact
dodil k3 table compact [name] -b BUCKET [--batch-size N]
Force the write log to drain into Delta now. K3 runs this in the background automatically; call it manually in these cases:
- After bulk writes: materialize rows in Delta immediately for analytical reads
- Before maintenance: `compact` → `optimize` is the canonical sequence post-batch
- Tests / e2e: drain the log to assert against `describe`’s `lastDrain*` counters
| Flag | Type | Default | Description |
|---|---|---|---|
| `--batch-size` | int64 | 10000 | Max write-log entries processed per tick |
# Single drain tick
dodil k3 table compact events --bucket kb-prod
# Loop until fully drained — `truncated: false` means done
while dodil k3 table compact events --bucket kb-prod -o json \
| jq -e '.truncated == true' > /dev/null; do
echo " more entries to drain..."
done
Response fields:
| Field | Meaning |
|---|---|
| `walEntriesProcessed` | Total log entries read this tick |
| `walUniqueKeys` | Distinct PKs after newest-per-key dedup |
| `rowsMerged` | Rows MERGEd into Delta |
| `rowsRejected` | Rejected by schema validation |
| `tombstonesSeen` | DELETE tombstones encountered |
| `truncated` | `true` if `--batch-size` was hit; re-run |
| `lastDrainTargetVersion` | Delta version after the drain |
walUniqueKeys < walEntriesProcessed is normal — the compactor dedupes multiple writes to the same PK (newest-wins) before MERGE.
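For unattended scripts it can help to bound the drain loop and to tell a failed tick apart from a finished drain. A minimal sketch of such a loop, under two assumptions that the reference does not confirm: that the CLI exits non-zero on failure, and that the JSON output contains a literal `"truncated": true` pair when more entries remain. `drain_until_done` is a hypothetical helper, and it matches the field with `grep` rather than `jq` only to keep dependencies minimal:

```shell
#!/bin/sh
# drain_until_done is a hypothetical helper, not a dodil command.
# Runs compact ticks until one reports truncated=false.
# Returns 0 when drained, 1 if a tick fails, 2 if max_ticks is reached.
drain_until_done() {
  table=$1 bucket=$2 max_ticks=${3:-100}
  tick=0
  while [ "$tick" -lt "$max_ticks" ]; do
    tick=$((tick + 1))
    # Assumption: a failed tick makes the CLI exit non-zero.
    out=$(dodil k3 table compact "$table" --bucket "$bucket" -o json) || {
      echo "compact failed on tick $tick" >&2
      return 1
    }
    # Anything other than "truncated": true is treated as fully drained.
    if ! printf '%s\n' "$out" | grep -Eq '"truncated"[[:space:]]*:[[:space:]]*true'; then
      echo "drained after $tick tick(s)"
      return 0
    fi
  done
  echo "still truncated after $max_ticks tick(s)" >&2
  return 2
}

# Usage: drain_until_done events kb-prod 50
```

The tick cap is a design choice for pipelines where writers keep appending to the log while the drain runs; without it the loop could chase a moving target indefinitely.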
Canonical post-batch sequence
After a big bulk write:
# 1. Drain the write log into Delta
while dodil k3 table compact events --bucket kb-prod -o json \
| jq -e '.truncated == true' > /dev/null; do :; done
# 2. Bin-pack the resulting small files
dodil k3 table optimize events --bucket kb-prod
# 3. (Optional) Periodically — vacuum old versions past retention
dodil k3 table vacuum events --bucket kb-prod --dry-run # always preview first
dodil k3 table vacuum events --bucket kb-prod
See also
- Maintenance — API Reference: full surface incl. `Restore` and `History`
- `dodil k3 table` — lifecycle: `describe` to see drain-lag + last-drain stats
- `dodil k3 table` — data: what produces write-log entries in the first place
- Recipes → Time travel + Restore: vacuum / restore interaction