Skip to Content
We are live but in Staging 🎉
TablesSQL Compatibility

SQL Compatibility

K3 Tables speak DuckDB-flavored SQL. Anything DuckDB parses — joins, CTEs, window functions, JSON operators, set ops, subqueries — K3 can plan and dispatch. There are a few documented refusals (transactions, multi-statement), and the planner picks a strategy per statement based on the table’s primary key, partitioning, and the query shape.

This page is the SQL contract: what dialect, what types, what statement shapes are supported, what’s not, and how to connect a native client.

Dialect

Base SQL dialectDuckDB SQL
Driven byExecute RPC (POST /:bucket/tables/_execute) — sends one SQL string, returns structured results
Where DuckDB documents the dialectduckdb.org/docs/sql 
K3-specific divergencesSee Refusals below — the rest is DuckDB

Why DuckDB: it’s a single, well-documented, embeddable SQL parser + planner with strong analytical-workload focus. K3 hands DuckDB the SQL, gets a parsed plan back, then routes the operation across the write log + Delta tiers via its own dispatcher. You write standard DuckDB SQL; K3 worries about the tier routing.

Column types

Eleven column types — all are real DuckDB types under the hood, all serialize losslessly to Delta:

K3 typeDuckDB typeWire shape on Insert / Merge
stringVARCHARJSON string
intINTEGER (int32)JSON number
longBIGINT (int64)JSON number or string (large values)
shortSMALLINT (int16)JSON number
floatFLOAT (32-bit)JSON number
doubleDOUBLE (64-bit)JSON number
booleanBOOLEANJSON boolean
dateDATEJSON string "YYYY-MM-DD"
timestampTIMESTAMP (microseconds UTC)JSON number (µs since epoch) or "YYYY-MM-DD HH:MM:SS"
binaryBLOBbase64 string
jsonVARCHAR (with Arrow JSON extension tag)JSON value (object / array / scalar)

long values that exceed safe-integer range (±2^53 − 1) — pass as a JSON string to preserve precision. K3 stores int64 natively.

default_expression

Every column may carry a default expression — literal SQL applied at row-commit time when the column is absent from the input row. Validated at table creation:

Default expressionEffect
"GENERATE_UUID()"UUID string per row
"CURRENT_TIMESTAMP()"Insert-time timestamp
"ULID()"Lexicographically-sortable ID
"42" / "'pending'" / "true"Literal constants
"" (empty)Column required at write time unless nullable = true

Statement shapes — supported

What you can put in an Execute SQL string:

DDL

-- Create a table with a composite primary key + partition column + JSON column CREATE TABLE events ( id BIGINT NOT NULL, user_id VARCHAR NOT NULL, occurred_at TIMESTAMP NOT NULL, event_type VARCHAR NOT NULL, payload JSON, PRIMARY KEY (id, user_id) ) PARTITIONED BY (event_type); -- Add a new column ALTER TABLE events ADD COLUMN session_id VARCHAR; -- Drop a table DROP TABLE events; DROP TABLE IF EXISTS events; -- CTAS (CREATE TABLE AS SELECT) — schema inferred from the SELECT CREATE TABLE click_summary AS SELECT user_id, COUNT(*) AS n FROM events WHERE event_type = 'click' GROUP BY user_id;

Notes:

  • PRIMARY KEY is honored — inline column-level (id BIGINT PRIMARY KEY) and table-level (PRIMARY KEY (id, user_id)) both work. The key flows to the planner so subsequent writes get keyed routing for free.
  • PARTITIONED BY (col, …) controls Delta partitioning. Reads that filter on a partition column can prune entire partitions.
  • CTAS infers types from the SELECT — int-shaped JSON numbers → long, fractional → double, strings → string, bools → boolean. For typed migrations, pass explicit columns instead via the structured CreateTable RPC.

DML — writes

-- Single-row insert (keyed) INSERT INTO events (id, user_id, occurred_at, event_type, payload) VALUES (1, 'u-101', TIMESTAMP '2026-05-27 10:00:00', 'click', '{"page":"/pricing"}'); -- Bulk insert (keyed) INSERT INTO events (id, user_id, occurred_at, event_type, payload) VALUES (1, 'u-101', TIMESTAMP '2026-05-27 10:00:00', 'click', '{"page":"/pricing"}'), (2, 'u-101', TIMESTAMP '2026-05-27 10:01:00', 'click', '{"page":"/signup"}'), (3, 'u-102', TIMESTAMP '2026-05-27 10:02:00', 'purchase', '{"sku":"A-12","amount":49.99}'); -- INSERT … SELECT (keyed-from-select) INSERT INTO click_summary SELECT user_id, COUNT(*) FROM events WHERE event_type = 'click' GROUP BY user_id; -- UPDATE via PK (keyed → write log) UPDATE events SET event_type = 'click_pricing' WHERE id = 1 AND user_id = 'u-101'; -- UPDATE via non-PK predicate (non-keyed → ⚠️ Delta-only, see warnings) UPDATE events SET event_type = 'archived' WHERE occurred_at < TIMESTAMP '2025-01-01 00:00:00'; -- DELETE via PK (keyed) DELETE FROM events WHERE id = 1 AND user_id = 'u-101'; -- DELETE via non-PK predicate (non-keyed → ⚠️) DELETE FROM events WHERE occurred_at < TIMESTAMP '2025-01-01 00:00:00'; -- MERGE (upsert) — source is inline rows / a query / another table MERGE INTO events AS t USING (SELECT * FROM events_staging) AS s ON t.id = s.id AND t.user_id = s.user_id WHEN MATCHED THEN UPDATE SET event_type = s.event_type, payload = s.payload WHEN NOT MATCHED THEN INSERT (id, user_id, occurred_at, event_type, payload) VALUES (s.id, s.user_id, s.occurred_at, s.event_type, s.payload);

Routing rule of thumb: predicates / ON clauses that match the table’s merge_keys route through the write log (the planner picks a KEYED_* strategy). Predicates that don’t route directly to Delta — fast but bypass the log, which carries the WAL-overwrite risk discussed in Execute → UPDATE.

DML — reads

-- Simple SELECT SELECT * FROM events WHERE event_type = 'click' LIMIT 100; -- Aggregates (federated across partitions when applicable) SELECT user_id, COUNT(*) AS n FROM events GROUP BY user_id ORDER BY n DESC; -- Joins SELECT e.user_id, u.email, COUNT(*) AS clicks FROM events e JOIN users u ON e.user_id = u.id WHERE e.event_type = 'click' GROUP BY e.user_id, u.email; -- Window functions SELECT user_id, occurred_at, LAG(occurred_at) OVER (PARTITION BY user_id ORDER BY occurred_at) AS prev_event FROM events; -- CTEs (WITH …) WITH purchases AS ( SELECT user_id, payload->>'sku' AS sku, (payload->>'amount')::double AS amount FROM events WHERE event_type = 'purchase' ) SELECT sku, SUM(amount) AS total FROM purchases GROUP BY sku; -- Subqueries SELECT * FROM events WHERE user_id IN (SELECT id FROM users WHERE tier = 'pro'); -- Set ops (SELECT user_id FROM events WHERE event_type = 'click') INTERSECT (SELECT user_id FROM events WHERE event_type = 'purchase');

All standard DuckDB read shapes work. The planner picks UNARY_WAREHOUSE, UNARY_MERGED_STRONG, or FEDERATED_* depending on freshness, partitioning, and aggregate-vs-scan shape — see Execute → SELECT.

Read freshness

For SELECT plans, you control freshness via the freshness field on ExecuteRequest — not in the SQL itself:

ModeWhat it doesWhen to use
EVENTUAL (default)Delta-only scan; planner may distribute across partitionsAnalytical reads; OLAP workloads. Fast but misses uncompacted writes.
STRONGSingle-pod merge of write log + DeltaRead-your-writes. Pays log-listing cost; scope tightly.
OLTP_ONLYWrite log only (no Delta scan)Niche — point lookups during heavy un-drained backlog. Most callers want STRONG.

JSON columns

K3’s json column is a first-class semantic type. Stored as canonical JSON text under the hood, tagged so DuckDB recognizes it. Every DuckDB JSON operator works:

-- Field access SELECT payload->'sku' FROM events; -- returns JSON SELECT payload->>'sku' FROM events; -- returns TEXT -- JSON path SELECT json_extract(payload, '$.amount') FROM events; SELECT json_extract_string(payload, '$.sku') FROM events; -- Iterate over array elements / object entries SELECT json_each(payload) FROM events; -- Cast extracted scalars SELECT (payload->>'amount')::double AS amount FROM events; -- Round-trip a Postgres-style cast SELECT payload::json->'tags'->>0 FROM events;

For high-volume read paths, promote hot fields to typed columns; reserve json for the long tail / unstructured metadata.

Refusals

The following are valid DuckDB SQL but K3 refuses them:

RefusedWhyWhat to do instead
BEGIN / COMMIT / ROLLBACKNo transaction coordinator in K3 todayFor transactional workloads requiring multi-statement ACID, use Postgres. Individual Execute calls are atomic against Delta.
Multi-statement bodies (stmt1; stmt2;)One SQL per Execute callChain on the caller side — send one statement at a time.
ALTER TABLE DROP COLUMNNot implemented (parsed, then refused at planning)Recreate the table without the column (via CREATE TABLE AS SELECT) or leave the column nullable.
ALTER TABLE RENAME COLUMNNot implemented (parsed, then refused)Same as above — recreate via CTAS with the new name.

Multiple ADD COLUMNs in a single ALTER TABLE are supported.

Native clients

Two ways to send SQL to K3 today — both speak the same dialect:

HTTP — Execute RPC

curl -sS -X POST "https://k3.dev.dodil.io/kb-prod/tables/_execute" \ -H "Authorization: Bearer $DODIL_TOKEN" \ -H "Content-Type: application/json" \ -d '{ "bucket": "kb-prod", "sql": "SELECT event_type, COUNT(*) AS n FROM events GROUP BY event_type", "freshness": "FRESHNESS_EVENTUAL" }'

Response is a oneofquery / write / ddl — depending on the SQL kind. See API Reference → Execute.

CLI — dodil k3 table query

# Default freshness (eventual) dodil k3 table query --bucket kb-prod --sql " SELECT event_type, COUNT(*) AS n FROM events GROUP BY event_type " # Read-your-writes dodil k3 table query --bucket kb-prod --sql " SELECT * FROM events WHERE user_id = 'u-101' " --freshness strong # JSON output for scripting dodil k3 table query --bucket kb-prod --sql "SELECT * FROM events LIMIT 5" -o json

The CLI’s table query command handles both reads and writes — it picks the right RPC based on what the planner returns.

Structured shortcuts (no SQL string)

If you’d rather not assemble SQL strings, the typed RPCs (Insert / Merge / Update / DeleteRows / Query) and their CLI commands (dodil k3 table insert / merge / update / delete-rows / query) accept JSON rows + predicates / match-columns. The planner routes them through the same strategies — they’re just shortcuts for callers that prefer typed inputs to SQL templating. See API Reference → Data.

Coming-from-X cheatsheet

Coming fromK3 equivalent
Postgres INSERT ... ON CONFLICT (pk) DO UPDATEUse MERGE INTO ... USING ... ON t.pk = s.pk WHEN MATCHED THEN UPDATE ... WHEN NOT MATCHED THEN INSERT ...
Postgres transactionsNot supported — wrap retry / idempotency in your app layer
MySQL INSERT ... ON DUPLICATE KEY UPDATESame — use MERGE
BigQuery CREATE TABLE AS SELECTSame — CREATE TABLE … AS SELECT …
BigQuery MERGEIdentical syntax
DuckDB itselfIdentical syntax; K3 adds the HTAP routing layer
Spark SQL / Iceberg MERGESimilar — MERGE INTO ... USING ... WHEN MATCHED ... WHEN NOT MATCHED ...
Delta Lake external toolsTables are real Delta tables — external Delta-capable engines can read them by pointing at the bucket’s Delta directory. Writes are the planner’s job; don’t write to the Delta dir externally.

See also