Core Concepts — Object Storage

The Object Storage domain is defined by the proto package dodil.k3.storage.v1. From the proto itself:

Owns: buckets, bucket policy (S3-style ACL), CORS, and the S3-like admin surface for objects (list, info, delete, presigned URL). Does NOT own: sources or their credentials (see dodil.k3.source.v1), rules / sync / ingest jobs (see dodil.k3.ingest.v1), or the pillar engines (vector / warehouse). Storage stays thin: the data plane, not the intelligence plane.

Each entity below has a gRPC view (the proto message) and an HTTP view (the JSON your client sees on the wire). Use the toggle at the top of each section to switch between them; your choice persists across the page and across visits.

What to expect in the HTTP view:

  • Field names are camelCase (e.g. storageUsedBytes).
  • Enums are wire-name strings (e.g. "BUCKET_STATUS_ACTIVE").
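  • int64 values are JSON strings (e.g. "1048576"), because JavaScript numbers cannot represent integers above 2^53 − 1 without precision loss.
  • Every field is always present; defaults are not omitted.
  • Timestamps are Unix milliseconds, serialized as strings like other int64 values (e.g. "1700000000000").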
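
The conventions above mean a client usually has to convert a few fields after parsing. The following is a minimal sketch of that step, not an official client: the helper name decode_bucket and the two field sets are assumptions for illustration, with field names taken from the Bucket example on this page.

```python
import json
from datetime import datetime, timezone

# Assumed field lists, copied from the Bucket example on this page.
INT64_FIELDS = {"storageUsedBytes", "storageQuotaBytes", "objectCount",
                "sourceCount", "ruleCount", "indexCount"}
TIMESTAMP_FIELDS = {"createdAt", "updatedAt"}


def decode_bucket(payload: str) -> dict:
    """Parse a Bucket JSON document and convert string-encoded numbers."""
    raw = json.loads(payload)
    decoded = dict(raw)
    for name in INT64_FIELDS:
        if name in raw:
            # int64 values arrive as JSON strings; Python ints are unbounded.
            decoded[name] = int(raw[name])
    for name in TIMESTAMP_FIELDS:
        if name in raw:
            # Unix milliseconds -> timezone-aware UTC datetime.
            decoded[name] = datetime.fromtimestamp(int(raw[name]) / 1000,
                                                   tz=timezone.utc)
    return decoded


sample = ('{"name": "demo", "status": "BUCKET_STATUS_ACTIVE", '
          '"storageUsedBytes": "1048576", "objectCount": "42", '
          '"createdAt": "1700000000000"}')
bucket = decode_bucket(sample)
print(bucket["storageUsedBytes"], bucket["createdAt"].isoformat())
```

Enum fields such as status are left as wire-name strings here; mapping them to a local enum type is a separate choice for the client.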

Bucket

The central entity. Most K3 APIs are bucket-scoped — sources, rules, vector engines, and table engines all attach here.

{
  "name": "demo",
  "description": "Workflow bucket",
  "status": "BUCKET_STATUS_ACTIVE",
  "internalSourceId": "src_01HEFGHIJK",
  "storageUsedBytes": "1048576",
  "storageQuotaBytes": "0",
  "objectCount": "42",
  "sourceCount": "3",
  "ruleCount": "2",
  "indexCount": "1",
  "createdAt": "1700000000000",
  "updatedAt": "1700001000000",
  "accessMode": "BUCKET_ACCESS_MODE_PRIVATE"
}

Notes:

  • internal_source_id references an auto-created INTERNAL_S3 source managed by the Source service, so a direct S3 PUT triggers the same ingest rules as an external sync.
  • The *_count and storage_used_bytes fields are server-maintained aggregates.

Object

Type: ObjectInfo. The metadata view returned by the admin surface (ListObjects, GetObjectInfo). The bytes themselves flow through the K3 HTTP gateway, not the gRPC API.

{
  "bucket": "demo",
  "key": "docs/sample.pdf",
  "size": "1048576",
  "etag": "\"d41d8cd98f00b204e9800998ecf8427e\"",
  "contentType": "application/pdf",
  "lastModified": "1700000500000",
  "userMetadata": { "source": "drive" },
  "storageClass": "STANDARD",
  "indexStatus": "indexed",
  "pipelineStatuses": [
    {
      "ruleId": "rule_01HEFG",
      "ruleName": "docs-rule",
      "pipelineKind": "vector",
      "indexStatus": "indexed",
      "destinationId": "col_01HIJK"
    }
  ]
}

Notes:

  • pipeline_statuses is the bridge between the storage plane and the ingestion plane: each entry tells you which rule fired for this object, what kind of pipeline ran, and the current index state.
  • pipeline_kind is a flat string ("vector" / "warehouse" / "free") so callers can stay decoupled from the ingest service types.
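
Because pipeline_kind is a plain string, a caller can inspect pipeline_statuses without importing any ingest types. The sketch below groups the entries of an ObjectInfo by kind and checks whether every pipeline reports "indexed". The helper names are hypothetical, and "indexed" is the only index state shown on this page, so other state names are not assumed.

```python
from collections import defaultdict


def group_by_pipeline_kind(object_info: dict) -> dict:
    """Group an object's pipeline status entries by their pipelineKind string."""
    grouped = defaultdict(list)
    for status in object_info.get("pipelineStatuses", []):
        grouped[status["pipelineKind"]].append(status)
    return dict(grouped)


def fully_indexed(object_info: dict) -> bool:
    """True when every pipeline entry reports "indexed".

    Note: an object with no pipeline entries also returns True (vacuously).
    """
    return all(s["indexStatus"] == "indexed"
               for s in object_info.get("pipelineStatuses", []))


# Trimmed copy of the ObjectInfo example from this page.
object_info = {
    "bucket": "demo",
    "key": "docs/sample.pdf",
    "pipelineStatuses": [
        {
            "ruleId": "rule_01HEFG",
            "ruleName": "docs-rule",
            "pipelineKind": "vector",
            "indexStatus": "indexed",
            "destinationId": "col_01HIJK",
        }
    ],
}

grouped = group_by_pipeline_kind(object_info)
print(sorted(grouped), fully_indexed(object_info))
```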

Policy

Type: BucketPolicy. S3-style access control, modeled on the AWS bucket-policy contract. Bound to a bucket via SetBucketPolicy; BucketAccessMode.CUSTOM on Bucket signals a policy is in effect.

{
  "version": "2024-01-01",
  "statements": [
    {
      "sid": "PublicRead",
      "effect": "POLICY_EFFECT_ALLOW",
      "principal": { "aws": ["*"] },
      "actions": ["s3:GetObject"],
      "resources": ["demo/public/*"]
    }
  ]
}

Notes:

  • actions use S3 action names (s3:GetObject, s3:PutObject, …); evaluation follows S3 semantics.
  • principal.aws accepts "*" for anonymous access or one or more org IDs for tenant-scoped policies.
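
To make "evaluation follows S3 semantics" concrete, here is a simplified client-side sketch of the usual S3 rule order: an explicit deny wins, otherwise any matching allow grants access, and the default is deny. It is not the server's evaluator. The function names are hypothetical, the value "POLICY_EFFECT_DENY" is assumed by analogy with POLICY_EFFECT_ALLOW, and wildcard matching uses Python's fnmatch, which is broader than S3's (it also treats "[" specially). Conditions and other policy features are ignored.

```python
from fnmatch import fnmatchcase


def _matches(patterns, value):
    """True if value matches any pattern (shell-style wildcards)."""
    return any(fnmatchcase(value, pattern) for pattern in patterns)


def is_allowed(policy: dict, principal: str, action: str, resource: str) -> bool:
    """Simplified evaluation: explicit deny > allow > default deny."""
    allowed = False
    for statement in policy.get("statements", []):
        applies = (
            _matches(statement["principal"]["aws"], principal)
            and _matches(statement["actions"], action)
            and _matches(statement["resources"], resource)
        )
        if not applies:
            continue
        if statement["effect"] == "POLICY_EFFECT_DENY":  # assumed enum name
            return False
        if statement["effect"] == "POLICY_EFFECT_ALLOW":
            allowed = True
    return allowed


# The BucketPolicy example from this page.
policy = {
    "version": "2024-01-01",
    "statements": [
        {
            "sid": "PublicRead",
            "effect": "POLICY_EFFECT_ALLOW",
            "principal": {"aws": ["*"]},
            "actions": ["s3:GetObject"],
            "resources": ["demo/public/*"],
        }
    ],
}

print(is_allowed(policy, "anonymous", "s3:GetObject", "demo/public/logo.png"))
print(is_allowed(policy, "anonymous", "s3:PutObject", "demo/public/logo.png"))
```

With the PublicRead statement, reads under demo/public/ are allowed for any principal, while writes and reads outside that prefix fall through to the default deny.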

CORS

Type: BucketCorsConfiguration. Mirrors S3 CORS one-to-one.

{
  "corsRules": [
    {
      "allowedOrigins": ["https://app.example.com"],
      "allowedMethods": ["GET", "PUT"],
      "allowedHeaders": ["*"],
      "exposeHeaders": ["ETag"],
      "maxAgeSeconds": 3600
    }
  ]
}
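
Since the configuration mirrors S3 CORS, a preflight request can be reasoned about the way S3 documents it: the first rule whose origin, method, and requested headers all match is the one that applies. The sketch below illustrates that matching on the client side, assuming K3 follows the same order. The function name find_cors_rule is hypothetical, and fnmatch wildcards are a looser stand-in for S3's single "*" wildcard.

```python
from fnmatch import fnmatchcase


def find_cors_rule(config: dict, origin: str, method: str, request_headers=()):
    """Return the first CORS rule matching the request, or None."""
    for rule in config.get("corsRules", []):
        origin_ok = any(fnmatchcase(origin, pattern)
                        for pattern in rule.get("allowedOrigins", []))
        method_ok = method in rule.get("allowedMethods", [])
        # Header names are compared case-insensitively.
        headers_ok = all(
            any(fnmatchcase(header.lower(), pattern.lower())
                for pattern in rule.get("allowedHeaders", []))
            for header in request_headers
        )
        if origin_ok and method_ok and headers_ok:
            return rule
    return None


# The BucketCorsConfiguration example from this page.
cors_config = {
    "corsRules": [
        {
            "allowedOrigins": ["https://app.example.com"],
            "allowedMethods": ["GET", "PUT"],
            "allowedHeaders": ["*"],
            "exposeHeaders": ["ETag"],
            "maxAgeSeconds": 3600,
        }
    ]
}

rule = find_cors_rule(cors_config, "https://app.example.com", "PUT",
                      ["Content-Type"])
print(rule["maxAgeSeconds"] if rule else "no matching rule")
```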

PresignedURL

Produced by the GetObjectUrl RPC. The URL carries a K3 custom token signed with the server’s presign_secret (impl: services/storage/objects.rs). It is not an AWS SigV4 or SigV2 URL.

// Request body (POST /{bucket}/objects/url)
{
  "bucket": "demo",
  "key": "docs/sample.pdf",
  "expiresInSeconds": "1800"
}

// Response
{
  "url": "https://k3.dev.dodil.io/demo/objects/docs%2Fsample.pdf?token=eyJ...",
  "expiresAt": "1700003600000"
}

Notes:

  • expires_in_seconds defaults to 3600 (1h) and is capped at 86400 (24h) by the handler.
  • AWS SigV4 / SigV2 are separately accepted by the S3 proxy as incoming auth schemes — those are client-signed URLs. GetObjectUrl is the server-issued, K3-token flow.
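
The default and cap described above can be mirrored on the client to predict the lifetime a request will actually get. This is a sketch under stated assumptions, not the handler's code: the function names are hypothetical, and how the server treats zero or negative values is not documented here, so the sketch does not model that case.

```python
from typing import Optional

DEFAULT_EXPIRES_SECONDS = 3600   # 1h default, per the notes above
MAX_EXPIRES_SECONDS = 86400      # 24h cap, per the notes above


def effective_expiry(requested: Optional[int] = None) -> int:
    """Lifetime the server is expected to apply to a requested value."""
    if requested is None:
        return DEFAULT_EXPIRES_SECONDS
    return min(requested, MAX_EXPIRES_SECONDS)


def build_url_request(bucket: str, key: str,
                      expires_in_seconds: Optional[int] = None) -> dict:
    """Build the JSON body for POST /{bucket}/objects/url."""
    body = {"bucket": bucket, "key": key}
    if expires_in_seconds is not None:
        # Sent as a string, matching the request example on this page.
        body["expiresInSeconds"] = str(expires_in_seconds)
    return body


print(effective_expiry())         # default applies
print(effective_expiry(200000))   # capped at 24h
print(build_url_request("demo", "docs/sample.pdf", 1800))
```

Omitting expiresInSeconds from the body leaves the choice to the server default rather than hard-coding 3600 in the client.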

See also